SpursgoZmy committed
Update README.md
README.md CHANGED
@@ -23,7 +23,7 @@ See the ACL 2024 paper for more details: [Multimodal Table Understanding](https:
 
 <!-- Provide a longer summary of what this model is. -->
 
-**Model Type:** Table LLaVA strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
+**Model Type:** Table LLaVA 13B strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
 with [CLIP-ViT-L-336px](https://huggingface.co/openai/clip-vit-large-patch14-336) as visual encoder (336*336 image resolution),
 [Vicuna-v1.5-13B](https://huggingface.co/lmsys/vicuna-13b-v1.5) as base LLM and a two-layer MLP as vision-language connector.
 
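The changed line describes the Table LLaVA 13B architecture: a CLIP-ViT-L-336px visual encoder, Vicuna-v1.5-13B as the base LLM, and a two-layer MLP vision-language connector. Below is a minimal, unofficial PyTorch sketch of such a two-layer MLP connector (the LLaVA-v1.5-style Linear/GELU/Linear projector), not the Table LLaVA implementation itself; the class name `VisionLanguageConnector` is illustrative, and the hidden sizes (1024 for CLIP-ViT-L/14-336 patch features, 5120 for Vicuna-13B) are taken from the public model configs.

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    """Sketch of a two-layer MLP projector mapping CLIP patch features
    into the LLM embedding space (dimensions assumed from public configs)."""

    def __init__(self, vision_hidden_size: int = 1024, llm_hidden_size: int = 5120):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_hidden_size, llm_hidden_size),
            nn.GELU(),
            nn.Linear(llm_hidden_size, llm_hidden_size),
        )

    def forward(self, vision_features: torch.Tensor) -> torch.Tensor:
        # vision_features: (batch, num_patches, vision_hidden_size);
        # a 336x336 image with 14x14 patches yields 24*24 = 576 patch tokens.
        return self.proj(vision_features)

# Example: project a dummy batch of CLIP-ViT-L patch features.
dummy = torch.randn(1, 576, 1024)
tokens = VisionLanguageConnector()(dummy)
print(tokens.shape)  # torch.Size([1, 576, 5120])
```

The projected tokens would then be concatenated with the text token embeddings before being fed to the base LLM, as in the LLaVA-v1.5 pipeline the README references.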