SpursgoZmy committed
Update README.md
README.md CHANGED
@@ -23,7 +23,7 @@ See the ACL 2024 paper for more details: [Multimodal Table Understanding](https:
 
 <!-- Provide a longer summary of what this model is. -->
 
-**Model Type:** Table LLaVA strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
+**Model Type:** Table LLaVA 13B strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
 with [CLIP-ViT-L-336px](https://huggingface.co/openai/clip-vit-large-patch14-336) as visual encoder (336*336 image resolution),
 [Vicuna-v1.5-13B](https://huggingface.co/lmsys/vicuna-13b-v1.5) as base LLM and a two-layer MLP as vision-language connector.
 
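The changed line describes the Table LLaVA 13B architecture: a CLIP-ViT-L-336px visual encoder, Vicuna-v1.5-13B as the base LLM, and a two-layer MLP vision-language connector. Below is a minimal, unofficial PyTorch sketch of such a two-layer MLP connector (the LLaVA-v1.5-style Linear/GELU/Linear projector), not the Table LLaVA implementation itself; the class name `VisionLanguageConnector` is illustrative, and the hidden sizes (1024 for CLIP-ViT-L/14-336 patch features, 5120 for Vicuna-13B) are taken from the public model configs.

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    """Sketch of a two-layer MLP projector mapping CLIP patch features
    into the LLM embedding space (dimensions assumed from public configs)."""

    def __init__(self, vision_hidden_size: int = 1024, llm_hidden_size: int = 5120):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_hidden_size, llm_hidden_size),
            nn.GELU(),
            nn.Linear(llm_hidden_size, llm_hidden_size),
        )

    def forward(self, vision_features: torch.Tensor) -> torch.Tensor:
        # vision_features: (batch, num_patches, vision_hidden_size);
        # a 336x336 image with 14x14 patches yields 24*24 = 576 patch tokens.
        return self.proj(vision_features)

# Example: project a dummy batch of CLIP-ViT-L patch features.
dummy = torch.randn(1, 576, 1024)
tokens = VisionLanguageConnector()(dummy)
print(tokens.shape)  # torch.Size([1, 576, 5120])
```

The projected tokens would then be concatenated with the text token embeddings before being fed to the base LLM, as in the LLaVA-v1.5 pipeline the README references.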