GunaKoppula/Llava-Phi2


GunaKoppula committed on Jan 27, 2024

Commit deeed66 · verified · 1 Parent(s): 6837ed1

Update README.md

Files changed (1)

README.md +3 -3

README.md CHANGED

@@ -17,7 +17,7 @@ This is a multimodal implementation of [Phi2](https://huggingface.co/microsoft/p
 2. Vision Tower: [clip-vit-large-patch14-336](https://huggingface.co/openai/clip-vit-large-patch14-336)
 4. Pretraining Dataset: [LAION-CC-SBU dataset with BLIP captions(200k samples)](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain)
 5. Finetuning Dataset: [Instruct 150k dataset based on COCO](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K)
-6. Finetuned Model: [RaviNaik/Llava-Phi2](https://huggingface.co/RaviNaik/Llava-Phi2)
+6. Finetuned Model: [RaviNaik/Llava-Phi2](https://huggingface.co/GunaKoppula/Llava-Phi2)
 ### Model Sources
@@ -46,8 +46,8 @@ pip install -e .
 ```
 3. Run the Model
 ```bash
-python llava_phi/eval/run_llava_phi.py --model-path="RaviNaik/Llava-Phi2" \
-    --image-file="https://huggingface.co/RaviNaik/Llava-Phi2/resolve/main/people.jpg?download=true" \
+python llava_phi/eval/run_llava_phi.py --model-path="GunaKoppula/Llava-Phi2" \
+    --image-file="https://huggingface.co/GunaKoppula/Llava-Phi2/resolve/main/people.jpg?download=true" \
     --query="How many people are there in the image?"
 ```
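
This commit changes the repository id in two places in the run command: the `--model-path` argument and the image URL. The following is a minimal sketch, assuming the URL pattern visible in the diff (`https://huggingface.co/<repo_id>/resolve/main/<filename>?download=true`), of deriving both values from a single repository id so that they stay in sync. The helper names `resolve_url` and `build_command` are hypothetical and are not part of the repository; only the script path and flags are taken from the README.

```python
import shlex
from typing import List

HUB = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL following the pattern seen in the diff."""
    return f"{HUB}/{repo_id}/resolve/{revision}/{filename}?download=true"


def build_command(repo_id: str, image_name: str, query: str) -> List[str]:
    """Assemble the README's run command from one repository id (hypothetical helper)."""
    return [
        "python",
        "llava_phi/eval/run_llava_phi.py",
        f"--model-path={repo_id}",
        f"--image-file={resolve_url(repo_id, image_name)}",
        f"--query={query}",
    ]


if __name__ == "__main__":
    cmd = build_command(
        "GunaKoppula/Llava-Phi2",
        "people.jpg",
        "How many people are there in the image?",
    )
    # Print a shell-quoted command line; nothing is executed here.
    print(shlex.join(cmd))
```

Because the repository id is passed once, a future rename would only need to touch a single argument rather than two separate strings.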