---
license: apache-2.0
datasets:
- tattrongvu/vqa_de_en_batch1
- vidore/colpali_train_set
language:
- en
- de
base_model:
- Qwen/Qwen2-VL-7B-Instruct
tags:
- vidore
- multimodal-embedding
---

This is a ColQwen2-7b model trained for 5 epochs on an 8xH100 cluster with a per-device batch size of 64. The training data extends the original ColPali train set with question-answer pairs generated by Gemini 1.5 Flash on 35k images scraped from the internet.
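
To make the training setup concrete, here is a minimal sketch of the global batch size implied by the figures above. It assumes plain data-parallel training across the 8 GPUs and no gradient accumulation; neither is stated in this card, and `global_batch_size` is a hypothetical helper written only for this illustration.

```python
def global_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of samples contributing to one optimizer step under data parallelism."""
    return per_device * num_devices * grad_accum_steps


PER_DEVICE_BATCH_SIZE = 64  # stated in this card
NUM_GPUS = 8                # stated in this card (8xH100)
GRAD_ACCUM_STEPS = 1        # assumption: the card does not mention gradient accumulation

print(global_batch_size(PER_DEVICE_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS))  # 512
```

If gradient accumulation was used during training, the global batch size would be larger by that factor.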