---
library_name: transformers
tags:
- vidore
model-index:
- name: colphi3.5
  results: []
datasets:
- vidore/colpali_train_set
base_model:
- microsoft/Phi-3.5-vision-instruct
pipeline_tag: feature-extraction
license: mit
---

# ColPhi3.5

This model was fine-tuned from [microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct) on the [vidore/colpali_train_set](https://huggingface.co/datasets/vidore/colpali_train_set) dataset.

## Model description

ColPhi3.5 follows the ColPali architecture and training strategy, which uses Vision Language Models (VLMs) to efficiently index documents from their visual features. It extends Phi-3.5-Vision (4B) to generate ColBERT-style multi-vector representations of text and images. The approach was introduced in the paper [*ColPali: Efficient Document Retrieval with Vision Language Models*](https://arxiv.org/abs/2407.01449).

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
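
## Usage

The sketch below illustrates the intended retrieval flow: embed document page images and text queries into multi-vector representations, then score them with late interaction (MaxSim). It assumes the model can be used through the [`colpali-engine`](https://github.com/illuin-tech/colpali) library following the standard ColPali pattern; the `ColPhi3_5` and `ColPhi3_5_Processor` class names and the `vidore/colphi3.5` checkpoint id are hypothetical placeholders, not confirmed identifiers.

```python
# Minimal retrieval sketch. Assumptions: a ColPhi-style model/processor pair
# exists in colpali-engine; the class names and checkpoint id below are
# hypothetical placeholders following the ColPali / ColQwen2 pattern.
import torch
from PIL import Image

from colpali_engine.models import ColPhi3_5, ColPhi3_5_Processor  # hypothetical classes

model_name = "vidore/colphi3.5"  # hypothetical checkpoint id

model = ColPhi3_5.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
).eval()
processor = ColPhi3_5_Processor.from_pretrained(model_name)

# Inputs: document page images and text queries.
images = [Image.new("RGB", (448, 448), color="white")]  # replace with real page images
queries = ["What revenue is reported in the 2023 annual report?"]

batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

# Each forward pass returns ColBERT-style multi-vector embeddings
# (one vector per image patch / query token).
with torch.no_grad():
    image_embeddings = model(**batch_images)
    query_embeddings = model(**batch_queries)

# Late-interaction scoring between queries and pages.
scores = processor.score_multi_vector(query_embeddings, image_embeddings)
print(scores)  # shape: (num_queries, num_images)
```

`score_multi_vector` implements the ColBERT-style late-interaction score: for each query token it takes the maximum similarity over all page-patch vectors and sums these maxima, so pages are ranked without pooling away local visual detail.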