pipeline_tag: zero-shot-image-classification
This repository contains the models from the paper [Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality](https://huggingface.co/papers/2410.05210).
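
For readers new to the `zero-shot-image-classification` task, here is a minimal sketch of the scoring step that CLIP-style models typically use: compare one image embedding against one text embedding per candidate label, then apply a softmax. It does not load the checkpoints in this repository. The embeddings are toy values, and the helper names (`cosine`, `zero_shot_probs`) and the `logit_scale` of 100 are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled image-text similarities (hypothetical helper).

    logit_scale is an assumed temperature, not a value from the paper.
    """
    logits = [logit_scale * cosine(image_emb, t) for t in text_embs]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy stand-ins for encoder outputs; a real model would produce these
# from an image and from one text prompt per candidate label.
labels = ["a photo of a cat", "a photo of a dog"]
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_probs(image_emb, text_embs)
best = labels[probs.index(max(probs))]
print(best, probs)
```

In practice the two embedding lists would come from the image and text encoders of whichever checkpoint is loaded; only the similarity-and-softmax step is shown here.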