Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024
Advancing Referring Expression Segmentation Beyond Single Image Paper • 2305.12452 • Published May 21, 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic Paper • 2306.15195 • Published Jun 27, 2023
Described Object Detection: Liberating Object Detection with Flexible Expressions Paper • 2307.12813 • Published Jul 24, 2023
Co-Salient Object Detection with Co-Representation Purification Paper • 2303.07670 • Published Mar 14, 2023