VisionArena: 230K Real World User-VLM Conversations with Preference Labels Paper • 2412.08687 • Published 28 days ago • 13
Article Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark! By davidchan • Jul 23, 2024 • 3
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions Paper • 2409.12962 • Published Sep 19, 2024 • 2
SESAME Collection Official checkpoints of the paper -- See, Say, Segment: Teaching LMMs to Overcome False Premises (SESAME), CVPR 2024. • 2 items • Updated Jun 28, 2024 • 1