Collections
Discover the best community collections!
Collections including paper arxiv:2405.17247
-
What matters when building vision-language models?
Paper • 2405.02246 • Published • 101 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 87 -
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Paper • 2405.19707 • Published • 7 -
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations
Paper • 2410.08049 • Published • 8
-
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Paper • 2311.17049 • Published • 1 -
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 14 -
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
Paper • 2303.17376 • Published -
Sigmoid Loss for Language Image Pre-Training
Paper • 2303.15343 • Published • 6
-
xLSTM: Extended Long Short-Term Memory
Paper • 2405.04517 • Published • 12 -
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Paper • 2405.05254 • Published • 10 -
Understanding the performance gap between online and offline alignment algorithms
Paper • 2405.08448 • Published • 14 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 127
-
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Paper • 2404.19752 • Published • 22 -
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Paper • 2404.16821 • Published • 55 -
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper • 2403.07508 • Published • 74 -
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper • 2403.09611 • Published • 125
-
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 54 -
FlowMind: Automatic Workflow Generation with LLMs
Paper • 2404.13050 • Published • 34 -
How Far Can We Go with Practical Function-Level Program Repair?
Paper • 2404.12833 • Published • 6 -
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Paper • 2404.18796 • Published • 68