Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 21 days ago • 136
Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published 22 days ago • 31
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations Paper • 2412.05994 • Published 26 days ago • 17
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published 22 days ago • 92
MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views Paper • 2412.06767 • Published 25 days ago • 6
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published 29 days ago • 105
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance Paper • 2412.02687 • Published Dec 3, 2024 • 108
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published about 1 month ago • 119
UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing Paper • 2411.16781 • Published Nov 25, 2024 • 10
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset Paper • 2411.15640 • Published Nov 23, 2024 • 4
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 47
Material Anything: Generating Materials for Any 3D Object via Diffusion Paper • 2411.15138 • Published Nov 22, 2024 • 42
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 64
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 89
Can Knowledge Editing Really Correct Hallucinations? Paper • 2410.16251 • Published Oct 21, 2024 • 54
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting Paper • 2410.17856 • Published Oct 23, 2024 • 49
LLM-based Optimization of Compound AI Systems: A Survey Paper • 2410.16392 • Published Oct 21, 2024 • 14
Aligning Large Language Models via Self-Steering Optimization Paper • 2410.17131 • Published Oct 22, 2024 • 21
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts Paper • 2410.10626 • Published Oct 14, 2024 • 38
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Paper • 2410.10819 • Published Oct 14, 2024 • 6