Are Vision-Language Models Truly Understanding Multi-vision Sensor? Paper • 2412.20750 • Published 5 days ago • 13 • 2
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding Paper • 2501.00712 • Published 3 days ago • 2 • 4
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing Paper • 2501.00658 • Published 3 days ago • 5 • 2
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published 4 days ago • 12 • 2
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 1 day ago • 4 • 2
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published 3 days ago • 27 • 2
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 1 day ago • 30 • 3
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published 1 day ago • 11 • 2
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published 1 day ago • 25 • 2
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 2 days ago • 45 • 7
MapQaTor: A System for Efficient Annotation of Map Query Datasets Paper • 2412.21015 • Published 4 days ago • 6 • 2
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published 1 day ago • 30 • 2
Nested Attention: Semantic-aware Attention Values for Concept Personalization Paper • 2501.01407 • Published 1 day ago • 1 • 2
Population Aware Diffusion for Time Series Generation Paper • 2501.00910 • Published 2 days ago • 2 • 2
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models Paper • 2501.00316 • Published 4 days ago • 16 • 2
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published 1 day ago • 17 • 2
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization Paper • 2501.01245 • Published 1 day ago • 3 • 2
Dynamic Scaling of Unit Tests for Code Reward Modeling Paper • 2501.01054 • Published 2 days ago • 12 • 2