SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization Paper • 2501.01245 • Published 1 day ago • 3
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 1 day ago • 5
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published 4 days ago • 13
MapQaTor: A System for Efficient Annotation of Map Query Datasets Paper • 2412.21015 • Published 5 days ago • 7
Dynamic Scaling of Unit Tests for Code Reward Modeling Paper • 2501.01054 • Published 2 days ago • 12
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models Paper • 2501.00316 • Published 4 days ago • 17
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published 1 day ago • 18
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published 1 day ago • 25
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published 3 days ago • 28
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published 1 day ago • 31
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 2 days ago • 46
HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving Paper • 2412.20735 • Published 5 days ago • 7
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 7 days ago • 63
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published 5 days ago • 28
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published 10 days ago • 59