2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 4 days ago • 75
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published 4 days ago • 22
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 21 days ago • 49
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Paper • 2412.21199 • Published 6 days ago • 9
Training Software Engineering Agents and Verifiers with SWE-Gym Paper • 2412.21139 • Published 6 days ago • 16
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published 7 days ago • 29
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper • 2412.17758 • Published 13 days ago • 16
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 13 days ago • 34
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks Paper • 2412.18072 • Published 13 days ago • 14
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published 12 days ago • 86
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World Paper • 2412.17589 • Published 14 days ago • 12
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 14 days ago • 41