Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 7 days ago • 25
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 19 days ago • 117
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 13 days ago • 34
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 14 days ago • 41
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 14 days ago • 44
Outcome-Refining Process Supervision for Code Generation Paper • 2412.15118 • Published 18 days ago • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 14 days ago • 21
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 19 days ago • 48
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published 21 days ago • 16
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published 20 days ago • 31
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 22 days ago • 27
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 28 days ago • 66
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 28 days ago • 72
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024 • 46
Critical Tokens Matter: Token-Level Contrastive Estimation Enhence LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 56
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS Paper • 2411.18478 • Published Nov 27, 2024 • 33
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21, 2024 • 30
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 18