Abstract
OpenAI o1 has recently sparked a surge of interest in the study of large reasoning models (LRMs). Building on this momentum, Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding, which are well suited to reinforcement learning (RL), but also places greater emphasis on open-ended resolutions. We aim to address the question: "Can the o1 model effectively generalize to broader domains where clear standards are absent and rewards are challenging to quantify?" Marco-o1 is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies, all optimized for complex real-world problem-solving tasks.
Community
Super cool! Congrats on the release 🔥
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models (2024)
- AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning (2024)
- Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths (2024)
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2024)
- Interpretable Contrastive Monte Carlo Tree Search Reasoning (2024)
- DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search (2024)
- Rational Metareasoning for Large Language Models (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Hey