-
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
Paper • 2408.07199 • Published • 21 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 76 -
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper • 2406.12050 • Published • 19 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 10
Collections
Discover the best community collections!
Collections including paper arxiv:2409.12917
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 33 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 26 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 123 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 26 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 37 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 52 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 147 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 115 -
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Paper • 2402.07456 • Published • 43 -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 29
-
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 74 -
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 33 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 40
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 139 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 136 -
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • 2409.02795 • Published • 72 -
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 89