Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 86
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 88
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 16
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21, 2024 • 69
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12, 2024 • 63
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 41
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14, 2024 • 75
V-STaR: Training Verifiers for Self-Taught Reasoners Paper • 2402.06457 • Published Feb 9, 2024 • 9
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning Paper • 2406.12050 • Published Jun 17, 2024 • 19
Top LLM Collection Collection of top open-source LLMs, sorted with the best on top • 6 items • Updated Jul 26, 2024 • 13
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 64
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31, 2024 • 59
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3, 2024 • 48