-
Unlocking Continual Learning Abilities in Language Models
Paper • 2406.17245 • Published • 29 -
A Closer Look into Mixture-of-Experts in Large Language Models
Paper • 2406.18219 • Published • 16 -
Symbolic Learning Enables Self-Evolving Agents
Paper • 2406.18532 • Published • 12 -
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Paper • 2406.18629 • Published • 42
Collections
Discover the best community collections!
Collections including paper arxiv:2407.08348
-
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
Paper • 2407.00653 • Published • 11 -
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Paper • 2406.18629 • Published • 42 -
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Paper • 2406.14562 • Published • 28 -
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Paper • 2406.04271 • Published • 29
-
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 67 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 129 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 53 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 87
-
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 44 -
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Paper • 2404.02893 • Published • 21 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 19 -
Premise Order Matters in Reasoning with Large Language Models
Paper • 2402.08939 • Published • 28
-
InternLM2 Technical Report
Paper • 2403.17297 • Published • 30 -
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 41 -
Learn Your Reference Model for Real Good Alignment
Paper • 2404.09656 • Published • 83 -
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data
Paper • 2404.12195 • Published • 12
-
Training Verifiers to Solve Math Word Problems
Paper • 2110.14168 • Published • 4 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 19 -
LiteSearch: Efficacious Tree Search for LLM
Paper • 2407.00320 • Published • 38 -
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper • 2309.03883 • Published • 34