-
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 41 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 44 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 48 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 76
Collections including paper arxiv:2402.03300
-
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Paper • 2308.12032 • Published • 1 -
Know thy corpus! Robust methods for digital curation of Web corpora
Paper • 2003.06389 • Published • 1 -
Self-Alignment with Instruction Backtranslation
Paper • 2308.06259 • Published • 41 -
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Paper • 2305.06156 • Published • 2
-
deepseek-ai/deepseek-math-7b-instruct
Text Generation • Updated • 7.45k • 98 -
deepseek-ai/deepseek-math-7b-rl
Text Generation • Updated • 610 • 58 -
deepseek-ai/deepseek-math-7b-base
Text Generation • Updated • 2.47k • 53 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 76
-
KwaiYiiMath: Technical Report
Paper • 2310.07488 • Published • 2 -
Forward-Backward Reasoning in Large Language Models for Mathematical Verification
Paper • 2308.07758 • Published • 4 -
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
Paper • 2309.10814 • Published • 3 -
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Paper • 2310.03731 • Published • 29
-
Moral Foundations of Large Language Models
Paper • 2310.15337 • Published • 1 -
Specific versus General Principles for Constitutional AI
Paper • 2310.13798 • Published • 2 -
Contrastive Preference Learning: Learning from Human Feedback without RL
Paper • 2310.13639 • Published • 24 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 47
-
Qwen2.5-Coder Technical Report
Paper • 2409.12186 • Published • 139 -
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
Paper • 2409.12122 • Published • 3 -
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 14 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 76