Article Yay! Organizations can now publish blog Articles By huggingface • about 14 hours ago • 13
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 26 items • Updated about 6 hours ago • 30
Jan 17 Releases ❄️ Collection Models and datasets of the second week of Jan 2025. • 23 items • Updated 4 days ago • 10
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 5 days ago • 60
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published 5 days ago • 39
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published 6 days ago • 26
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 6 days ago • 14
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 7 days ago • 263
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 8 days ago • 83
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 11 days ago • 19
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 11 days ago • 57
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published 11 days ago • 65
Enhancing Human-Like Responses in Large Language Models Paper • 2501.05032 • Published 12 days ago • 48
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 13 days ago • 79
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 13 days ago • 85
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 17 days ago • 83
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 13 days ago • 238