Article 🐺🐦‍⬛ LLM Comparison/Test: Phi-4, Qwen2 VL 72B Instruct, Aya Expanse 32B in my updated MMLU-Pro CS benchmark By wolfram • 12 days ago • 3
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published 6 days ago • 35
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation Paper • 2501.09755 • Published 6 days ago • 32
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published 6 days ago • 43
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 6 days ago • 63
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 23 days ago • 36
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 26 days ago • 81
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 20 days ago • 48
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 21 days ago • 97
Test-time Computing: from System-1 Thinking to System-2 Thinking Paper • 2501.02497 • Published 17 days ago • 41
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 18 days ago • 86
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 15 days ago • 66
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 14 days ago • 50
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 13 days ago • 77
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 14 days ago • 80
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 14 days ago • 87
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 14 days ago • 243
Enhancing Human-Like Responses in Large Language Models Paper • 2501.05032 • Published 13 days ago • 49
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 12 days ago • 58