Scaling Laws for Floating Point Quantization Training • Paper • 2501.02423 • Published 19 days ago • 25
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use • Paper • 2501.02506 • Published 19 days ago • 10
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning • Paper • 2501.03226 • Published 18 days ago • 37
Test-time Computing: from System-1 Thinking to System-2 Thinking • Paper • 2501.02497 • Published 19 days ago • 41