Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach Paper • 2312.11865 • Published Dec 19, 2023
#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models Paper • 2308.07074 • Published Aug 14, 2023
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment Paper • 2405.17931 • Published May 28, 2024
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback Paper • 2406.14024 • Published Jun 20, 2024
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement Paper • 2409.12122 • Published Sep 18, 2024
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 29 days ago
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models Paper • 2311.08692 • Published Nov 15, 2023