LLM Reasoning Papers Collection
Papers to improve reasoning capabilities of LLMs • 17 items • Updated 14 days ago • 93
A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models Paper • 2411.19477 • Published Nov 29, 2024 • 5
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization Paper • 2403.17031 • Published Mar 24, 2024 • 3
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments Paper • 2409.05865 • Published Sep 9, 2024 • 14
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning Paper • 2408.08441 • Published Aug 15, 2024 • 7
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt Paper • 2406.16377 • Published Jun 24, 2024 • 11
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Paper • 2406.16863 • Published Jun 24, 2024 • 10
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning Paper • 2402.03046 • Published Feb 5, 2024 • 6
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency Paper • 2403.00673 • Published Mar 1, 2024 • 1