Adam Yanxiao Zhao

sdpkjc

https://sdpkjc.com

AI & ML interests

Reinforcement Learning

Recent Activity

upvoted a collection 19 days ago

LLM Reasoning Papers

liked a dataset 25 days ago

HuggingFaceH4/MATH-500

liked a model 26 days ago

liuhaotian/llava-v1.6-vicuna-7b

sdpkjc's activity

upvoted a collection 19 days ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 17 items • Updated 14 days ago • 93

upvoted a paper about 1 month ago

A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models

Paper • 2411.19477 • Published Nov 29, 2024 • 5

upvoted a paper about 2 months ago

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization

Paper • 2403.17031 • Published Mar 24, 2024 • 3

upvoted a paper 2 months ago

Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments

Paper • 2409.05865 • Published Sep 9, 2024 • 14

upvoted a paper 4 months ago

Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1, 2024 • 20

upvoted a paper 5 months ago

D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning

Paper • 2408.08441 • Published Aug 15, 2024 • 7

upvoted 3 papers 6 months ago

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

Paper • 2406.16377 • Published Jun 24, 2024 • 11

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models

Paper • 2406.16863 • Published Jun 24, 2024 • 10

TextGrad: Automatic "Differentiation" via Text

Paper • 2406.07496 • Published Jun 11, 2024 • 27

upvoted 2 papers 10 months ago

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Paper • 2402.03046 • Published Feb 5, 2024 • 6

Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency

Paper • 2403.00673 • Published Mar 1, 2024 • 1