Ambroser53's Collections
Alignment
Understanding the performance gap between online and offline alignment algorithms
Paper • 2405.08448 • Published • 14
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper • 2405.19332 • Published • 15
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Paper • 2405.19107 • Published • 14
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper • 2406.00888 • Published • 30
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Paper • 2406.02900 • Published • 11
BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
Paper • 2406.12168 • Published • 7
Deep Bayesian Active Learning for Preference Modeling in Large Language Models
Paper • 2406.10023 • Published • 2
Bootstrapping Language Models with DPO Implicit Rewards
Paper • 2406.09760 • Published • 38
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Paper • 2407.13833 • Published • 12
Baichuan Alignment Technical Report
Paper • 2410.14940 • Published • 50