-
KTO: Model Alignment as Prospect Theoretic Optimization
Paper • 2402.01306 • Published • 16 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 50 -
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper • 2405.14734 • Published • 11 -
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Paper • 2408.06266 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2401.08417
-
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper • 2401.08417 • Published • 34 -
Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models
Paper • 2401.05861 • Published -
Adapting Large Language Models for Document-Level Machine Translation
Paper • 2401.06468 • Published -
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
Paper • 2401.05811 • Published • 6
-
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Paper • 2402.12366 • Published • 3 -
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 57 -
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 183 -
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper • 2401.08417 • Published • 34
-
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Paper • 2309.11674 • Published • 31 -
Extrapolating Large Language Models to Non-English by Aligning Languages
Paper • 2308.04948 • Published • 1 -
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper • 2401.08417 • Published • 34
-
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper • 2401.08417 • Published • 34 -
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
Paper • 2401.05252 • Published • 47 -
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Paper • 2402.04291 • Published • 48 -
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48
-
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39 -
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper • 2311.11315 • Published • 6 -
Alignment for Honesty
Paper • 2312.07000 • Published • 11 -
Steering Llama 2 via Contrastive Activation Addition
Paper • 2312.06681 • Published • 11