Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published Apr 2, 2024 • 44
mDPO: Conditional Preference Optimization for Multimodal Large Language Models Paper • 2406.11839 • Published Jun 17, 2024 • 38
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 87