- World Model on Million-Length Video And Language With RingAttention
  Paper • 2402.08268 • Published • 38
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 80
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 105
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48
Collections
Collections including paper arxiv:2406.18629
- Suppressing Pink Elephants with Direct Principle Feedback
  Paper • 2402.07896 • Published • 10
- Policy Improvement using Language Feedback Models
  Paper • 2402.07876 • Published • 6
- Direct Language Model Alignment from Online AI Feedback
  Paper • 2402.04792 • Published • 30
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 64

- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 72
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 29
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
  Paper • 2308.00436 • Published • 22

- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 11
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 48
- SuperHF: Supervised Iterative Learning from Human Feedback
  Paper • 2310.16763 • Published • 1
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
  Paper • 2311.15657 • Published • 2