-
Creative Robot Tool Use with Large Language Models
Paper • 2310.13065 • Published • 8 -
CodeCoT and Beyond: Learning to Program and Test like a Developer
Paper • 2308.08784 • Published • 5 -
Lemur: Harmonizing Natural Language and Code for Language Agents
Paper • 2310.06830 • Published • 31 -
CodePlan: Repository-level Coding using LLMs and Planning
Paper • 2309.12499 • Published • 74
Collections
Discover the best community collections!
Collections including paper arxiv:2404.02078
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 104 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 114
-
KwaiYiiMath: Technical Report
Paper • 2310.07488 • Published • 2 -
Forward-Backward Reasoning in Large Language Models for Mathematical Verification
Paper • 2308.07758 • Published • 4 -
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
Paper • 2309.10814 • Published • 3 -
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Paper • 2310.03731 • Published • 29
-
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
Paper • 2310.04484 • Published • 5 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Paper • 2310.13332 • Published • 14
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 7 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 18 -
Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection
Paper • 2310.05035 • Published • 1 -
Chain-of-Thought Reasoning is a Policy Improvement Operator
Paper • 2309.08589 • Published • 1
-
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 66 -
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 32 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 36 -
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 40
-
KTO: Model Alignment as Prospect Theoretic Optimization
Paper • 2402.01306 • Published • 16 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 50 -
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper • 2405.14734 • Published • 11 -
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Paper • 2408.06266 • Published • 10
-
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 24 -
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 57 -
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
Paper • 2403.09704 • Published • 31 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67
-
Rho-1: Not All Tokens Are What You Need
Paper • 2404.07965 • Published • 88 -
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Paper • 2404.10667 • Published • 18 -
Instruction-tuned Language Models are Better Knowledge Learners
Paper • 2402.12847 • Published • 25 -
DoRA: Weight-Decomposed Low-Rank Adaptation
Paper • 2402.09353 • Published • 26
-
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 82 -
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 61 -
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29 -
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 56