- LLoCO: Learning Long Contexts Offline
  Paper • 2404.07979 • Published • 20
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 114
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 20

Collections including paper arxiv:2402.11550

- Communicative Agents for Software Development
  Paper • 2307.07924 • Published • 4
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 37
- ReAct: Synergizing Reasoning and Acting in Language Models
  Paper • 2210.03629 • Published • 15

- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 11
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 17
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 25
- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 47

- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 114
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 23
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
  Paper • 2401.07872 • Published • 2

- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory
  Paper • 2402.04617 • Published • 4
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
  Paper • 2403.09347 • Published • 20
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 22
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 19

- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 41
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- A Neural Conversational Model
  Paper • 1506.05869 • Published • 2
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 23

- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 69
- FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
  Paper • 2401.14112 • Published • 18
- WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
  Paper • 2401.13919 • Published • 27
- Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
  Paper • 2401.14257 • Published • 10

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 52
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 18
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6

- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 36
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- A Multimodal Automated Interpretability Agent
  Paper • 2404.14394 • Published • 20