LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21, 2024 • 114
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration Paper • 2402.11550 • Published Feb 18, 2024 • 16
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31, 2024 • 20
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 104
Long-Context Language Modeling with Parallel Context Encoding Paper • 2402.16617 • Published Feb 26, 2024 • 1
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Paper • 2406.10149 • Published Jun 14, 2024 • 48
RULER: What's the Real Context Size of Your Long-Context Language Models? Paper • 2404.06654 • Published Apr 9, 2024 • 34
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12, 2024 • 64
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models Paper • 2406.00605 • Published Jun 2, 2024 • 2
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models Paper • 2402.02244 • Published Feb 3, 2024 • 1
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29, 2024 • 22
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models Paper • 2406.11230 • Published Jun 17, 2024 • 33
Long Code Arena: a Set of Benchmarks for Long-Context Code Models Paper • 2406.11612 • Published Jun 17, 2024 • 24
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization Paper • 2406.16008 • Published Jun 23, 2024 • 6
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published Jun 21, 2024 • 62
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers Paper • 2406.16747 • Published Jun 24, 2024 • 18
Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations Paper • 2406.13632 • Published Jun 19, 2024 • 5
LongIns: A Challenging Long-context Instruction-based Exam for LLMs Paper • 2406.17588 • Published Jun 25, 2024 • 22
Training-Free Long-Context Scaling of Large Language Models Paper • 2402.17463 • Published Feb 27, 2024 • 19
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Paper • 2406.17419 • Published Jun 25, 2024 • 17
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models Paper • 2405.17915 • Published May 28, 2024 • 2
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 86
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 60
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Paper • 2407.11963 • Published Jul 16, 2024 • 43
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference Paper • 2407.14057 • Published Jul 19, 2024 • 45
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Paper • 2407.14482 • Published Jul 19, 2024 • 26
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 138
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA Paper • 2409.02897 • Published Sep 4, 2024 • 44
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Paper • 2409.00509 • Published Aug 31, 2024 • 37
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24, 2024 • 41
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16, 2024 • 40
Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models Paper • 2409.04774 • Published Sep 7, 2024
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding? Paper • 2410.02115 • Published Oct 3, 2024 • 10
Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key Paper • 2410.10210 • Published Oct 14, 2024 • 5
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published Oct 28, 2024 • 17
Why Does the Effective Context Length of LLMs Fall Short? Paper • 2410.18745 • Published Oct 24, 2024 • 17
Language Models can Self-Lengthen to Generate Long Texts Paper • 2410.23933 • Published Oct 31, 2024 • 17
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12, 2024 • 62
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 47
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published Nov 7, 2024 • 21
How to Train Long-Context Language Models (Effectively) Paper • 2410.02660 • Published Oct 3, 2024 • 2
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20, 2024 • 15
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13, 2024 • 65