Collections including paper arxiv:2402.10790

- Adapting Language Models to Compress Contexts
  Paper • 2305.14788 • Published • 1
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 27
- Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
  Paper • 2401.07793 • Published • 3
- Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression
  Paper • 2402.16058 • Published
---

- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 11
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 17
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 25
- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 47
---

- Transformers Can Achieve Length Generalization But Not Robustly
  Paper • 2402.09371 • Published • 13
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
  Paper • 2402.10379 • Published • 30
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 41
---

- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 114
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 23
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
  Paper • 2401.07872 • Published • 2
---

- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory
  Paper • 2402.04617 • Published • 4
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
  Paper • 2403.09347 • Published • 20
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 22
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 19
---

- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 41
- LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
  Paper • 2402.10524 • Published • 22
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
  Paper • 2410.16256 • Published • 59
- Code Llama: Open Foundation Models for Code
  Paper • 2308.12950 • Published • 24
---

- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 41
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- A Neural Conversational Model
  Paper • 1506.05869 • Published • 2
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 23
---

- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
  Paper • 2403.15042 • Published • 25
- MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets
  Paper • 2403.03194 • Published • 12
- Orca-Math: Unlocking the potential of SLMs in Grade School Math
  Paper • 2402.14830 • Published • 24