Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction • arXiv:2409.17422 • Published Sep 25, 2024
EuroLLM: Multilingual Language Models for Europe • arXiv:2409.16235 • Published Sep 24, 2024
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline • arXiv:2408.15079 • Published Aug 27, 2024
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models • arXiv:2408.15518 • Published Aug 28, 2024
Better Alignment with Instruction Back-and-Forth Translation • arXiv:2408.04614 • Published Aug 8, 2024
Gemma 2: Improving Open Language Models at a Practical Size • arXiv:2408.00118 • Published Jul 31, 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention • arXiv:2407.02490 • Published Jul 2, 2024
Self-Play Preference Optimization for Language Model Alignment • arXiv:2405.00675 • Published May 1, 2024
Simple Linear Attention Language Models Balance the Recall-Throughput Tradeoff • arXiv:2402.18668 • Published Feb 28, 2024
Training-Free Long-Context Scaling of Large Language Models • arXiv:2402.17463 • Published Feb 27, 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases • arXiv:2402.14905 • Published Feb 22, 2024