Tokenization Falling Short: The Curse of Tokenization Paper • 2406.11687 • Published Jun 17, 2024 • 15
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published Jun 18, 2024 • 34
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published Jun 10, 2024 • 36
The Prompt Report: A Systematic Survey of Prompting Techniques Paper • 2406.06608 • Published Jun 6, 2024 • 56
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities Paper • 2405.18669 • Published May 29, 2024 • 11
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 126
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30, 2024 • 73
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18, 2024 • 54
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19, 2024 • 38
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Paper • 2404.14396 • Published Apr 22, 2024 • 18
JetMoE: Reaching Llama2 Performance with 0.1M Dollars Paper • 2404.07413 • Published Apr 11, 2024 • 36
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2, 2024 • 104
Gecko: Versatile Text Embeddings Distilled from Large Language Models Paper • 2403.20327 • Published Mar 29, 2024 • 47
The Unreasonable Ineffectiveness of the Deeper Layers Paper • 2403.17887 • Published Mar 26, 2024 • 78