- MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
  Paper • 2409.17481 • Published • 47
- Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling
  Paper • 2409.14683 • Published • 10
- Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
  Paper • 2409.17422 • Published • 25
- Emu3: Next-Token Prediction is All You Need
  Paper • 2409.18869 • Published • 94
Tian Chen
NekoNekoLover
Collections
1