Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time Paper • 2310.17157 • Published Oct 26, 2023
CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models Paper • 2404.08763 • Published Apr 12, 2024
Sirius: Contextual Sparsity with Correction for Efficient LLMs Paper • 2409.03856 • Published Sep 5, 2024