RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 48
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 77
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers Paper • 2406.00153 • Published May 31, 2024 • 10
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13, 2024 • 49