MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published Oct 22, 2024 • 14 • 2
Data Selection via Optimal Control for Language Models Paper • 2410.07064 • Published Oct 9, 2024 • 8 • 2