Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 17 days ago • 116
Open Whisper-style Speech Models (OWSM) Collection Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/ • 15 items • Updated Sep 27, 2024 • 4
CommonCrawl Collection Large web-mined general corpus based on CommonCrawl. • 7 items • Updated 26 days ago • 1
AfriCOMET Collection COMET evaluation models for African languages • 6 items • Updated Oct 1, 2024 • 1
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
Optimized ONNX models for NVIDIA RTX GPUs Collection Collection of optimized ONNX model checkpoints for NVIDIA RTX GPUs • 7 items • Updated Nov 18, 2024 • 10
Spaces for Model / Space / useful Utilities in Hugging Face Collection 239 items • Updated 2 days ago • 7
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models Paper • 2409.17892 • Published Sep 26, 2024 • 2
Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • Sep 27, 2024 • 38
Faith and Fate: Limits of Transformers on Compositionality Paper • 2305.18654 • Published May 29, 2023 • 6
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated 12 days ago • 47
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 12 days ago • 206
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 605
Parakeet Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 8 items • Updated Oct 1, 2024 • 20
OLMo Suite Collection Artifacts for the first set of OLMo models. • 18 items • Updated Nov 27, 2024 • 70
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset Paper • 2309.04662 • Published Sep 9, 2023 • 22