MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
HF SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. • 18 items • Updated Jul 16, 2024 • 2
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 186
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 37
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24, 2024 • 181
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 125
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 605
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 126
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 205
Rethinking Optimization and Architecture for Tiny Language Models Paper • 2402.02791 • Published Feb 5, 2024 • 12
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation Paper • 2312.17276 • Published Dec 27, 2023 • 15
MobileVLM : A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices Paper • 2312.16886 • Published Dec 28, 2023 • 19
Handbook v0.1 models and datasets Collection Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 92
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21, 2024 • 36
Text to Music 🎧 Collection A collection of music generation models supported in 🤗 Transformers and 🧨 Diffusers • 5 items • Updated Sep 16, 2023 • 3