
maanqing

madroid


AI & ML interests

None yet

Recent Activity

liked a model 13 days ago

Qwen/QVQ-72B-Preview

updated a dataset 21 days ago

madroid/glaive-function-calling-openai

liked a model 25 days ago

NousResearch/Hermes-3-Llama-3.2-3B



madroid's activity

upvoted a collection 2 months ago

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101

upvoted an article 4 months ago

Tool Use, Unified

Article • Aug 12, 2024 • 70

upvoted 3 collections 4 months ago

HF SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. • 18 items • Updated Jul 16, 2024 • 2

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 186

GLM-4

GLM-4 Open Models • 13 items • Updated Nov 27, 2024 • 116

upvoted a collection 5 months ago

Probably function calling datasets

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 37

upvoted an article 7 months ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Article • Jun 24, 2024 • 181

upvoted a collection 7 months ago

Florence

9 items • Updated Jul 11, 2024 • 162

upvoted 3 papers 10 months ago

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 125

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 605

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 126

upvoted a collection 11 months ago

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 205

upvoted a paper 11 months ago

Rethinking Optimization and Architecture for Tiny Language Models

Paper • 2402.02791 • Published Feb 5, 2024 • 12

upvoted 2 papers about 1 year ago

PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

Paper • 2312.17276 • Published Dec 27, 2023 • 15

MobileVLM : A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices

Paper • 2312.16886 • Published Dec 28, 2023 • 19

upvoted 3 collections about 1 year ago

Handbook v0.1 models and datasets

Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24

Whisper Release

Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 92

Distil-Whisper Models

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21, 2024 • 36

upvoted a paper about 1 year ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 123

upvoted a collection over 1 year ago

Text to Music 🎧

A collection of music generation models supported in 🤗 Transformers and 🧨 Diffusers • 5 items • Updated Sep 16, 2023 • 3