Models: 9,716

Active filters: 8-bit

brunopio/Llama3-8B-1.58-100B-tokens-GGUF

Text Generation • Updated Sep 19, 2024 • 989k • 12

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8

Text Generation • Updated Nov 18, 2024 • 779 • 3

MaziyarPanahi/Llama-3.2-3B-Instruct-GGUF

Text Generation • Updated Sep 25, 2024 • 2.27M • 8

neuralmagic/Llama-3.2-1B-Instruct-quantized.w8a8

Text Generation • Updated Oct 16, 2024 • 3.87k • 7

MikeRoz/ArliAI_Mistral-Small-22B-ArliAI-RPMax-v1.1-8.0bpw-h8-exl2

Updated Sep 26, 2024 • 30 • 1

FuturisticVibes/Tiger-Gemma-9B-v3-8.0bpw-h8-exl2

Updated Oct 5, 2024 • 32 • 2

altomek/Llama-3.2-3B-Instruct-8bpw-EXL2

Text Generation • Updated Oct 8, 2024 • 15 • 1

bigstorm/Llama-3.1-Nemotron-70B-Instruct-HF-8.0bpw-8hb-exl2

Text Generation • Updated Oct 16, 2024 • 26 • 3

mlx-community/Ministral-8B-Instruct-2410-8bit

Updated Oct 17, 2024 • 45 • 2

nejumi/Llama-3.1-Nemotron-70B-Instruct-HF-GPTQ-Int8-calib-ja-1k

Text Generation • Updated Oct 30, 2024 • 17 • 1

MaziyarPanahi/Meraj-Mini-GGUF

Text Generation • Updated Oct 17, 2024 • 13.1k • 2

mlx-community/airoboros-33b-gpt4-1.4

Updated Oct 19, 2024 • 10 • 1

MaziyarPanahi/llama-3.2-Korean-Bllossom-3B-GGUF

Text Generation • Updated Oct 20, 2024 • 380 • 1

MaziyarPanahi/Llama-3.2-3B-Instruct-uncensored-GGUF

Text Generation • Updated Oct 20, 2024 • 133 • 3

MaziyarPanahi/Llama-3.2-3B-Overthinker-GGUF

Text Generation • Updated Oct 20, 2024 • 92 • 1

MaziyarPanahi/Llama-3.2-3B-Instruct-abliterated-GGUF

Text Generation • Updated Oct 20, 2024 • 291 • 1

MaziyarPanahi/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF

Text Generation • Updated Oct 20, 2024 • 190 • 2

MaziyarPanahi/Llama3.1-8B-Chinese-Chat-GGUF

Text Generation • Updated Oct 20, 2024 • 91 • 1

mlx-community/dolphin-2.9.4-llama3.1-8b-8bit

Updated Oct 21, 2024 • 52 • 3

luigi86/magnum-v4-22b_mlx-8bit

Text Generation • Updated Oct 23, 2024 • 17 • 1

malenia1/ternary-weight-embedding

Updated 21 days ago • 895 • 2

luigi86/magnum-v4-9b_mlx-8bit

Text Generation • Updated Oct 23, 2024 • 17 • 1

luigi86/magnum-v4-12b_mlx-8bit

Text Generation • Updated Oct 23, 2024 • 14 • 1

luigi86/magnum-v4-27b_mlx-8bit

Text Generation • Updated Oct 23, 2024 • 19 • 1

MaziyarPanahi/granite-3.0-8b-instruct-GGUF

Text Generation • Updated Oct 24, 2024 • 152 • 1

tb423/internlm2_5-7b-chat-gptq-c4-8bit

Feature Extraction • Updated Oct 24, 2024 • 42 • 1

eugenehp/Llama3-8B-1.58-100B-tokens-GGUF

Text Generation • Updated Oct 24, 2024 • 175 • 2

mlx-community/aya-expanse-32b-8bit

Text Generation • Updated Oct 25, 2024 • 94 • 1

luigi86/UnslopNemo-12B-v4.1_mlx-8bit

Updated Oct 25, 2024 • 7 • 1

luigi86/UnslopNemo-12B-v4_mlx-8bit

Updated Oct 25, 2024 • 5 • 1