Models: 74

Active filters: Quantization

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-65536-woft

Updated Nov 18, 2024 • 3 • 4

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-1024-woft

Updated Nov 18, 2024 • 5

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-0-woft

Updated Nov 18, 2024 • 1

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-16384-woft

Updated Nov 18, 2024 • 2

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-256-woft

Updated Nov 18, 2024 • 1 • 1

mit-han-lab/svdquant-models

Text-to-Image • Updated Nov 30, 2024 • 874 • 64

Puhaha/gemma-2-9b-it-SimPO_q4_k_m

Updated Nov 16, 2024

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-256-woft

Updated Dec 15, 2024 • 6

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-16384-woft

Updated Dec 15, 2024 • 15

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-0-woft

Updated Dec 15, 2024 • 19

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-65536-woft

Updated Dec 15, 2024 • 5

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-65536-woft

Updated Dec 15, 2024 • 11 • 1

VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-1024-woft

Updated Dec 15, 2024 • 14 • 1

VPTQ-community/Meta-Llama-3.1-8B-Instruct-v12-k65536-4096-woft-vllm

Updated 8 days ago • 6