Models: 618

Active filters: compressed-tensors

neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a8

Text Generation • Updated Oct 10, 2024 • 7.81k • 18

cortecs/Llama-3.3-70B-Instruct-FP8-Dynamic

Updated 22 days ago • 13.5k • 5

neuralmagic/Mistral-7B-Instruct-v0.3-quantized.w8a8

Text Generation • Updated Oct 9, 2024 • 362 • 1

neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 522k • 37

neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8-dynamic

Text Generation • Updated Oct 19, 2024 • 672 • 5

neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 67.9k • 37

neuralmagic/Meta-Llama-3.1-405B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 3.53k • 31

neuralmagic/Meta-Llama-3.1-405B-Instruct-FP8-dynamic

Text Generation • Updated Oct 19, 2024 • 185 • 14

neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16

Text Generation • Updated Oct 23, 2024 • 7.65k • 9

neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • Updated Oct 23, 2024 • 6.1k • 13

neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a16

Text Generation • Updated Oct 9, 2024 • 257 • 4

neuralmagic/Meta-Llama-3.1-70B-FP8

Text Generation • Updated Oct 9, 2024 • 223 • 1

neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8

Text Generation • Updated Oct 23, 2024 • 557 • 2

neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16

Text Generation • Updated Oct 10, 2024 • 2.42k • 12

neuralmagic/gemma-2-9b-it-quantized.w4a16

Text Generation • Updated Oct 9, 2024 • 1.54k • 1

neuralmagic/gemma-2-2b-it-quantized.w4a16

Text Generation • Updated Oct 9, 2024 • 958 • 1

neuralmagic/Mistral-Nemo-Instruct-2407-quantized.w4a16

Text Generation • Updated Oct 9, 2024 • 2.47k • 3

neuralmagic/SmolLM-1.7B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9, 2024 • 22 • 1

nm-testing/DeepSeek-V2.5-W4A16

Updated Oct 9, 2024 • 303 • 2

nm-testing/Phi-3.5-MoE-instruct-FP8

Updated Oct 9, 2024 • 16 • 2

neuralmagic/Llama-3.2-11B-Vision-Instruct-FP8-dynamic

Text Generation • Updated Oct 2, 2024 • 49.8k • 17

neuralmagic/Llama-3.2-3B-Instruct-FP8-dynamic

Text Generation • Updated Oct 9, 2024 • 1.16k • 2

neuralmagic/Llama-3.2-1B-Instruct-quantized.w8a8

Text Generation • Updated Oct 16, 2024 • 3.84k • 7

neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic

Text Generation • Updated Oct 2, 2024 • 4.26k • 8

neuralmagic/Llama-3.2-1B-Instruct-FP8

Text Generation • Updated Oct 16, 2024 • 6.82k • 2

neuralmagic/Llama-3.2-3B-Instruct-FP8

Text Generation • Updated Oct 16, 2024 • 16.2k • 3

neuralmagic/Phi-3.5-mini-instruct-FP8-KV

Text Generation • Updated Oct 1, 2024 • 213 • 2

nm-testing/NVLM-D-72B-FP8-dynamic

Updated Oct 7, 2024 • 50 • 3

neuralmagic/pixtral-12b-FP8-dynamic

Text Generation • Updated Nov 1, 2024 • 2.78k • 7

SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA_FP8

Updated Oct 11, 2024 • 4 • 1