neuralmagic/Llama-3.1-Nemotron-70B-Instruct-HF-FP8-dynamic • Text Generation • Updated Oct 17, 2024 • 3.97k • 14
FP8 LLMs for vLLM • Collection • Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 44 items • Updated Oct 17, 2024 • 61