metadata

license: llama3.1
library_name: ggml

Meta-Llama-3.1-405B-Instruct-GGUF

Low-bit quantizations of Meta's Llama 3.1 405B Instruct model. Quantized from the Ollama q4_0 GGUF.

Quants:

Q2_K (imatrix)
Q3_K_M
Q3_K_S
Q3_K_L
Q4_K_M
Q4_0
Q4_K_S

imatrix

Experimental. The model was first force-quantized to iq1_m, and an imatrix was generated from that quant. The model was then quantized to iq1_m again using that imatrix, and the resulting quant was used to generate the final imatrix for all quants.

imatrix calibration data: groups_merged.dat
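
The two-pass imatrix procedure described above could be sketched roughly as follows. This is only an illustration, not the exact commands used for this repo: it assumes llama.cpp's `llama-quantize` and `llama-imatrix` tools, every file name other than groups_merged.dat is hypothetical, and it assumes each quantization starts from the q4_0 source (hence `--allow-requantize`). How the first iq1_m pass was forced is not stated here, and llama.cpp may refuse very low-bit types without an imatrix, so the first command is a placeholder for that step. The script only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List, Optional

# Hypothetical source file name; the card only says "ollama q4_0 GGUF".
SOURCE_GGUF = "llama-3.1-405b-instruct-q4_0.gguf"
# Calibration file as named in the card.
CALIBRATION = "groups_merged.dat"


def quantize_cmd(src: str, dst: str, qtype: str,
                 imatrix: Optional[str] = None) -> List[str]:
    """Build a llama-quantize command line (requantizing from a quantized source)."""
    cmd = ["llama-quantize", "--allow-requantize"]
    if imatrix is not None:
        cmd += ["--imatrix", imatrix]
    cmd += [src, dst, qtype]
    return cmd


def imatrix_cmd(model: str, calibration: str, out: str) -> List[str]:
    """Build a llama-imatrix command line that writes an importance matrix."""
    return ["llama-imatrix", "-m", model, "-f", calibration, "-o", out]


def build_pipeline(source: str, calibration: str,
                   final_types: List[str]) -> List[List[str]]:
    """Return the commands for the two-pass imatrix procedure, in order."""
    steps = [
        # Pass 1: iq1_m without an imatrix (the "forced" step; assumption).
        quantize_cmd(source, "pass1-iq1_m.gguf", "IQ1_M"),
        # Generate a first imatrix from the pass-1 quant.
        imatrix_cmd("pass1-iq1_m.gguf", calibration, "pass1.imatrix"),
        # Pass 2: iq1_m again, this time guided by the first imatrix.
        quantize_cmd(source, "pass2-iq1_m.gguf", "IQ1_M", "pass1.imatrix"),
        # Generate the final imatrix from the pass-2 quant.
        imatrix_cmd("pass2-iq1_m.gguf", calibration, "final.imatrix"),
    ]
    # Final quants that use the final imatrix.
    for qtype in final_types:
        steps.append(
            quantize_cmd(source, f"model-{qtype}.gguf", qtype, "final.imatrix")
        )
    return steps


if __name__ == "__main__":
    for step in build_pipeline(SOURCE_GGUF, CALIBRATION, ["Q2_K"]):
        print(shlex.join(step))
```

Printing the commands instead of executing them makes it easy to review each step before committing to a long run on a model of this size. Paths and the list of final quant types would need adjusting to match a real setup.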