---
license: llama3.1
library_name: ggml
---

# Meta-Llama-3.1-405B-Instruct-GGUF

Low-bit quantizations of Meta's Llama 3.1 405B Instruct model, requantized from the Ollama Q4_0 GGUF.

**Quants:**
- Q2_K
- (imatrix)
- Q3_K_M
- Q3_K_S
- Q3_K_L
- Q4_K_M
- Q4_0
- Q4_K_S

## imatrix

Experimental. The model is first force-quantized to IQ1_M, and an imatrix is generated from that quant. That imatrix is used to quantize to IQ1_M again, and the second IQ1_M quant is used to generate the final imatrix, which is applied to all quants.

imatrix calibration data: `groups_merged.dat`
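
The two-pass procedure above can be sketched as a list of command lines. This is only an illustration, assuming the llama.cpp tools `llama-quantize` and `llama-imatrix` were used; the exact commands, how the first IQ1_M pass was forced, and every file name other than `groups_merged.dat` are assumptions, not taken from this repository. The script only builds the commands and does not run them.

```python
from typing import List


def imatrix_pipeline(source_gguf: str,
                     calibration: str = "groups_merged.dat") -> List[List[str]]:
    """Build (but do not run) the commands for the two-pass imatrix bootstrap.

    All intermediate file names here are hypothetical placeholders.
    """
    return [
        # Pass 1: IQ1_M quant without an imatrix. How this was "forced" is not
        # stated in the README; upstream llama-quantize may refuse IQ1_M when
        # no imatrix is given, so treat this line as a placeholder.
        ["llama-quantize", "--allow-requantize",
         source_gguf, "pass1-iq1_m.gguf", "IQ1_M"],
        # Generate a first imatrix from the pass-1 quant.
        ["llama-imatrix", "-m", "pass1-iq1_m.gguf",
         "-f", calibration, "-o", "pass1.imatrix"],
        # Pass 2: quantize to IQ1_M again, this time guided by the first imatrix.
        ["llama-quantize", "--allow-requantize", "--imatrix", "pass1.imatrix",
         source_gguf, "pass2-iq1_m.gguf", "IQ1_M"],
        # Generate the final imatrix from the pass-2 quant.
        ["llama-imatrix", "-m", "pass2-iq1_m.gguf",
         "-f", calibration, "-o", "final.imatrix"],
    ]


def final_quant_command(source_gguf: str, quant_type: str) -> List[str]:
    """Build the command that applies the final imatrix to one target quant type."""
    return ["llama-quantize", "--allow-requantize", "--imatrix", "final.imatrix",
            source_gguf, f"out-{quant_type.lower()}.gguf", quant_type]


if __name__ == "__main__":
    # "source-q4_0.gguf" is a hypothetical name for the Ollama Q4_0 source file.
    for cmd in imatrix_pipeline("source-q4_0.gguf"):
        print(" ".join(cmd))
    print(" ".join(final_quant_command("source-q4_0.gguf", "Q3_K_M")))
```

`--allow-requantize` appears in each quantize step because the source is already a Q4_0 quant rather than full-precision weights.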