readme: add model card
README.md
CHANGED
@@ -1,3 +1,24 @@
|
|
1 |
-
---
|
2 |
-
license: llama3.1
|
3 |
-
1 |
+
---
|
2 |
+
license: llama3.1
|
3 |
+
library_name: ggml
|
4 |
+
---
|
5 |
+
|
6 |
+
# Meta-Llama-3.1-405B-Instruct-GGUF
|
7 |
+
|
8 |
+
Low-bit quantizations of Meta's Llama 3.1 405B Instruct model, requantized from the Ollama Q4_0 GGUF.
|
9 |
+
|
10 |
+
**Quants:**
|
11 |
+
- Q2_K
|
12 |
+
- (imatrix)
|
13 |
+
- Q3_K_M
|
14 |
+
- Q3_K_S
|
15 |
+
- Q3_K_L
|
16 |
+
- Q4_K_M
|
17 |
+
- Q4_0
|
18 |
+
- Q4_K_S
|
19 |
+
|
20 |
+
imatrix calibration data: groups_merged.dat
|
21 |
+
|
22 |
+
## imatrix
|
23 |
+
|
24 |
+
Experimental. The model is first force-quantized to IQ1_M, then an imatrix is generated and the model is quantized to IQ1_M again; that second quant is used to generate the final imatrix for all quants.
|
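The imatrix procedure described in the README can be read as a four-stage bootstrap followed by the final quantizations. The sketch below only builds the command lines for one possible reading of it and does not run anything. It assumes llama.cpp's `llama-quantize` and `llama-imatrix` tools, assumes each imatrix is computed from the preceding IQ1_M quant, and uses hypothetical intermediate file names; the README does not state how the first IQ1_M quant was forced, and a stock `llama-quantize` build may refuse IQ1_M without an imatrix.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    argv: tuple[str, ...]


def imatrix_bootstrap_plan(source_gguf, calibration, final_types):
    """Build (but do not execute) the commands for the two-pass imatrix bootstrap.

    All intermediate file names below are hypothetical placeholders.
    """
    stage1, stage2 = "stage1-iq1_m.gguf", "stage2-iq1_m.gguf"
    imat1, imat_final = "stage1.imatrix", "final.imatrix"
    steps = [
        # Pass 1: IQ1_M with no imatrix yet. How the author "forced" this is
        # not stated in the README; this command is an assumption.
        Step("force-quantize to IQ1_M",
             ("llama-quantize", "--allow-requantize",
              source_gguf, stage1, "IQ1_M")),
        Step("generate a first imatrix from the pass-1 quant",
             ("llama-imatrix", "-m", stage1, "-f", calibration, "-o", imat1)),
        # Pass 2: IQ1_M again, this time guided by the first imatrix.
        Step("quantize to IQ1_M again",
             ("llama-quantize", "--allow-requantize", "--imatrix", imat1,
              source_gguf, stage2, "IQ1_M")),
        Step("generate the final imatrix from the pass-2 quant",
             ("llama-imatrix", "-m", stage2, "-f", calibration,
              "-o", imat_final)),
    ]
    # Final quants: every requested type reuses the same final imatrix.
    for qtype in final_types:
        steps.append(
            Step(f"quantize to {qtype} with the final imatrix",
                 ("llama-quantize", "--allow-requantize",
                  "--imatrix", imat_final,
                  source_gguf, f"model-{qtype}.gguf", qtype)))
    return steps


if __name__ == "__main__":
    plan = imatrix_bootstrap_plan(
        "source-q4_0.gguf", "groups_merged.dat", ["Q2_K", "Q3_K_M"])
    for step in plan:
        print(f"# {step.description}")
        print(shlex.join(step.argv))
```

Keeping the plan as data rather than executing it makes the order of the stages easy to inspect before committing to multi-hour runs on a 405B model.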