|
--- |
|
license: llama3.1 |
|
base_model: |
|
- meta-llama/Llama-3.1-405B-Instruct |
|
--- |
|
Llama 3.1 405B quants, with the llama.cpp build used for each quantization:
|
- IQ1_S: 86.8 GB - b3459 |
|
- IQ1_M: 95.1 GB - b3459 |
|
- IQ2_XXS: 109.0 GB - b3459 |
|
- IQ3_XXS: 157.7 GB - b3484 |
|
|
|
Quantized from the BF16 GGUF here:
|
https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/ |
|
|
|
which was converted from Llama 3.1 405B Instruct:
|
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct |
|
|
|
Importance matrix (imatrix) file: https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/blob/main/405imatrix.dat
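The low-bit IQ1/IQ2 quant types require an imatrix. A sketch of reproducing one of these quants with the llama.cpp `llama-quantize` tool and the imatrix file above — the local file names here are placeholders, and you need a llama.cpp checkout at the build listed for that quant (e.g. b3459 for IQ1_S):

```shell
# Quantize the BF16 GGUF down to IQ1_S using the provided imatrix.
# "405b-bf16.gguf" and the output name are hypothetical placeholders.
./llama-quantize --imatrix 405imatrix.dat \
    405b-bf16.gguf 405b-iq1_s.gguf IQ1_S
```

Without `--imatrix`, llama.cpp will refuse (or warn, depending on build) to produce the 1-bit quant types, since they rely on the importance weights to decide which values to preserve.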
|
|
|
Let me know if you need bigger quants.
|
|
|
Sponsored by: https://pickabrain.ai |