etemiz/Llama-3.1-405B-Inst-GGUF


etemiz committed on Jul 26, 2024

Commit 8bd05ec · verified · 1 Parent(s): 91024dd

Update README.md

Files changed (1)

README.md +5 -3


@@ -2,13 +2,15 @@
 license: llama3.1
 ---
-Requants of BF16 of
 https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/
-Which is converted from
 https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct
 llama.cpp version b3459
-imatrix file https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/blob/main/405imatrix.dat

 license: llama3.1
 ---
+Quantization from BF16 here:
 https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/
+which is converted from Llama 3.1 405B:
 https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct
 llama.cpp version b3459
+imatrix file https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/blob/main/405imatrix.dat
+Lmk if you need bigger quants.
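
The README names the inputs of the requantization (a BF16 GGUF, an imatrix file, llama.cpp b3459) but not the command that was run. As a rough sketch only, the following Python helper assembles an argument list for llama.cpp's `llama-quantize` tool using its `--imatrix` option. The file names, the `Q4_K_M` quant type, and the helper name `build_quantize_command` are illustrative assumptions, not details taken from this commit.

```python
def build_quantize_command(bf16_gguf, imatrix, out_gguf,
                           quant_type="Q4_K_M", binary="llama-quantize"):
    """Build an argv list for llama.cpp's quantize tool.

    All arguments are placeholders chosen by the caller; the default
    quant type is an assumption, not the author's documented choice.
    """
    return [
        binary,
        "--imatrix", str(imatrix),  # importance matrix used to guide quantization
        str(bf16_gguf),             # source model in BF16 GGUF format
        str(out_gguf),              # destination file for the quantized model
        quant_type,                 # target quantization type
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_quantize_command(
        "model-bf16.gguf", "405imatrix.dat", "model-Q4_K_M.gguf"
    )
    print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the binary and the input files are actually present. A model split across several GGUF shards may need extra handling that this sketch does not cover.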