leafspark/Meta-Llama-3.1-405B-Instruct-GGUF

leafspark committed on Jul 24, 2024

Commit 60feb0f · verified · 1 Parent(s): 4945608

readme: update model card

Files changed (1)

README.md +2 -0

@@ -29,6 +29,8 @@ Quantized with llama.cpp [b3449](https://github.com/ggerganov/llama.cpp/releases
 - Q4_0
 - Q4_K_S
+For higher quality quantizations (q4+), please refer to [nisten/meta-405b-instruct-cpu-optimized-gguf](https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf).
 ## imatrix
 Generated from Q2_K quant.