Update README.md
README.md (changed)
@@ -12,7 +12,11 @@ license: other
 license_name: llama3
 license_link: LICENSE
 ---
-
+### Update May, 01 2024
+Re-uploaded the models with the latest fixes (stopping and [pre-tokenization](https://github.com/ggerganov/llama.cpp/pull/6920)) + additional [imatrix](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) versions.
+Just download a quant with imatrix in the name.<br>
+Do not use the old ones (the ones with _temp_token_fix). I'll leave them online for now for comparison.
+---
 GGUF [llama.cpp](https://github.com/ggerganov/llama.cpp) quantized version of:
 - Original model: [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
 - Model creator: [Meta](https://huggingface.co/meta-llama)