# RWKV-6-World-1.6B-GGUF-Q4_K_M

This repo contains RWKV-6-World-1.6B quantized to GGUF (Q4_K_M) with llama.cpp build b3651.

## How to run the model

* Get the latest llama.cpp:
```
git clone https://github.com/ggerganov/llama.cpp
```
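Before the model can run, llama.cpp has to be built. The exact steps depend on your platform and llama.cpp version; a minimal sketch, assuming a Unix-like system with a C/C++ toolchain and `make` available (the Makefile build leaves `llama-cli` in the repo root, which matches the run command below):
```
cd llama.cpp
# Legacy Makefile build; produces ./llama-cli in the repo root
make -j
```
With the CMake build (`cmake -B build` followed by `cmake --build build --config Release`) the binary ends up under `build/bin/` instead, so adjust the path in the run command accordingly.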

* Download the GGUF file into a new `model` folder inside llama.cpp (just Q4_K_M for now; I may add a Q8 quant later):
```
cd llama.cpp
mkdir model
git clone https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/
mv RWKV-6-World-1.6B-GGUF/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf model/
rm -r RWKV-6-World-1.6B-GGUF
```
* On Windows, instead of cloning the repo, just create the `model` folder inside the llama.cpp folder and download the GGUF file from the repo page into it (or use the CLI alternative sketched below).
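Note that the GGUF file is stored with Git LFS, so run `git lfs install` before cloning the model repo, otherwise you only get a small pointer file instead of the actual model. As a cross-platform alternative, the file can be fetched with `huggingface-cli`; a sketch, assuming the `huggingface_hub` package is installed and the command is run from inside the llama.cpp folder:
```
pip install -U huggingface_hub
# Download only the Q4_K_M file straight into the model folder
huggingface-cli download Lyte/RWKV-6-World-1.6B-GGUF RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --local-dir model
```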

* Now, to run the model, use the following command:
```
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --in-prefix "\nUser:" --in-suffix "\nAssistant:" --interactive-first -c 1024 --temp 0.7 --top-k 40 --top-p 0.95 -n 64 -p "Assistant: Hello, what can I help you with today?\n" -r "User"
```
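Here `-c 1024` sets the context size, `--temp`, `--top-k`, and `--top-p` control sampling, `-n 64` caps the length of each response, and `-r "User"` hands control back to you whenever the model emits the reverse prompt. For a quick non-interactive sanity check, a one-shot prompt also works (a sketch; the prompt text and sampling values are arbitrary):
```
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf \
  -p "User: Write one sentence about the ocean.\nAssistant:" \
  -n 128 --temp 0.7 --top-p 0.95
```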