Update README.md
README.md
CHANGED
@@ -69,23 +69,23 @@ Details of the files provided:
 * Command to create:
 * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`
 
-##
+## Manual instructions for `text-generation-webui`
 
-File `vicuna-13B-1.1-GPTQ-4bit-128g.no-act-order.pt` can be loaded the same as any other GPTQ file, without requiring any updates to [oobaboogas text-generation-webui](https://github.com/oobabooga/text-generation-webui).
+File `vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt` can be loaded the same as any other GPTQ file, without requiring any updates to [oobaboogas text-generation-webui](https://github.com/oobabooga/text-generation-webui).
 
-
+[Instructions on using GPTQ 4bit files in text-generation-webui are here](https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-\(4-bit-mode\)).
 
-
-```
-# We need to clone GPTQ-for-LLaMa as of April 13th, due to breaking changes in more recent commits
-git clone -n https://github.com/qwopqwop200/GPTQ-for-LLaMa gptq-safe
-cd gptq-safe && git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0
+The other `safetensors` model file was created using `--act-order` to give the maximum possible quantisation quality, but this means it requires that the latest GPTQ-for-LLaMa is used inside the UI.
 
-
+If you want to use the act-order `safetensors` files and need to update the Triton branch of GPTQ-for-LLaMa, here are the commands I used to clone the Triton branch of GPTQ-for-LLaMa, clone text-generation-webui, and install GPTQ into the UI:
+```
+# Clone text-generation-webui, if you don't already have it
 git clone https://github.com/oobabooga/text-generation-webui
-#
-mkdir
-
+# Make a repositories directory
+mkdir text-generation-webui/repositories
+cd text-generation-webui/repositories
+# Clone the latest GPTQ-for-LLaMa code inside text-generation-webui
+git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
 ```
 
 Then install this model into `text-generation-webui/models` and launch the UI as follows:
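For context on the pattern the removed instructions relied on: `git clone -n` clones a repository without checking out files, and a follow-up `git checkout <sha>` then pins the working tree to an exact known-good commit, shielding you from breaking changes on the default branch. A minimal sketch of that pinning pattern, using a throwaway local repository with hypothetical commit messages so it runs offline rather than cloning the real GPTQ-for-LLaMa repo:

```shell
# Sketch: pin a working tree to an exact commit, as the old README did
# with `git clone -n ... && git checkout 58c8ab4...`. A throwaway local
# repo stands in for GPTQ-for-LLaMa so no network access is needed.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/demo"
cd "$tmp/demo"

# A known-good commit, followed by a later "breaking" one
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m "known-good"
sha=$(git rev-parse HEAD)   # stand-in for the pinned SHA in the README
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m "breaking change"

# Checking out the saved SHA detaches HEAD at exactly that commit,
# regardless of what has landed on the branch since
git checkout -q "$sha"
[ "$(git rev-parse HEAD)" = "$sha" ] && echo "pinned"
```

The same pattern works with a real clone: `git clone -n <url> <dir> && cd <dir> && git checkout <sha>` skips the initial checkout entirely, so the working tree is only ever materialised at the pinned commit.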