cicdatopea
committed
Update README.md
README.md
CHANGED
@@ -6,11 +6,15 @@ base_model:
 ---
 ## Model Details
 
-This model is an int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm.
+This model is an int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm.
+
+**Please note that loading the model in Transformers can be quite slow. Consider using an alternative serving framework for better performance.**
+
+Due to limited GPU resources, we have only tested a few prompts on a CPU backend using QBits. If this model does not perform well for you, **you can explore a quantized model in AWQ format with different hyperparameters generated via AutoRound**, which will be uploaded soon.
 
 ## How To Use
 
 ### INT4 Inference
 
 ````python
 from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
@@ -107,7 +111,7 @@ prompt = "strawberry中有几个r?"
 8. r - 是r
 """
 
-prompt = "strawberry
+prompt = "How many r in strawberry."
 ##INT4
 """The word "strawberry" contains **3 "r"s.
 """
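For readers unfamiliar with the scheme named in the updated text, here is a minimal numpy sketch of what "symmetric quantization with group_size 128" means. This is an illustration under stated assumptions, not the auto-round implementation: weights are split into groups of 128 values, each group shares one scale (zero-point fixed at 0), and values are rounded to signed 4-bit integers in [-8, 7].

```python
import numpy as np

def quantize_symmetric_int4(w, group_size=128):
    """Per-group symmetric int4 quantization of a 1-D float array.

    Returns (q, scales): int8-stored 4-bit codes and one scale per group.
    """
    w = w.reshape(-1, group_size)
    # Symmetric: scale maps the largest-magnitude value in each group to +/-7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    # Reconstruct approximate float weights from codes and per-group scales.
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)  # two groups of 128
q, s = quantize_symmetric_int4(w)
w_hat = dequantize(q, s)
max_err = np.abs(w - w_hat).max()  # rounding error is at most scale / 2
```

Auto-round's contribution is in *how* the rounding and scales are tuned (a sign-gradient search), but the storage format it produces follows this group-wise symmetric layout.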