cicdatopea committed on
Commit
df9474a
·
verified ·
1 Parent(s): 951efd7

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -10,14 +10,16 @@ This model is an int4 model with group_size 128 and and symmetric quantization o
10
 
11
  **Please note that loading the model in Transformers can be quite slow. Consider using an alternative serving framework for better performance.**
12
 
13
- Due to limited GPU resources, we have only tested a few prompts on a CPU backend using QBits . If you found this model not perform well, **you can explore a quantized model in AWQ format with different hyperparameters generated via AutoRound** which will be uploaded soon
 
 
14
 
15
  ## How To Use
16
 
17
  ### INT4 Inference
18
 
19
  ````python
20
- from transformers import AutoModelForCausalLM, AutoTokenizer,GenerationConfig
21
  import torch
22
  quantized_model_dir = "OPEA/DeepSeek-V3-int4-sym-gptq-inc-preview"
23
 
@@ -53,7 +55,7 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
53
  print(response)
54
 
55
 
56
- ## The following result is infercenced on CPU with qbits backend
57
  prompt = "9.11和9.8哪个数字大"
58
 
59
  ##INT4
 
10
 
11
  **Please note that loading the model in Transformers can be quite slow. Consider using an alternative serving framework for better performance.**
12
 
13
+ Due to limited GPU resources, we have only tested a few prompts on a CPU backend using QBits. If you found this model not perform well, **you can explore a quantized model in AWQ format with different hyperparameters generated via AutoRound** which will be uploaded soon
14
+
15
+ Please follow the license of the original model.
16
 
17
  ## How To Use
18
 
19
  ### INT4 Inference
20
 
21
  ````python
22
+ from transformers import AutoModelForCausalLM, AutoTokenizer
23
  import torch
24
  quantized_model_dir = "OPEA/DeepSeek-V3-int4-sym-gptq-inc-preview"
25
 
 
55
  print(response)
56
 
57
 
58
+ ## The following result is inferred on CPU with qbits backend
59
  prompt = "9.11和9.8哪个数字大"
60
 
61
  ##INT4