OPEA/DeepSeek-V3-int4-sym-gptq-inc

4-bit precision

cicdatopea committed 5 days ago

Commit acad571 · verified · 1 parent: 1b23955

Update README.md

Files changed (1)

README.md +0 -43

@@ -311,49 +311,6 @@ prompt = "Please give a brief introduction of DeepSeek company."
 """DeepSeek Artificial Intelligence Co., Ltd. (referred to as "DeepSeek" or "深度求索") , founded in 2023, is a Chinese company dedicated to making AGI a reality"""
 ~~~
-### INT4 Inference on CUDA(have not tested, maybe need 8X80G GPU)
-Int4 kernel with BF16 computing dtype is required.
-````python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-import torch
-quantized_model_dir = "OPEA/DeepSeek-V3-int4-sym-gptq-inc-preview"
-model = AutoModelForCausalLM.from_pretrained(
-    quantized_model_dir,
-    torch_dtype=torch.float16,
-    trust_remote_code=True,
-    device_map="auto"
-)
-tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir,trust_remote_code=True)
-prompt = "There is a girl who likes adventure,"
-messages = [
-    {"role": "system", "content": "You are a helpful assistant."},
-    {"role": "user", "content": prompt}
-]
-text = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    add_generation_prompt=True
-)
-model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-generated_ids = model.generate(
-    model_inputs.input_ids,
-    max_new_tokens=200,  ##change this to align with the official usage
-    do_sample=False  ##change this to align with the official usage
-)
-generated_ids = [
-output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
-]
-response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-print(response)
-````
 ### Evaluate the model