jakiAJK/DeepSeek-R1-Distill-Llama-8B_AWQ

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision

	---
	library_name: transformers
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
	---

	### Requirements
	```bash
	pip install -U transformers autoawq
	```
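
Before loading the model, it can help to confirm that the packages are visible in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of `transformers` or `autoawq`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["transformers", "autoawq", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any entry reported as `not installed` should be resolved with `pip` before running the inference code.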

	#### Transformers inference


	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
	device = "auto"

	model_name = "jakiAJK/DeepSeek-R1-Distill-Llama-8B_AWQ"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name, device_map=device, trust_remote_code=True, torch_dtype=dtype)

	model.eval()

	chat = [
	    {"role": "user", "content": "List any 5 country capitals."},
	]
	chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

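	# Move the inputs to the model's device instead of hard-coding 'cuda'.
	# The chat template typically already inserts the BOS token, so do not add it a second time.
	input_tokens = tokenizer(chat, return_tensors="pt", add_special_tokens=False).to(model.device)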

	output = model.generate(**input_tokens, max_new_tokens=100)

	output = tokenizer.batch_decode(output)

	print(output)
	```
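
The decoded string printed by the inference code contains the prompt, the special tokens, and the completion together. R1-style distilled models usually emit their reasoning inside `<think>...</think>` before the final answer; that is an assumption about this checkpoint's output, not something the card states. Under that assumption, a small helper (the name `split_reasoning` is hypothetical) could separate the two parts:

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Return (reasoning, answer) from a decoded completion.

    If the closing tag is missing (for example, generation stopped at
    max_new_tokens), everything is treated as reasoning and answer is "".
    """
    # Drop anything before the opening tag; if the tag is absent, keep the full text.
    body = text.split(open_tag, 1)[-1]
    if close_tag not in body:
        return body.strip(), ""
    reasoning, answer = body.split(close_tag, 1)
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    sample = "<think>Paris is in France.</think>The capital of France is Paris."
    reasoning, answer = split_reasoning(sample)
    print(answer)
```

To feed it only the completion, one option is to slice off the prompt tokens before decoding, for example `tokenizer.batch_decode(output[:, input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)`. With `max_new_tokens=100` the reasoning may be cut off before the closing tag, in which case the helper returns an empty answer; a larger token budget is likely needed for complete responses.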