Files changed (1)
  1. README.md +114 -0
README.md ADDED
@@ -0,0 +1,114 @@
---
base_model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
library_name: transformers
tags:
- axolotl
- finetune
- facebook
- meta
- pytorch
- llama
- llama-3
language:
- en
pipeline_tag: text-generation
license: other
license_name: llama3
license_link: LICENSE
inference: false
model_creator: MaziyarPanahi
model_name: Llama-3-8B-Instruct-v0.2
quantized_by: MaziyarPanahi
---

<img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="display: block; margin-left: auto; margin-right: auto;"/>

# Llama-3-8B-Instruct-v0.2

This model is based on the `MaziyarPanahi/Llama-3-8B-Instruct-DPO` series.

# Quantized GGUF

All GGUF models are available here: [MaziyarPanahi/Llama-3-8B-Instruct-v0.2-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-v0.2-GGUF)

# Prompt Template

This model uses the `ChatML` prompt template:

```
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
```
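
As a rough illustration of the template above, the following sketch assembles a ChatML prompt by hand from a list of messages. The helper name `format_chatml` is hypothetical, and the exact newline placement is an assumption read off the layout shown above; the chat template shipped with the tokenizer may differ in whitespace.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts using the ChatML layout shown above.

    Hypothetical helper for illustration only; whitespace is an assumption.
    """
    parts = []
    for message in messages:
        # One turn: start marker plus role, the content, then the end marker.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
prompt = format_chatml(messages)
print(prompt)
```

In practice, `tokenizer.apply_chat_template` is the safer choice, since it applies whatever template is stored with the tokenizer rather than a hand-written approximation.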

# How to use

To use this model, pass `MaziyarPanahi/Llama-3-8B-Instruct-v0.2` as the model name in Hugging Face's `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_id = "MaziyarPanahi/Llama-3-8B-Instruct-v0.2"

# Load the weights in bfloat16 and spread them over the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Print tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)

# Named `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
    streamer=streamer
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the messages with the tokenizer's chat template and open the assistant turn.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop at the end-of-sequence token or at either end-of-turn marker.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|im_end|>"),
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipe(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# The generated text includes the prompt; print only the completion.
print(outputs[0]["generated_text"][len(prompt):])
```