cloudyu committed on
Commit
73ff794
·
1 Parent(s): dc1c917

Create README.md

  1. README.md +69 -0
README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ license: cc-by-nc-4.0
+ ---
+
+ # Mixtral MOE 2x10.7B
+
+ A Mixture of Experts (MoE) model built with mergekit from the following models:
+
+ * [kyujinpy/Sakura-SOLAR-Instruct](https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct)
+ * [jeonsworld/CarbonVillain-en-10.7B-v1](https://huggingface.co/jeonsworld/CarbonVillain-en-10.7B-v1)
+
+ GPU example
+
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ ## v2 models
+ model_path = "cloudyu/Mixtral_11Bx2_MoE_19B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
+ # load_in_4bit=True quantizes the weights and requires the bitsandbytes package
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path, torch_dtype=torch.float32, device_map='auto', local_files_only=False, load_in_4bit=True
+ )
+ print(model)
+ # Keep generating until an empty prompt is entered
+ prompt = input("please input prompt:")
+ while len(prompt) > 0:
+     # Move the prompt tokens onto the model's device
+     input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
+
+     generation_output = model.generate(
+         input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
+     )
+     print(tokenizer.decode(generation_output[0]))
+     prompt = input("please input prompt:")
+ ```
+
+ CPU example
+
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ ## v2 models
+ model_path = "cloudyu/Mixtral_11Bx2_MoE_19B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
+ # Full-precision weights, kept entirely in system RAM
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path, torch_dtype=torch.float32, device_map='cpu', local_files_only=False
+ )
+ print(model)
+ # Keep generating until an empty prompt is entered
+ prompt = input("please input prompt:")
+ while len(prompt) > 0:
+     input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+
+     generation_output = model.generate(
+         input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
+     )
+     print(tokenizer.decode(generation_output[0]))
+     prompt = input("please input prompt:")
+ ```
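
The two examples load the same checkpoint with different weight precisions: 4-bit on GPU and float32 on CPU. As a rough illustration of why that matters, the sketch below estimates weight memory as parameter count times bytes per parameter. The parameter count of roughly 19 billion is an assumption read off the "19B" in the model name, and `weight_memory_gb`, `ASSUMED_PARAMS`, and `BYTES_PER_PARAM` are illustrative names, not part of any library. The estimate ignores activations, the KV cache, and quantization overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# ASSUMPTION: ~19e9 parameters, inferred only from the "19B" in the model name.
ASSUMED_PARAMS = 19e9

# Storage cost per parameter for each precision (4-bit is half a byte).
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "4bit": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(ASSUMED_PARAMS, precision):.1f} GB")
```

Under these assumptions, the float32 weights in the CPU example would need about 76 GB of RAM, while 4-bit weights would need about 9.5 GB.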