mrm8488 committed on
Commit 634b04b · 1 Parent(s): 4e334a5

Create README.md

Files changed (1)
  1. README.md +96 -0
README.md ADDED
@@ -0,0 +1,96 @@
+ ---
+ license: wtfpl
+ datasets:
+ - HuggingFaceH4/no_robots
+ pipeline_tag: text-generation
+ ---
+
+ # MAMBA (2.8B) 🐍 fine-tuned on OpenHermes
+
+ Model Card is still WIP!
+
+
+ ## Base model info
+
+ Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
+ It is based on the line of progress on [structured state space models](https://github.com/state-spaces/s4),
+ with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
+
+ ## Dataset info
+
+ TBA
+
+
+ ## Usage
+
+ ```sh
+ pip install transformers
+ pip install "causal-conv1d<=1.0.2"
+ pip install mamba-ssm
+ ```
+
+ ```py
+ import torch
+ from transformers import AutoTokenizer
+ from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
+
+ CHAT_TEMPLATE_ID = "HuggingFaceH4/zephyr-7b-beta"
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model_name = "clibrain/mamba-2.8b-instruct-openhermes"
+
+ eos_token = "<|endoftext|>"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ tokenizer.eos_token = eos_token
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.chat_template = AutoTokenizer.from_pretrained(CHAT_TEMPLATE_ID).chat_template
+
+ model = MambaLMHeadModel.from_pretrained(
+     model_name, device=device, dtype=torch.float16)
+
+ history_dict: list[dict[str, str]] = []
+ prompt = "Tell me 5 sites to visit in Spain"
+ history_dict.append(dict(role="user", content=prompt))
+
+ input_ids = tokenizer.apply_chat_template(
+     history_dict, return_tensors="pt", add_generation_prompt=True
+ ).to(device)
+
+ out = model.generate(
+     input_ids=input_ids,
+     max_length=2000,
+     temperature=0.9,
+     top_p=0.7,
+     eos_token_id=tokenizer.eos_token_id,
+ )
+
+ decoded = tokenizer.batch_decode(out)
+ assistant_message = (
+     decoded[0].split("<|assistant|>\n")[-1].replace(eos_token, "")
+ )
+
+ print(assistant_message)
+ ```
+
+
+ ## Gradio Demo
+
+ ```sh
+ git clone https://github.com/mrm8488/mamba-chat.git
+ cd mamba-chat
+
+ pip install -r requirements.txt
+ pip install -q gradio==4.8.0
+
+ python app.py \
+     --model clibrain/mamba-2.8b-chat-no_robots \
+     --share
+ ```
+ ## Evaluations
+
+ Coming soon!
+
+
+ ## Acknowledgments
+
+ Thanks to [mamba-chat](https://github.com/havenhq/mamba-chat/tree/main) for heavily inspiring our work.
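
The last step of the Usage snippet, pulling the assistant's reply out of the decoded text, can be tried without loading the model. Below is a minimal sketch of that post-processing. The helper name `extract_reply` is hypothetical and not part of the README. The `<|assistant|>\n` marker and the `<|endoftext|>` EOS string are taken from the snippet itself, and the sample string is an assumption about how the decoded output looks.

```python
def extract_reply(
    decoded: str,
    eos_token: str = "<|endoftext|>",
    marker: str = "<|assistant|>\n",
) -> str:
    """Return the text after the last assistant marker, with EOS tokens removed.

    Hypothetical helper: mirrors the split/replace step of the Usage snippet.
    """
    # rpartition yields ("", "", decoded) when the marker is absent,
    # so the whole string is kept in that case.
    _, _, tail = decoded.rpartition(marker)
    return tail.replace(eos_token, "").strip()


if __name__ == "__main__":
    # Assumed shape of a decoded generation; the real output depends on the chat template.
    sample = "<|user|>\nHi<|endoftext|>\n<|assistant|>\nHello there!<|endoftext|>"
    print(extract_reply(sample))
```

Unlike the inline version, this sketch also strips surrounding whitespace, which is a small design choice rather than something the README does.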