Commit 79a467c (1 parent: 3e6dd14), committed by Jamie@TitanML
Upload folder using huggingface_hub
Changed files:
- .gitattributes +0 -1
- README.md +213 -0
- config.json +25 -0
- ct_output_models/config.json +6 -0
- ct_output_models/model.bin +3 -0
- ct_output_models/vocabulary.json +0 -0
- generation_config.json +6 -0
- special_tokens_map.json +5 -0
- tokenizer.json +0 -0
- tokenizer_config.json +9 -0
.gitattributes
CHANGED
@@ -25,7 +25,6 @@
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,213 @@
---
license: apache-2.0
language:
- en
datasets:
- togethercomputer/RedPajama-Data-1T
- OpenAssistant/oasst1
- databricks/databricks-dolly-15k
widget:
- text: "<human>: Write an email to my friends inviting them to come to my home on Friday for a dinner party, bring their own food to share.\n<bot>:"
  example_title: "Email Writing"
- text: "<human>: Create a list of things to do in San Francisco\n<bot>:"
  example_title: "Brainstorming"
inference:
  parameters:
    temperature: 0.7
    top_p: 0.7
    top_k: 50
    max_new_tokens: 128
---

# RedPajama-INCITE-7B-Chat

RedPajama-INCITE-7B-Chat was developed by Together and leaders from the open-source AI community including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, Stanford Center for Research on Foundation Models (CRFM), Stanford Hazy Research research group and LAION.

It is fine-tuned on OASST1 and Dolly2 to enhance chatting ability.

- Base Model: [RedPajama-INCITE-7B-Base](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base)
- Instruction-tuned Version: [RedPajama-INCITE-7B-Instruct](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct)
- Chat Version: [RedPajama-INCITE-7B-Chat](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Chat)


## Model Details
- **Developed by**: Together Computer.
- **Model type**: Language Model
- **Language(s)**: English
- **License**: Apache 2.0
- **Model Description**: A 6.9B parameter pretrained language model.

# Quick Start

Please note that the model requires `transformers` version >= 4.25.1.

To prompt the chat model, use the following format:
```
<human>: [Instruction]
<bot>:
```
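
The prompt format above can be assembled programmatically. The sketch below is illustrative only: `build_prompt` and `extract_reply` are hypothetical helper names, and both the multi-turn layout and the practice of cutting the reply at the next `<human>:` marker are assumptions, since the card only shows the single-turn form.

```python
HUMAN_TAG = "<human>:"
BOT_TAG = "<bot>:"


def build_prompt(turns, next_instruction):
    """Format a conversation in the <human>/<bot> layout.

    `turns` is a list of (human_text, bot_text) pairs from earlier
    exchanges. Multi-turn use is an assumption of this sketch.
    """
    lines = []
    for human_text, bot_text in turns:
        lines.append(f"{HUMAN_TAG} {human_text}")
        lines.append(f"{BOT_TAG} {bot_text}")
    lines.append(f"{HUMAN_TAG} {next_instruction}")
    # The prompt ends with an open bot tag so the model writes the reply.
    lines.append(BOT_TAG)
    return "\n".join(lines)


def extract_reply(generated_text):
    """Keep only the text before any further <human>: marker (assumed convention)."""
    return generated_text.split(HUMAN_TAG, 1)[0].strip()


prompt = build_prompt([], "Who is Alan Turing?")
print(repr(prompt))
```

With an empty history this reproduces the single-turn prompt string used in the inference examples of this README.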

## GPU Inference

This requires a GPU with 16GB memory.

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (parsed, because plain string comparison ranks '4.9.0' above '4.25.1')
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat", torch_dtype=torch.float16)
model = model.to('cuda:0')
# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Mathison Turing (23 June 1912 7 June 1954) was an English computer scientist, mathematician, logician, cryptanalyst, philosopher, mathematician, and theoretical biologist.
"""
```

## GPU Inference in Int8

This requires a GPU with 12GB memory.

To run inference with int8, please ensure you have installed `accelerate` and `bitsandbytes`. You can install them with the following commands:

```bash
pip install accelerate
pip install bitsandbytes
```

Then you can run inference with int8 as follows:

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (parsed, because plain string comparison ranks '4.9.0' above '4.25.1')
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat", device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)

# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Mathison Turing (23 June 1912 – 7 June 1954) was an English computer scientist, mathematician, logician, cryptanalyst, philosopher, and theoretical biologist.
"""
```
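
The GPU memory figures quoted for the fp16 and int8 paths (16GB and 12GB) can be sanity-checked with back-of-the-envelope arithmetic. The sketch below counts weight storage only, using the 6.9B parameter figure from Model Details; it is a rough estimate, and activations, the KV cache, and library overhead are deliberately not modeled. `weight_gigabytes` is a hypothetical helper name.

```python
PARAMS = 6.9e9  # parameter count quoted in the model card


def weight_gigabytes(num_params, bytes_per_param):
    """Approximate weight storage in GB (10**9 bytes), weights only."""
    return num_params * bytes_per_param / 1e9


fp16_gb = weight_gigabytes(PARAMS, 2)  # float16: 2 bytes per parameter
int8_gb = weight_gigabytes(PARAMS, 1)  # int8: 1 byte per parameter

print(f"fp16 weights: {fp16_gb:.1f} GB")
print(f"int8 weights: {int8_gb:.1f} GB")
```

Under these assumptions the fp16 weights alone take roughly 14 GB, which is why a 16GB card is the stated minimum, while the int8 weights take about half of that.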

## CPU Inference

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (parsed, because plain string comparison ranks '4.9.0' above '4.25.1')
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Chat", torch_dtype=torch.bfloat16)
# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Mathison Turing, OBE, FRS, (23 June 1912 – 7 June 1954) was an English computer scientist, mathematician, logician, cryptanalyst, philosopher, and theoretical biologist.
"""
```

Please note that since `LayerNormKernelImpl` is not implemented in fp16 for CPU, we use `bfloat16` for CPU inference.


# Uses

## Direct Use

Excluded uses are described below.

### Misuse, Malicious Use, and Out-of-Scope Use

It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.

#### Out-of-Scope Use

`RedPajama-INCITE-7B-Chat` is a language model and may not perform well for other use cases outside of its intended scope.
For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society.
It is important to consider the limitations of the model and to only use it for its intended purpose.

#### Misuse and Malicious Use

`RedPajama-INCITE-7B-Chat` is designed for language modeling.
Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating fake news, misinformation, or propaganda
- Promoting hate speech, discrimination, or violence against individuals or groups
- Impersonating individuals or organizations without their consent
- Engaging in cyberbullying or harassment
- Defamatory content
- Spamming or scamming
- Sharing confidential or sensitive information without proper authorization
- Violating the terms of use of the model or the data used to train it
- Creating automated bots for malicious purposes such as spreading malware, phishing scams, or spamming

## Limitations

`RedPajama-INCITE-7B-Chat`, like other language models, has limitations that should be taken into consideration.
For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data.
We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive chatbot.

## Training

**Training Data**

Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T).

**Training Procedure**

- **Hardware:** 8 A100 GPUs
- **Optimizer:** Adam
- **Gradient Accumulations**: 1
- **Num of Tokens:** 79M tokens
- **Learning rate:** 1e-5

## Community

Join us on [Together Discord](https://discord.gg/6ZVDU8tTD4)
config.json
ADDED
@@ -0,0 +1,25 @@
{
  "_name_or_path": "togethercomputer/RedPajama-INCITE-Chat-7B-v1",
  "architectures": [
    "GPTNeoXForCausalLM"
  ],
  "bos_token_id": 0,
  "eos_token_id": 0,
  "hidden_act": "gelu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 16384,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 2048,
  "model_type": "gpt_neox",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "rotary_emb_base": 10000,
  "rotary_pct": 1.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.28.1",
  "use_cache": true,
  "use_parallel_residual": false,
  "vocab_size": 50432
}
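
As a rough cross-check, the dimensions in this config can be turned into a parameter count. The sketch below is an estimate under stated assumptions: it assumes the usual GPT-NeoX layer layout (fused QKV projection, attention output projection, a two-layer MLP, two LayerNorms per block, all with biases, plus a final LayerNorm and a bias-free output projection), and `estimate_parameters` is a hypothetical helper, not part of any library.

```python
# Values copied from the config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 16384,
    "num_hidden_layers": 32,
    "vocab_size": 50432,
    "tie_word_embeddings": False,
}


def estimate_parameters(cfg):
    """Estimate trainable parameters, assuming a standard GPT-NeoX layout."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    # Attention: fused QKV projection plus output projection, with biases.
    attention = (3 * h * h + 3 * h) + (h * h + h)
    # MLP: h -> i and i -> h linear layers, with biases.
    mlp = (h * i + i) + (i * h + h)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * h)
    per_layer = attention + mlp + layer_norms
    # Input embedding, plus a separate output projection when untied.
    embeddings = cfg["vocab_size"] * h
    if not cfg["tie_word_embeddings"]:
        embeddings *= 2
    final_layer_norm = 2 * h
    return cfg["num_hidden_layers"] * per_layer + embeddings + final_layer_norm


total = estimate_parameters(config)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate lands close to the "6.9B parameter" figure stated in the README.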
ct_output_models/config.json
ADDED
@@ -0,0 +1,6 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "layer_norm_epsilon": null,
  "unk_token": "<|endoftext|>"
}
ct_output_models/model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:86299d05d8652884aa425ebb6a189f634b28e4f7c8ab60e6b9a5124b3fabd6a8
size 6867593490
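
The three lines above are a Git LFS pointer rather than the model weights themselves: each line is a `key value` pair giving the spec version, the SHA-256 object id, and the object size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:86299d05d8652884aa425ebb6a189f634b28e4f7c8ab60e6b9a5124b3fabd6a8
size 6867593490
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The size field is a byte count, so store it as an integer.
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
size_gib = pointer["size"] / 2**30
print(pointer["oid"])
print(f"{size_gib:.2f} GiB")
```

The recorded size is close to one byte per parameter for a model of roughly 6.9B parameters, which would be consistent with 8-bit weights, though the commit itself does not state the storage format.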
ct_output_models/vocabulary.json
ADDED
The diff for this file is too large to render.
generation_config.json
ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 0,
  "transformers_version": "4.29.1"
}
special_tokens_map.json
ADDED
@@ -0,0 +1,5 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,9 @@
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 2048,
  "tokenizer_class": "GPTNeoXTokenizer",
  "unk_token": "<|endoftext|>"
}