gronkomatic committed
Commit 3f30ef0 · verified · 1 Parent(s): bfb5f6e

Update README.md

Files changed (1)
  1. README.md +6 -55
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- license: openrail
+ license: apache-2.0
  datasets:
  - teknium/OpenHermes-2.5
  language:
@@ -9,8 +9,7 @@ pipeline_tag: text-generation
  ---
  # Model Card for neoncortex/mini-mistral-openhermes-2.5-chatml-test
 
- A tiny Mistral model trained on teknium/OpenHermes-2.5.
- This is epoch 5/9, so still some training to go.
+ A tiny Mistral model trained as an experiment on teknium/OpenHermes-2.5.
 
  ## Model Details
 
@@ -40,7 +39,7 @@ So, here's the bits:
  {%- if message['role'] == 'system' -%}
  {{- '<|im_start|>system\n' + message['content'].rstrip() + '<|im_end|>\n' -}}
  {%- else -%}
- {%- if message['role'] == 'user' -%}
+ {%- if message['role'] == 'human' -%}
  {{-'<|im_start|>human\n' + message['content'].rstrip() + '<|im_end|>\n'-}}
  {%- else -%}
  {{-'<|im_start|>assistant\n' + message['content'] + '<|im_end|>\n' -}}
@@ -71,36 +70,10 @@ Exclusively available right here on HuggingFace!
 
  If you wanna have a laugh at how bad it is then go ahead, but I wouldn't expect much from it.
 
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
  ### Out-of-Scope Use
 
  This model won't work well for pretty much everything, probably.
 
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
  ## How to Get Started with the Model
 
  Use the code below to get started with the model.
@@ -121,11 +94,11 @@ Use the code below to get started with the model.
 
  #### Preprocessing
 
- I took the OpenHermes 2.5 dataset formatted it with ChatML.
+ I took the OpenHermes 2.5 dataset and formatted it with ChatML.
 
  #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+ - **Training regime:** bf16 mixed precision
 
  #### Speeds, Sizes, Times
 
@@ -134,10 +107,6 @@ steps: 140976
  batches per device: 6
  1.04it/s
 
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
  ## Evaluation
 
  I tried to run evals but the eval suite just laughed at me.
@@ -148,21 +117,11 @@ Don't be rude.
 
  ## Environmental Impact
 
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
  - **Hardware Type:** I already told you. Try and keep up.
  - **Hours used:** ~45 x 2 I guess.
  - **Cloud Provider:** gronkomatic
  - **Compute Region:** myob
- - **Carbon Emitted:** Probably
-
- ## Technical Specifications
-
- ### Model Architecture and Objective
-
- [More Information Needed]
+ - **Carbon Emitted:** Yes, definitely
 
  ### Compute Infrastructure
 
@@ -176,14 +135,6 @@ I trained it on my PC with no side on it because I like to watch the GPUs do the
 
  The wonderful free stuff at HuggingFace (https://huggingface.co)[https://huggingface.co]: transformers, datasets, trl
 
- ## Glossary
-
- IDGAF - I don't give a fuck
-
- ## More Information
-
- [More Information Needed]
-
  ## Model Card Authors
 
  gronkomatic, unless you're offended by something, in which case it was hacked by hackers.
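
Note on the chat template hunk: the sketch below mirrors, in plain Python, only the three branches visible in that hunk after this commit (`system`, `human`, and the fall-through that emits an `assistant` turn). The function name `render_chatml` and the constants are hypothetical and introduced here for illustration; the rest of the template is not shown in the diff, so this is an approximation of the visible fragment, not the model's full template.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]]) -> str:
    """Render messages the way the visible template fragment does.

    Hypothetical helper: it covers only the branches shown in the hunk.
    """
    parts = []
    for message in messages:
        role = message["role"]
        content = message["content"]
        if role == "system":
            # System content has trailing whitespace stripped.
            parts.append(f"{IM_START}system\n{content.rstrip()}{IM_END}\n")
        elif role == "human":
            # After this commit the template checks for 'human', not 'user'.
            parts.append(f"{IM_START}human\n{content.rstrip()}{IM_END}\n")
        else:
            # Any other role falls through to the assistant branch,
            # and its content is emitted without stripping.
            parts.append(f"{IM_START}assistant\n{content}{IM_END}\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "Be brief. "},
        {"role": "human", "content": "Hi\n"},
        {"role": "assistant", "content": "Hello"},
    ]
    print(render_chatml(demo))
```

One consequence, if the fragment is representative of the full template: a message tagged with the role `user` no longer matches the second branch and would be rendered as an `assistant` turn, so callers would need to pass `human` as the role.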