Update README.md

Note: All scores are presented as decimal values between 0 and 1, representing the proportion of correct answers or the model's performance on each task.

## Comparison

Below are comparisons of this model with other models in the 7B regime.

| Model | Params | Tokens | Open dataset? | CORE | MMLU | EXTENDED |
|---------------|--------|--------|---------------|----------|----------|----------|
| **Open weights, closed datasets** | | | | | | |
| Llama2 | 7B | 2T | ❌ | 49.2 | 45.8 | 34.1 |
| DeepSeek | 7B | 2T | ❌ | 50.7 | 48.5 | 35.3 |
| Mistral-0.3 | 7B | ? | ❌ | 57.0 | 62.7 | 45.1 |
| QWEN-2 | 7B | ? | ❌ | 57.5 | **71.9** | 50.5 |
| Llama3 | 8B | 15T | ❌ | 57.6 | 66.2 | 46.3 |
| Gemma | 8B | 6T | ❌ | 57.8 | 64.3 | 44.6 |
| Phi-3 | 7B | ? | ❌ | **61.0** | 69.9 | **57.9** |
| **Open weights, open datasets** | | | | | | |
| Falcon | 7B | 1T | ✅ | 44.1 | 27.4 | 25.1 |
| OLMo-1.7 | 7B | 2.1T | ✅ | 47.0 | 54.0 | 34.2 |
| MAP-Neo | 7B | 4.5T | ✅ | **50.2** | **57.1** | **40.4** |
| **Models we trained** | | | | | | |
| FineWeb edu | 7B | 0.14T | ✅ | 38.7 | 26.3 | 22.1 |
| FineWeb edu | 7B | 0.28T | ✅ | 41.9 | 37.3 | 24.5 |
| **DCLM-7B** | 7B | 2.5T | ✅ | **57.1** | **63.7** | **45.4** |
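
Beyond the headline numbers, a quick way to sanity-check the model is to run it locally. The sketch below is a minimal generation example, not the benchmark harness used to produce the table above; it assumes the checkpoint is hosted on Hugging Face as `apple/DCLM-7B` and loads through the `open_lm` package's transformers integration (`pip install open_lm`).

```python
# Minimal generation sketch (assumptions: pip install open_lm transformers torch,
# and the checkpoint is available as the Hugging Face repo "apple/DCLM-7B").
from open_lm.hf import *  # registers the open_lm architecture with transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("apple/DCLM-7B")
model = AutoModelForCausalLM.from_pretrained("apple/DCLM-7B")

inputs = tokenizer(["Machine learning is"], return_tensors="pt")
# Sampling settings here are illustrative, not those used for the benchmarks.
gen_kwargs = {"max_new_tokens": 50, "do_sample": True, "top_p": 0.8, "temperature": 0.8}
output = model.generate(inputs["input_ids"], **gen_kwargs)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```
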
## Limitations and Biases