alexmarques committed: Update README.md

README.md CHANGED
@@ -1,6 +1,13 @@
 ---
 language:
 - en
+- de
+- fr
+- it
+- pt
+- hi
+- es
+- th
 pipeline_tag: text-generation
 license: llama3.1
 ---
@@ -14,15 +21,15 @@ license: llama3.1
 - **Model Optimizations:**
   - **Activation quantization:** INT8
   - **Weight quantization:** INT8
-- **Intended Use Cases:** Intended for commercial and research use
-- **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws).
+- **Intended Use Cases:** Intended for commercial and research use in multiple languages. Similarly to [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), this model is intended for assistant-like chat.
+- **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws).
 - **Release Date:** 7/11/2024
 - **Version:** 1.0
-- **License(s):** [Llama3]
+- **License(s):** [Llama3.1]
 - **Model Developers:** Neural Magic
 
 Quantized version of [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).
-It achieves
+It achieves scores within 1.3% of the scores of the unquantized model for MMLU, ARC-Challenge, GSM-8K, Hellaswag, Winogrande and TruthfulQA.
 
 ### Model Optimizations
 
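
The updated description points at multilingual, assistant-like chat served through [vLLM](https://docs.vllm.ai/en/stable/). As a minimal usage sketch (not the card's own snippet, and assuming a vLLM build that can load this INT8 w8a8 checkpoint), generation could look like:

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8"

# Format the conversation with the Llama-3.1 chat template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "Who are you? Please respond in French."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# max_model_len mirrors the 4096-token context used in the card's evaluation command.
llm = LLM(model=model_id, max_model_len=4096)
outputs = llm.generate(prompt, SamplingParams(temperature=0.6, max_tokens=128))
print(outputs[0].outputs[0].text)
```

The French prompt is only there to exercise one of the newly listed languages.
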
@@ -120,14 +127,9 @@ model.save_pretrained("Meta-Llama-3.1-8B-Instruct-quantized.w8a8")
 
 ## Evaluation
 
-The model was evaluated on
-```
-lm_eval \
-  --model vllm \
-  --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,gpu_memory_utilization=0.4,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
-  --tasks openllm \
-  --batch_size auto
-```
+The model was evaluated on MMLU, ARC-Challenge, GSM-8K, Hellaswag, Winogrande and TruthfulQA.
+Evaluation was conducted using the Neural Magic fork of [lm-evaluation-harness](https://github.com/neuralmagic/lm-evaluation-harness/tree/llama_3.1_instruct) (branch llama_3.1_instruct) and the [vLLM](https://docs.vllm.ai/en/stable/) engine.
+This version of the lm-evaluation-harness includes versions of ARC-Challenge and GSM-8K that match the prompting style of [Meta-Llama-3.1-Instruct-evals](https://huggingface.co/datasets/meta-llama/Meta-Llama-3.1-8B-Instruct-evals).
 
 ### Accuracy
 
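
The hunk above removes the lm_eval CLI invocation in favor of prose describing the setup. For reference, the same run can also be expressed through the harness's Python API; this is a sketch assuming the Neural Magic fork keeps upstream lm-evaluation-harness's `simple_evaluate` entry point and the `openllm` task group:

```python
# Sketch of the (removed) CLI command via the harness's Python API.
# Assumes the Neural Magic fork exposes upstream's simple_evaluate.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="vllm",
    model_args=(
        "pretrained=neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8,"
        "dtype=auto,gpu_memory_utilization=0.4,add_bos_token=True,"
        "max_model_len=4096,tensor_parallel_size=1"
    ),
    tasks=["openllm"],
    batch_size="auto",
)
print(results["results"])
```
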
@@ -156,21 +158,21 @@ lm_eval \
 <tr>
  <td>ARC Challenge (25-shot)
  </td>
- <td>
+ <td>83.19
  </td>
- <td>
+ <td>82.08
  </td>
- <td>
+ <td>98.7%
  </td>
 </tr>
 <tr>
  <td>GSM-8K (5-shot, strict-match)
  </td>
- <td>
+ <td>82.79
  </td>
- <td>
+ <td>81.96
  </td>
- <td>
+ <td>99.0%
  </td>
 </tr>
 <tr>
@@ -206,11 +208,11 @@ lm_eval \
 <tr>
  <td><strong>Average</strong>
  </td>
- <td><strong>
+ <td><strong>74.31</strong>
  </td>
- <td><strong>
+ <td><strong>73.79</strong>
  </td>
- <td><strong>99.
+ <td><strong>99.3%</strong>
  </td>
 </tr>
 </table>
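
The recovery column filled in by this commit tracks the quantized score as a percentage of the unquantized score; the column header is not visible in the diff, but the committed numbers are consistent with that definition. A quick arithmetic check (the `recovery` helper is illustrative, not part of the card):

```python
# "Recovery" = quantized score / unquantized score, reported as a percent.
def recovery(unquantized: float, quantized: float) -> str:
    return f"{100 * quantized / unquantized:.1f}%"

print(recovery(83.19, 82.08))  # ARC Challenge (25-shot) -> 98.7%
print(recovery(82.79, 81.96))  # GSM-8K (5-shot)         -> 99.0%
print(recovery(74.31, 73.79))  # Average                 -> 99.3%
```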