* <a href="https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized"> HuggingFaceH4/ultrafeedback_binarized </a> (10K - chosen answers only)

## Performance

| Models | Llama2-7B (fp16) | Llama2-7B (HQQ 2-bit) | Llama2-7B (HQQ+ 2-bit) | Quip# (2-bit) |
|-------------------|------------------|------------------|------------------|------------------|
| Wiki Perplexity | 5.18 | 6.06 | <b>5.14</b> | 8.54 |
| VRAM (GB) | 13.5 | <b>2.6</b> | 2.69 | 2.72 |
| Forward time (sec) | <b>0.1</b> | 0.221 | 0.27 | 0.353 |

| Models | Llama2-7B-chat (fp16) | Llama2-7B-chat (HQQ 2-bit) | Llama2-7B-chat (HQQ+ 2-bit) |
|-------------------|------------------|------------------|------------------|
| ARC (25-shot) | 53.67 | 45.56 | 47.01 |
| HellaSwag (10-shot) | 78.56 | 73.59 | 73.74 |
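For reference, the Wiki perplexity metric above is the exponential of the mean per-token negative log-likelihood on the evaluation text (lower is better; fp16 at 5.18 vs. HQQ+ 2-bit at 5.14 means the quantized model assigns slightly higher probability to the test tokens on average). A minimal sketch of that computation, assuming per-token log-probabilities have already been collected from the model (the `token_log_probs` input is illustrative, not part of this repo):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood over tokens)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Sanity check: a model that spreads probability uniformly over a
# 4-token vocabulary gives each token probability 1/4, so its
# perplexity is exactly 4.
uniform = [math.log(0.25)] * 10
print(perplexity(uniform))  # → 4.0
```

In practice these numbers come from sliding a context window over the evaluation corpus and averaging token NLLs, but the final reduction is exactly the `exp(mean NLL)` shown here.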