* <a href="https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized"> HuggingFaceH4/ultrafeedback_binarized </a> (10K - chosen answers only)

## Performance

| Models | Llama2-7B (fp16) | Llama2-7B (HQQ 2-bit) | Llama2-7B (HQQ+ 2-bit) | Quip# (2-bit) |
|-------------------|------------------|------------------|------------------|------------------|
| Wiki Perplexity | 5.18 | 6.06 | <b>5.14</b> | 8.54 |
| VRAM (GB) | 13.5 | <b>2.6</b> | 2.69 | 2.72 |
| Forward time (sec) | <b>0.1</b> | 0.221 | 0.27 | 0.353 |

| Models | Llama2-7B-chat (fp16) | Llama2-7B-chat (HQQ 2-bit) | Llama2-7B-chat (HQQ+ 2-bit) |
|-------------------|------------------|------------------|------------------|
| ARC (25-shot) | 53.67 | 45.56 | 47.01 |
| HellaSwag (10-shot) | 78.56 | 73.59 | 73.74 |
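For reference, the Wiki perplexity metric above is the exponential of the mean per-token negative log-likelihood on the evaluation text (lower is better; fp16 at 5.18 vs. HQQ+ 2-bit at 5.14 means the quantized model assigns slightly higher probability to the test tokens on average). A minimal sketch of that computation, assuming per-token log-probabilities have already been collected from the model (the `token_log_probs` input is illustrative, not part of this repo):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood over tokens)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Sanity check: a model that spreads probability uniformly over a
# 4-token vocabulary gives each token probability 1/4, so its
# perplexity is exactly 4.
uniform = [math.log(0.25)] * 10
print(perplexity(uniform))  # → 4.0
```

In practice these numbers come from sliding a context window over the evaluation corpus and averaging token NLLs, but the final reduction is exactly the `exp(mean NLL)` shown here.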