SanjiWatsuki
committed on
Update README.md
README.md
CHANGED
@@ -33,3 +33,28 @@ license: cc-by-nc-4.0
 | [openchat/openchat_3.5](https://huggingface.co/openchat/openchat_3.5) | 51.34 | 42.67 | 72.92 | 47.27 | 42.51 |
 | [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) | 51.16 | 42.06 | 72.72 | 47.33 | 42.53 |
 | [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) | 50.99 | 37.33 | 71.83 | 55.1 | 39.7 |
+
+| Model                       | AlpacaEval2 | Length |
+| --------------------------- | ----------- | ------ |
+| GPT-4                       | 23.58%      | 1365   |
+| GPT-4 0314                  | 22.07%      | 1371   |
+| Mistral Medium              | 21.86%      | 1500   |
+| Mixtral 8x7B v0.1           | 18.26%      | 1465   |
+| **Kunoichi-DPO-v2**         | **17.19%**  | 1785   |
+| Claude 2                    | 17.19%      | 1069   |
+| Claude                      | 16.99%      | 1082   |
+| Gemini Pro                  | 16.85%      | 1315   |
+| GPT-4 0613                  | 15.76%      | 1140   |
+| Claude 2.1                  | 15.73%      | 1096   |
+| Mistral 7B v0.2             | 14.72%      | 1676   |
+| GPT 3.5 Turbo 0613          | 14.13%      | 1328   |
+| LLaMA2 Chat 70B             | 13.87%      | 1790   |
+| LMCocktail-10.7B-v1         | 13.15%      | 1203   |
+| WizardLM 13B V1.1           | 11.23%      | 1525   |
+| Zephyr 7B Beta              | 10.99%      | 1444   |
+| OpenHermes-2.5-Mistral (7B) | 10.34%      | 1107   |
+| GPT 3.5 Turbo 0301          | 9.62%       | 827    |
+| GPT 3.5 Turbo 1106          | 9.18%       | 796    |
+| GPT-3.5                     | 8.56%       | 1018   |
+| Phi-2 DPO                   | 7.76%       | 1687   |
+| LLaMA2 Chat 13B             | 7.70%       | 1513   |