Upload folder using huggingface_hub
eval.txt
CHANGED
@@ -10,4 +10,16 @@
 |---|---|
 | Single turn | 5.93 |
 | Multi turn | 5.52 |
-| Overall | 5.73 |
+| Overall | 5.73 |
+
+
+| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
+|--------|------:|----------------|-----:|-----------------------|---|-----:|---|------|
+|gsm8k | 3|flexible-extract| 5|exact_match |↑ |0.7013|± |0.0126|
+| | |strict-match | 5|exact_match |↑ |0.2418|± |0.0118|
+|gsm8k-ko| 1|flexible-extract| 5|exact_match |↑ |0.4466|± |0.0137|
+| | |strict-match | 5|exact_match |↑ |0.4420|± |0.0137|
+|ifeval | 4|none | 0|inst_level_loose_acc |↑ |0.8549|± | N/A|
+| | |none | 0|inst_level_strict_acc |↑ |0.8225|± | N/A|
+| | |none | 0|prompt_level_loose_acc |↑ |0.7874|± |0.0176|
+| | |none | 0|prompt_level_strict_acc|↑ |0.7468|± |0.0187|