Upload ./Qwen2-7B-Q8_0.mmlu.pro.txt with huggingface_hub

Files changed (1)

Qwen2-7B-Q8_0.mmlu.pro.txt +2 -80

@@ -1,80 +1,2 @@
-multiple_choice_score: there are 70 tasks in prompt
-multiple_choice_score: reading tasks......................................................................done
-multiple_choice_score: preparing task data......................................................................done
-multiple_choice_score : calculating TruthfulQA score over 70 tasks.
-task	acc_norm
-1	0.00000000
-2	50.00000000
-3	33.33333333
-4	25.00000000
-5	40.00000000
-6	50.00000000
-7	57.14285714
-8	50.00000000
-9	44.44444444
-10	50.00000000
-11	54.54545455
-12	50.00000000
-13	46.15384615
-14	50.00000000
-15	53.33333333
-16	50.00000000
-17	47.05882353
-18	44.44444444
-19	42.10526316
-20	40.00000000
-21	38.09523810
-22	36.36363636
-23	34.78260870
-24	33.33333333
-25	32.00000000
-26	30.76923077
-27	29.62962963
-28	28.57142857
-29	27.58620690
-30	26.66666667
-31	25.80645161
-32	25.00000000
-33	27.27272727
-34	26.47058824
-35	25.71428571
-36	25.00000000
-37	24.32432432
-38	23.68421053
-39	23.07692308
-40	25.00000000
-41	24.39024390
-42	23.80952381
-43	23.25581395
-44	22.72727273
-45	22.22222222
-46	23.91304348
-47	23.40425532
-48	25.00000000
-49	24.48979592
-50	24.00000000
-51	23.52941176
-52	23.07692308
-53	24.52830189
-54	24.07407407
-55	25.45454545
-56	25.00000000
-57	24.56140351
-58	24.13793103
-59	23.72881356
-60	25.00000000
-61	24.59016393
-62	24.19354839
-63	23.80952381
-64	25.00000000
-65	24.61538462
-66	24.24242424
-67	23.88059701
-68	23.52941176
-69	23.18840580
-70	22.85714286
- Final result: 22.8571 +/- 5.0552
-Random chance: 10.0000 +/- 3.6116


+multiple_choice_score: there are 12032 tasks in prompt
+multiple_choice_score: reading tasksmultiple_choice_score: failed to read task 1 of 12032
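
Note on the removed log's final lines: the "+/-" figures are consistent with a binomial standard error using an n - 1 denominator, sqrt(p * (1 - p) / (n - 1)), with n = 70 tasks. This is an inference from the numbers in the log, not something the log states, and the function name below is hypothetical. A minimal Python sketch of the arithmetic, assuming 16 of 70 tasks were answered correctly (16/70 = 22.8571%) and a 1-in-10 chance baseline:

```python
import math


def percent_with_stderr(p: float, n_tasks: int) -> tuple[float, float]:
    """Return (score, standard error), both in percent.

    Assumption: the error bar is sqrt(p * (1 - p) / (n - 1)). This formula
    was chosen because it reproduces the figures in the removed log; the
    evaluation tool's actual implementation is not shown in the source.
    """
    stderr = math.sqrt(p * (1.0 - p) / (n_tasks - 1))
    return 100.0 * p, 100.0 * stderr


# Final result: 16 correct out of 70 tasks (assumed from 22.8571%).
final_score, final_err = percent_with_stderr(16 / 70, 70)
print(f"Final result: {final_score:.4f} +/- {final_err:.4f}")

# Random chance: probability 0.1 per task (assumed ten answer options).
chance_score, chance_err = percent_with_stderr(0.1, 70)
print(f"Random chance: {chance_score:.4f} +/- {chance_err:.4f}")
```

With only 70 tasks the error bar is about five percentage points, so the earlier score is a rough estimate at best.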