readme: extend results section
README.md
CHANGED
```diff
@@ -66,11 +66,12 @@ Evaluation is performed with SpanMarkers internal evaluation code that uses `seq
 the official GermEval 2014 Evaluation Script for double-checking the results. A backup of the `nereval.py` script
 can be found [here](https://github.com/bplank/DaNplus/blob/master/scripts/nereval.perl).
 
-We fine-tune 5 models and upload the model with best F1-Score on development set
+We fine-tune 5 models and upload the model with best F1-Score on development set. Results on development set are
+in brackets:
 
-| Model                  | Run 1
-| ---------------------- |
-| GELECTRA Large (5e-05) | 89.99 | 89.55 | 89.60 | 89.34 | 89.68 | 89.63
+| Model                  | Run 1           | Run 2           | Run 3           | Run 4           | Run 5           | Avg.
+| ---------------------- | --------------- | --------------- | --------------- | --------------- | --------------- | ---------------
+| GELECTRA Large (5e-05) | (89.99) / 89.08 | (89.55) / 89.23 | (89.60) / 89.10 | (89.34) / 89.02 | (89.68) / 88.80 | (89.63) / 89.05
 
 The best model achieves a final test score of 89.08%:
```
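As a sanity check, the `Avg.` column of the results table can be reproduced with a few lines of plain arithmetic over the five per-run scores (a minimal sketch; the variable names are illustrative and not part of the repository's code):

```python
from statistics import mean

# Per-run F1 scores for GELECTRA Large (5e-05), taken from the results table.
# Development-set scores are the values in brackets; test-set scores follow the slash.
dev_f1 = [89.99, 89.55, 89.60, 89.34, 89.68]
test_f1 = [89.08, 89.23, 89.10, 89.02, 88.80]

avg_dev = round(mean(dev_f1), 2)
avg_test = round(mean(test_f1), 2)

# Matches the table's Avg. column: (89.63) / 89.05
print(f"Avg.: ({avg_dev}) / {avg_test}")
```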