metadata
license: cc-by-4.0
library_name: span-marker
tags:
  - span-marker
  - token-classification
  - ner
  - named-entity-recognition
pipeline_tag: token-classification
widget:
  - text: >-
      Jürgen Schmidhuber studierte ab 1983 Informatik und Mathematik an der TU
      München .
    example_title: Wikipedia
datasets:
  - gwlms/germeval2014
language:
  - de
model-index:
  - name: >-
      SpanMarker with GWLMS BERT on GermEval 2014 NER Dataset by Stefan Schweter
      (@stefan-it)
    results:
      - task:
          type: token-classification
          name: Named Entity Recognition
        dataset:
          type: gwlms/germeval2014
          name: GermEval 2014
          split: test
          revision: f3647c56803ce67c08ee8d15f4611054c377b226
        metrics:
          - type: f1
            value: 0.8745
            name: F1
metrics:
  - f1

SpanMarker for GermEval 2014 NER

This is a SpanMarker model that was fine-tuned on the GermEval 2014 NER Dataset.

The GermEval 2014 NER Shared Task builds on a new dataset with German Named Entity annotation with the following properties: The data was sampled from German Wikipedia and News Corpora as a collection of citations. The dataset covers over 31,000 sentences corresponding to over 590,000 tokens. The NER annotation uses the NoSta-D guidelines, which extend the Tübingen Treebank guidelines, using four main NER categories with sub-structure, and annotating embeddings among NEs such as [ORG FC Kickers [LOC Darmstadt]].

12 classes of Named Entities are annotated and must be recognized: the four main classes PERson, LOCation, ORGanisation, and OTHer, plus their subclasses, which are formed with two fine-grained labels: -deriv marks derivations from NEs such as "englisch" (“English”), and -part marks compounds that include an NE as a subsequence, such as "deutschlandweit" (“Germany-wide”).
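The count of 12 follows from combining each of the four main classes with three variants (plain, -deriv, -part). A minimal sketch of that enumeration is shown here; the exact label spelling (suffix attached directly, as in `LOCderiv`) is an assumption and should be checked against the label list shipped with the dataset.

```python
# Four main classes named in the text above.
MAIN_CLASSES = ["PER", "LOC", "ORG", "OTH"]

# "" is the plain class, "deriv" marks derivations, "part" marks compounds.
# Assumption: suffixes are attached directly to the class name (e.g. "LOCderiv").
SUFFIXES = ["", "deriv", "part"]


def build_label_set():
    """Combine every main class with every variant suffix."""
    return [main + suffix for main in MAIN_CLASSES for suffix in SUFFIXES]


labels = build_label_set()
print(len(labels))
print(labels)
```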

Fine-Tuning

We use the same hyper-parameters as in the "German's Next Language Model" paper, with the GWLMS BERT model as backbone.

Evaluation is performed with SpanMarker's internal evaluation code, which uses seqeval.
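To illustrate the kind of entity-level score reported here, the following is a simplified, pure-Python sketch of a micro-averaged F1 over BIO-tagged sentences. It is not SpanMarker's or seqeval's code: the function names are hypothetical, and edge cases (for example an `I-` tag without a preceding `B-`) are handled more simply than in seqeval.

```python
def extract_spans(tags):
    """Return a set of (label, start, end) spans from one BIO-tagged sentence."""
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes a span that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if start is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold_sentences, pred_sentences):
    """Micro-averaged F1: a predicted span counts only if label and boundaries match."""
    true_pos = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        true_pos += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the PER span matches, the ORG span has a wrong right boundary.
gold = [["B-PER", "I-PER", "O", "B-ORG", "I-ORG"]]
pred = [["B-PER", "I-PER", "O", "B-ORG", "O"]]
print(entity_f1(gold, pred))
```

Because a span must match exactly, the truncated ORG prediction in the toy example counts as both a false positive and a false negative.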

We fine-tune 5 models and upload the model with the best F1-score on the development set. Results on the development set are in brackets:

| Model | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 (This) | Avg. |
|---|---|---|---|---|---|---|
| GWLMS BERT (5e-05, 3e) | (87.27) / 87.28 | (87.20) / 87.42 | (88.05) / 87.68 | (88.25) / 87.59 | (88.47) / 87.45 | (87.85) / 87.48 |

The best model, selected by development F1-score, achieves a final test F1-score of 87.45%.
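The averages and the model selection can be checked with a few lines of Python. The sketch below only re-uses the per-run development and test scores reported above; the variable names are illustrative.

```python
from statistics import mean

# (development F1, test F1) per run, copied from the results reported above.
runs = {
    1: (87.27, 87.28),
    2: (87.20, 87.42),
    3: (88.05, 87.68),
    4: (88.25, 87.59),
    5: (88.47, 87.45),
}

# The uploaded model is the run with the highest development score.
best_run = max(runs, key=lambda run: runs[run][0])
best_test = runs[best_run][1]

avg_dev = round(mean(dev for dev, _ in runs.values()), 2)
avg_test = round(mean(test for _, test in runs.values()), 2)

print(best_run, best_test)
print(avg_dev, avg_test)
```

Selecting on the development set means the uploaded run is not necessarily the one with the highest test score; here Run 3 scores higher on test (87.68) than the selected Run 5 (87.45).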

Scripts for training and evaluation are also available.

Usage

The fine-tuned model can be used as follows:

from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("stefan-it/span-marker-bert-germeval14")

# Run inference
entities = model.predict("Jürgen Schmidhuber studierte ab 1983 Informatik und Mathematik an der TU München .")
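As a possible follow-up, the predictions can be grouped by entity class. This sketch assumes that `predict` returns a list of dictionaries with `span`, `label`, `score`, `char_start_index`, and `char_end_index` keys, as described in the SpanMarker documentation; verify this against the installed version. The `sample` list is a hand-written stand-in with placeholder scores, not actual model output, so that the snippet runs without downloading the model.

```python
from collections import defaultdict

text = "Jürgen Schmidhuber studierte ab 1983 Informatik und Mathematik an der TU München ."


def make_entity(span, label):
    """Build a stand-in prediction dict (hypothetical helper, placeholder score)."""
    start = text.index(span)
    return {
        "span": span,
        "label": label,
        "score": 1.0,  # placeholder, not a real model confidence
        "char_start_index": start,
        "char_end_index": start + len(span),
    }


def group_by_label(entities):
    """Map each entity label to the list of surface strings predicted for it."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["span"])
    return dict(grouped)


# Hand-assigned labels for illustration only.
sample = [make_entity("Jürgen Schmidhuber", "PER"), make_entity("TU München", "ORG")]
print(group_by_label(sample))
```

With the real model, `sample` would be replaced by the `entities` list returned by `model.predict(...)`.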