---
language:
- bar
library_name: flair
pipeline_tag: token-classification
base_model: deepset/gbert-large
widget:
- text: >-
    Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo Minga und liagt im gleichnoming Landkroas .
tags:
- flair
- token-classification
- sequence-tagger-model
- arxiv:2403.12749
- "O'zapft is!"
- 🥨
license: apache-2.0
---

# Flair NER Model for Recognizing Named Entities in Bavarian Dialectal Data (Wikipedia)

[![🥨](https://huggingface.co/stefan-it/flair-barner-wiki-coarse-gbert-large/resolve/main/logo.webp "🥨")](https://huggingface.co/stefan-it/flair-barner-wiki-coarse-gbert-large)

This (unofficial) Flair NER model was trained on the annotated Bavarian Wikipedia articles of the BarNER dataset. The dataset was proposed in the LREC-COLING 2024 paper
["Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data"](https://aclanthology.org/2024.lrec-main.1262/)
(also available on [arXiv](https://arxiv.org/abs/2403.12749)) by Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova and Barbara Plank.

The [released dataset](https://github.com/mainlp/BarNER) is used in the *coarse* setting that is shown in Table 3 of the paper.
The following Named Entities are available:

* `PER`
* `LOC`
* `ORG`
* `MISC`

## Fine-Tuning

We perform a hyper-parameter search over the following parameters:

* Batch Sizes: `[32, 16]`
* Learning Rates: `[7e-06, 8e-06, 9e-06, 1e-05]`
* Epochs: `[20]`
* Subword Pooling: `[first]`

As base model we use [GBERT Large](https://huggingface.co/deepset/gbert-large).

Each configuration is trained with three different seeds, and we report the averaged F1-Score on the development set:

| Configuration      | Run 1   | Run 2   | Run 3   | Avg.         |
|:-------------------|:--------|:--------|:--------|:-------------|
| `bs32-e20-lr1e-05` | 76.96   | 77      | **77.71** | 77.22 ± 0.34 |
| `bs32-e20-lr8e-06` | 76.75   | 76.21   | 77.38   | 76.78 ± 0.48 |
| `bs16-e20-lr1e-05` | 76.81   | 76.29   | 76.02   | 76.37 ± 0.33 |
| `bs32-e20-lr7e-06` | 75.44   | 76.71   | 75.9    | 76.02 ± 0.52 |
| `bs32-e20-lr9e-06` | 75.69   | 75.99   | 76.2    | 75.96 ± 0.21 |
| `bs16-e20-lr8e-06` | 74.82   | 76.83   | 76.14   | 75.93 ± 0.83 |
| `bs16-e20-lr7e-06` | 76.77   | 74.82   | 76.04   | 75.88 ± 0.8  |
| `bs16-e20-lr9e-06` | 76.55   | 74.25   | 76.54   | 75.78 ± 1.08 |

The hyper-parameter configuration `bs32-e20-lr1e-05` yields the best results on the development set, so we use this configuration to report the averaged F1-Score on the test set:

| Configuration      | Run 1   | Run 2   | Run 3   | Avg.         |
|:-------------------|:--------|:--------|:--------|:-------------|
| `bs32-e20-lr1e-05` | 72.1    | 74.33   | **72.97** | 73.13 ± 0.92 |

Our averaged result on the test set is higher than the 72.17 reported in the original paper (see Table 5, in-domain training results).

For the upload we used the model that performs best on the development set, which is marked in bold. It achieves 72.97 on the final test set.

# Flair Demo

The following snippet shows how to use the fine-tuned NER model with Flair:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("stefan-it/flair-barner-wiki-coarse-gbert-large")

# make example sentence
sentence = Sentence("Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo Minga und liagt im gleichnoming Landkroas .")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```
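
The `Avg.` column of the tables in the Fine-Tuning section appears to be the mean of the three runs together with their population standard deviation. The following is a minimal sketch of that calculation, not the original evaluation script: the helper name `summarize_runs` is hypothetical, and the choice of the population (rather than sample) standard deviation is an assumption inferred from the reported numbers.

```python
from statistics import fmean, pstdev


def summarize_runs(scores):
    """Return (mean, population standard deviation), rounded to two decimals.

    Hypothetical helper for illustration; the population standard deviation
    is an assumption inferred from the reported values.
    """
    return round(fmean(scores), 2), round(pstdev(scores), 2)


# F1-Scores copied from the tables for the configuration bs32-e20-lr1e-05
dev_runs = [76.96, 77.0, 77.71]   # development set
test_runs = [72.1, 74.33, 72.97]  # test set

dev_mean, dev_std = summarize_runs(dev_runs)
test_mean, test_std = summarize_runs(test_runs)

print(f"dev:  {dev_mean:.2f} ± {dev_std:.2f}")
print(f"test: {test_mean:.2f} ± {test_std:.2f}")
```

Under these assumptions the sketch gives 77.22 ± 0.34 for the development set and 73.13 ± 0.92 for the test set, which matches the table entries for `bs32-e20-lr1e-05`.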