---
license: apache-2.0
library_name: span-marker
tags:
- span-marker
- token-classification
- ner
- named-entity-recognition
pipeline_tag: token-classification
widget:
- text: >-
    Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic
    to Paris.
  example_title: Amelia Earhart
model-index:
- name: >-
    SpanMarker w. xlm-roberta-large on CoNLL++ with document-level context by Tom Aarsen
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: conllpp
      name: CoNLL++ w. document context
      split: test
      revision: 3e6012875a688903477cca9bf1ba644e65480bd6
    metrics:
    - type: f1
      value: 0.9554
      name: F1
    - type: precision
      value: 0.9600
      name: Precision
    - type: recall
      value: 0.9509
      name: Recall
datasets:
- conllpp
- tomaarsen/conllpp
language:
- en
metrics:
- f1
- recall
- precision
---

# SpanMarker for Named Entity Recognition

This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model for Named Entity Recognition. It uses [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) as the underlying encoder. See [train.py](train.py) for the training script.

Note that this model was trained with document-level context, so it performs best when given enough surrounding context. It is recommended to call `model.predict` with a 🤗 Dataset with `tokens`, `document_id` and `sentence_id` columns. See the [documentation](https://tomaarsen.github.io/SpanMarkerNER/api/span_marker.modeling.html#span_marker.modeling.SpanMarkerModel.predict) of the `model.predict` method for more information.
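
The recommendation above can be sketched roughly as follows. This is a minimal illustration, not the author's reference code: it assumes the `span_marker` and `datasets` packages are installed, and the helper names `build_context_columns` and `predict_with_context` are hypothetical. The first helper flattens a list of documents (each a list of tokenized sentences) into the `tokens`, `document_id` and `sentence_id` columns; the second wraps them in a 🤗 Dataset and passes it to `model.predict`.

```python
from typing import Dict, List


def build_context_columns(documents: List[List[List[str]]]) -> Dict[str, list]:
    """Flatten documents (lists of tokenized sentences) into column lists.

    Hypothetical helper: document and sentence ids are simply the
    positions of each document and of each sentence within its document.
    """
    columns: Dict[str, list] = {"tokens": [], "document_id": [], "sentence_id": []}
    for document_id, sentences in enumerate(documents):
        for sentence_id, tokens in enumerate(sentences):
            columns["tokens"].append(tokens)
            columns["document_id"].append(document_id)
            columns["sentence_id"].append(sentence_id)
    return columns


def predict_with_context(documents: List[List[List[str]]]):
    """Run the model on whole documents so it can use document-level context."""
    # Imported lazily so build_context_columns works without these packages.
    from datasets import Dataset
    from span_marker import SpanMarkerModel

    model = SpanMarkerModel.from_pretrained(
        "tomaarsen/span-marker-xlm-roberta-large-conllpp-doc-context"
    )
    dataset = Dataset.from_dict(build_context_columns(documents))
    return model.predict(dataset)
```

Calling `predict_with_context(documents)` downloads the model on first use, while `build_context_columns` needs only the standard library. How sentences are split and tokenized before this step is left open here.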

## Usage

To use this model for inference, first install the `span_marker` library:

```bash
pip install span_marker
```

You can then run inference with this model like so:

```python
from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-large-conllpp-doc-context")
# Run inference
entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
```
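
As a rough sketch of how the returned `entities` might be post-processed, the helpers below assume that each prediction is a dictionary with at least `span`, `label` and `score` keys, as in the examples shown in the SpanMarker README. The function names and the `min_score` threshold are hypothetical and not part of the library.

```python
from typing import Dict, List


def filter_entities(entities: List[dict], min_score: float = 0.5) -> List[dict]:
    """Keep only predictions whose confidence is at least min_score."""
    return [entity for entity in entities if entity["score"] >= min_score]


def group_by_label(entities: List[dict]) -> Dict[str, List[str]]:
    """Collect the predicted spans under their entity label."""
    grouped: Dict[str, List[str]] = {}
    for entity in entities:
        grouped.setdefault(entity["label"], []).append(entity["span"])
    return grouped
```

For example, `group_by_label(filter_entities(entities, min_score=0.8))` would give a mapping from each label to the spans predicted with a score of at least 0.8; the threshold is an arbitrary illustration and should be tuned for the task at hand.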

See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.