Update README.md
README.md
CHANGED
@@ -1,3 +1,30 @@
---
license: cc-by-nc-4.0
---

## Model Specification
- This is the **state-of-the-art Twitter NER model (74.35% entity-level F1)** on the Tweebank V2 NER benchmark (also called `Tweebank-NER`). It was trained on the combined Tweebank-NER and WNUT 17 training data.
- For more details about the `TweebankNLP` project, please refer to [our paper](https://arxiv.org/pdf/2201.07281.pdf) and the [GitHub](https://github.com/social-machines/TweebankNLP) page.
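
Entity-level F1 counts a predicted entity as correct only when its span boundaries and its type both match a gold entity exactly. The following is a minimal sketch of that computation, assuming BIO-style tags; the helper names `extract_entities` and `entity_f1` are hypothetical, and this is not the evaluation script used in the paper.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in one BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_sequences, pred_sequences):
    """Micro-averaged entity-level F1 over parallel lists of tag sequences."""
    true_pos = n_gold = n_pred = 0
    for gold, pred in zip(gold_sequences, pred_sequences):
        gold_spans, pred_spans = extract_entities(gold), extract_entities(pred)
        true_pos += len(gold_spans & pred_spans)  # exact span and type matches
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up tags, not benchmark data): one of two gold entities is found.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

In this sketch a stray `I-` tag with no preceding `B-` of the same type is ignored rather than repaired, which is one of several possible conventions.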

## How to use the model

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Download (or load from the local cache) the tokenizer and the NER model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")
model = AutoModelForTokenClassification.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")
```
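
The snippet above only loads the tokenizer and the model. As a rough sketch of one way to get per-token tags out of them, the helpers below run a forward pass and take the highest-scoring label at each position. The function names `labels_from_logits` and `tag_text` are hypothetical, and the label names are read from `model.config.id2label` rather than assumed.

```python
def labels_from_logits(logits, id2label):
    """Pick the highest-scoring label for each token position.

    `logits` is a list of per-token score lists; `id2label` maps class index to tag.
    """
    return [id2label[max(range(len(row)), key=row.__getitem__)] for row in logits]


def tag_text(text, tokenizer, model):
    """Run the model on one string and pair each sub-word token with its predicted tag."""
    import torch  # imported lazily so labels_from_logits works without PyTorch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # shape: (sequence_length, num_labels)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    tags = labels_from_logits(logits.tolist(), model.config.id2label)
    return list(zip(tokens, tags))
```

Calling `tag_text("some tweet text", tokenizer, model)` with the objects loaded above returns a list of `(token, tag)` pairs. The pairs are at the sub-word level and include the tokenizer's special tokens, so merging them back into whole words is left to the caller.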

## References

If you use this repository in your research, please cite [our paper](https://arxiv.org/pdf/2201.07281.pdf):

```bibtex
@article{jiang2022tweetnlp,
    title={Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis},
    author={Jiang, Hang and Hua, Yining and Beeferman, Doug and Roy, Deb},
    journal={In Proceedings of the 13th Language Resources and Evaluation Conference (LREC)},
    year={2022}
}
```