browndw/docusco-bert


browndw committed on May 14, 2021

Commit 420b806 · 1 Parent(s): 108c904

Update README.md

Files changed (1)

README.md +6 -6


@@ -25,7 +25,7 @@ tokenizer = AutoTokenizer.from_pretrained("browndw/docusco-bert")
 model = AutoModelForTokenClassification.from_pretrained("browndw/docusco-bert")
 nlp = pipeline("ner", model=model, tokenizer=tokenizer)
-example = "My name is Wolfgang and I live in Berlin"
+example = "Globalization is the process of interaction and integration among people, companies, and governments worldwide."
 ner_results = nlp(example)
 print(ner_results)
@@ -33,7 +33,7 @@ print(ner_results)
 #### Limitations and bias
-This model is limited by its training dataset of entity-annotated news articles from a specific span of time. This may not generalize well for all use cases in different domains. Furthermore, the model occassionally tags subword tokens as entities and post-processing of results may be necessary to handle those cases.
+This model is limited by its training dataset of American English texts. Moreover, the current version is trained on only a small subset of the corpus. The goal is to train later versions on more data, which should increase accuracy.
 ## Training data
@@ -53,12 +53,12 @@ This model was trained on a single 2.3 GHz Dual-Core Intel Core i5 with recommen
 ### Overall
 metric|test
 -|-
-f1 |66.3
-accuracy |74.7
+f1 |.663
+accuracy |.747
 ### By category
-precision|recall|f1-score|support
--|-|-|-
+category|precision|recall|f1-score|support
+-|-|-|-|-
 AcademicTerms|0.69|0.70|0.69|54204
 AcademicWritingMoves|0.31|0.40|0.35|2860
 Character|0.68|0.70|0.69|86213
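
The By category rows list precision, recall, and f1-score side by side, so the f1-score column can be sanity-checked from the other two. The following is a minimal sketch, assuming the f1-score column is the standard harmonic mean of precision and recall. The function name `f1_from_pr` and the `rows` dict are illustrative and are not part of the model repository. Because the reported precision and recall are already rounded to two decimals, a recomputed value may differ from the reported one in the last digit.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the By category table.
rows = {
    "AcademicTerms": (0.69, 0.70, 0.69),
    "AcademicWritingMoves": (0.31, 0.40, 0.35),
    "Character": (0.68, 0.70, 0.69),
}

for category, (p, r, reported) in rows.items():
    recomputed = round(f1_from_pr(p, r), 2)
    print(f"{category}: recomputed={recomputed:.2f} reported={reported:.2f}")
```

The same check can be applied to any further rows of the table by adding them to `rows`.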