Update README.md
README.md
CHANGED
@@ -25,7 +25,7 @@ tokenizer = AutoTokenizer.from_pretrained("browndw/docusco-bert")
 model = AutoModelForTokenClassification.from_pretrained("browndw/docusco-bert")

 nlp = pipeline("ner", model=model, tokenizer=tokenizer)
-example = "
+example = "Globalization is the process of interaction and integration among people, companies, and governments worldwide."

 ner_results = nlp(example)
 print(ner_results)
@@ -33,7 +33,7 @@ print(ner_results)

 #### Limitations and bias

-This model is limited by its training dataset of
+This model is limited by its training dataset of American English texts. Moreover, the current version is trained on only a small subset of the corpus. The goal is to train later versions on more data, which should increase accuracy.

 ## Training data

@@ -53,12 +53,12 @@ This model was trained on a single 2.3 GHz Dual-Core Intel Core i5 with recommen
 ### Overall
 metric|test
 -|-
-f1
-accuracy
+f1 |.663
+accuracy |.747

 ### By category
-precision|recall|f1-score|support
-
+category|precision|recall|f1-score|support
+-|-|-|-|-
 AcademicTerms|0.69|0.70|0.69|54204
 AcademicWritingMoves|0.31|0.40|0.35|2860
 Character|0.68|0.70|0.69|86213
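A quick consistency check on the per-category rows in this change: the f1-score column can be recomputed from the precision and recall columns. This is a minimal sketch, assuming the table uses the standard F1 definition (the harmonic mean of precision and recall), which the README does not state explicitly. The helper name `f1_from_pr` is hypothetical, and the values are copied from the three rows visible in the diff.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed, not stated in the README)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score), copied from the rows shown in the diff
rows = {
    "AcademicTerms": (0.69, 0.70, 0.69),
    "AcademicWritingMoves": (0.31, 0.40, 0.35),
    "Character": (0.68, 0.70, 0.69),
}

for category, (p, r, reported) in rows.items():
    # Round to two decimals to match the precision of the table
    recomputed = round(f1_from_pr(p, r), 2)
    print(f"{category}: recomputed={recomputed:.2f} reported={reported:.2f}")
```

Because the inputs are already rounded to two decimals, a recomputed value could in general differ from a reported one by 0.01. For these three rows the recomputed and reported values agree.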