Update README.md
README.md
CHANGED
@@ -6,10 +6,12 @@ datasets: COCA
|
|
6 |
|
7 |
## Model description
|
8 |
|
9 |
-
**docusco-bert** is a fine-tuned BERT model that is ready to use for **token classification**. The model was trained on data from the Corpus of Contemporary American English ([COCA](https://www.english-corpora.org/coca/)) and classifies tokens and token sequences according to a system developed for the [DocuScope](https://www.cmu.edu/dietrich/english/research-and-publications/docuscope.html) dictionary-based tagger. Descriptions of the categories are included in a table below.
|
10 |
|
11 |
## About DocuScope
|
12 |
-
DocuScope is a dictionary-based tagger that has been developed at Carnegie Mellon University by David Kaufer and Suguru Ishizaki since the early 2000s. Its categories are rhetorical in their orientation (as opposed to part-of-speech tags, for example, which are morphosyntactic).
|
13 |
|
14 |
## Intended uses & limitations
|
15 |
|
@@ -147,7 +149,17 @@ Updates|References updates that anticipate someone searching for information and
|
|
147 |
|
148 |
|
149 |
### BibTeX entry and citation info
|
150 |
-
|
151 |
```
|
152 |
@article{DBLP:journals/corr/abs-1810-04805,
|
153 |
author = {Jacob Devlin and
|
|
|
6 |
|
7 |
## Model description
|
8 |
|
9 |
+
**docusco-bert** is a fine-tuned BERT model that is ready to use for **token classification**. The model was trained on data from the Corpus of Contemporary American English ([COCA](https://www.english-corpora.org/coca/)) and classifies tokens and token sequences according to a system developed for the [**DocuScope**](https://www.cmu.edu/dietrich/english/research-and-publications/docuscope.html) dictionary-based tagger. Descriptions of the categories are included in a table below.
|
10 |
|
11 |
## About DocuScope
|
12 |
+
DocuScope is a dictionary-based tagger that has been developed at Carnegie Mellon University by **David Kaufer** and **Suguru Ishizaki** since the early 2000s. Its categories are rhetorical in their orientation (as opposed to part-of-speech tags, for example, which are morphosyntactic).
|
13 |
+
|
14 |
+
DocuScope has been used in [a wide variety of studies](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=docuscope&btnG=). Here, for example, is [a short analysis of King Lear](https://graphics.cs.wisc.edu/WP/vep/2017/02/14/guest-post-data-mining-king-lear/), and here is [a published study of tweets](https://journals.sagepub.com/doi/full/10.1177/2055207619844865).
|
15 |
|
16 |
## Intended uses & limitations
|
17 |
|
|
|
149 |
|
150 |
|
151 |
### BibTeX entry and citation info
|
152 |
+
```
|
153 |
+
@incollection{ishizaki2012computer,
|
154 |
+
title = {Computer-aided rhetorical analysis},
|
155 |
+
author = {Ishizaki, Suguru and Kaufer, David},
|
156 |
+
booktitle= {Applied natural language processing: Identification, investigation and resolution},
|
157 |
+
pages = {276--296},
|
158 |
+
year = {2012},
|
159 |
+
publisher= {IGI Global},
|
160 |
+
url = {https://www.igi-global.com/chapter/content/61054}
|
161 |
+
}
|
162 |
+
```
|
163 |
```
|
164 |
@article{DBLP:journals/corr/abs-1810-04805,
|
165 |
author = {Jacob Devlin and
|