Jayveersinh-Raj committed 260a2c5 (parent: fb44f03): Update README.md

Files changed (1): README.md (+221 -0)

---
license: apache-2.0
tags:
- Cross-lingual-nlp
- zero-shot-transfer
- toxicity-analysis
- abuse-detection
- flag-user
- block-user
- multilinguality
- XLM-R
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This model is intended to help developers, especially those with little to no NLP experience, use it directly to flag or block abusive users on their platforms. It is designed to work with any language supported by the XLM-R embedder, whose aligned vector space enables zero-shot cross-lingual transfer. Keywords: abuse detection, toxicity analysis, obscene language detection.

Languages supported:
- Afrikaans
- Albanian
- Amharic
- Arabic
- Armenian
- Assamese
- Azerbaijani
- Basque
- Belarusian
- Bengali
- Bhojpuri
- Bosnian
- Bulgarian
- Burmese
- Catalan
- Cebuano
- Chewa
- Chinese (Simplified)
- Chinese (Traditional)
- Chittagonian
- Corsican
- Croatian
- Czech
- Danish
- Deccan
- Dutch
- English
- Esperanto
- Estonian
- Filipino
- Finnish
- French
- Frisian
- Galician
- Georgian
- German
- Greek
- Gujarati
- Haitian Creole
- Hausa
- Hawaiian
- Hebrew
- Hindi
- Hmong
- Hungarian
- Icelandic
- Igbo
- Indonesian
- Irish
- Italian
- Japanese
- Javanese
- Kannada
- Kazakh
- Khmer
- Kinyarwanda
- Kirundi
- Korean
- Kurdish
- Kyrgyz
- Lao
- Latin
- Latvian
- Lithuanian
- Luxembourgish
- Macedonian
- Malagasy
- Malay
- Malayalam
- Maltese
- Maori
- Marathi
- Mongolian
- Nepali
- Norwegian
- Oriya
- Oromo
- Pashto
- Persian
- Polish
- Portuguese
- Punjabi
- Quechua
- Romanian
- Russian
- Samoan
- Scots Gaelic
- Serbian
- Shona
- Sindhi
- Sinhala
- Slovak
- Slovenian
- Somali
- Spanish
- Sundanese
- Swahili
- Swedish
- Tajik
- Tamil

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [Jayveersinh Raj](https://www.linkedin.com/in/jayveersinh-raj-67694222a/), [Khush Patel](https://www.linkedin.com/in/khush-patel-kp/)
- **Model type:** Cross-lingual zero-shot transfer
- **Language(s) (NLP):** Trained on English; multilingual via zero-shot transfer (see the list above)
- **Framework(s):** PyTorch, ONNX
- **License:** apache-2.0

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/Jayveersinh-Raj/cross-lingual-zero-shot-transfer
- **Paper [optional]:** See the GitHub repository above; please star it if you find it useful.
- **Demo [optional]:**

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
As summarized above, the model lets developers, including those with little to no NLP experience, flag or block abusive users on their platforms in any language supported by the XLM-R aligned vector space.

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
Use the model directly from the Hugging Face Hub; no fine-tuning is required.
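
A minimal usage sketch with the 🤗 Transformers `pipeline` API. The checkpoint id and the label names below are placeholders, not confirmed by this card:

```python
from transformers import pipeline

# Placeholder Hub id -- replace with this model's actual checkpoint id.
classifier = pipeline(
    "text-classification",
    model="Jayveersinh-Raj/<this-model-id>",
)

result = classifier("some user comment, in any supported language")
print(result)  # e.g. [{'label': 'toxic', 'score': 0.98}];
               # label names come from the model config and are assumed here.
```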


### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
Fine-tuning is not needed; the model already performs well. However, it can be fine-tuned to add languages written in a different script, since the model does not transfer to a language whose script differs from the source. A fine-tuning sketch follows.

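A minimal fine-tuning sketch with the 🤗 Transformers `Trainer`, assuming a standard sequence-classification head; the checkpoint id, label mapping, and toy dataset are placeholders to replace with labeled comments in the new script:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "Jayveersinh-Raj/<this-model-id>"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

# Toy in-memory dataset; in practice, load labeled comments in the new script.
data = Dataset.from_dict({
    "text": ["an example abusive comment", "an example harmless comment"],
    "label": [1, 0],  # assumed mapping: 1 = toxic, 0 = non-toxic
})
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=3),
    train_dataset=data,
)
trainer.train()
```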

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
The model does not work on a language written in a different script from the source; a transliteration layer has not been added yet.
Moreover, the model mostly flags severe toxicity, since toxicity is a subjective matter. In the context of flagging or blocking users, however, severity is what matters most, and the model is well balanced in that respect.


## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->
Toxicity is a subjective issue, but the model is well balanced toward flagging mostly severe toxicity. In our evaluation it never flagged a non-toxic sentence as toxic (100% accuracy on non-toxic examples), which makes it a good choice for flagging or blocking users. For very low-resource languages the model may misclassify, although performance remains good.


### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

## How to Get Started with the Model

Use the code below to get started with the model.

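A minimal sketch, assuming a standard sequence-classification head; the checkpoint id and the label mapping are placeholders to verify against `model.config.id2label`:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "Jayveersinh-Raj/<this-model-id>"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

text = "some user comment, in any supported language"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Assumed label mapping (verify against model.config.id2label): 1 = toxic.
if logits.argmax(dim=-1).item() == 1:
    print("flag/block: severe toxicity detected")
else:
    print("non-toxic")
```
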
## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The training data comes from Google Jigsaw and Wikipedia. The training language is English; multilinguality is achieved zero-shot through vector space alignment.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

We merged all the sub-categories of toxicity into a single "toxic" super-category, since all of them are severe and flaggable and/or blockable; a sketch of the merge is shown below.
Class imbalance was present, but the transformer architecture handles it well.
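
A sketch of the label merge, assuming the standard Jigsaw sub-category column names; the actual preprocessing code lives in the GitHub repository:

```python
import pandas as pd

# Standard Jigsaw toxic-comment sub-category columns (assumed names).
SUBCATEGORIES = ["toxic", "severe_toxic", "obscene", "threat",
                 "insult", "identity_hate"]

df = pd.read_csv("train.csv")  # Jigsaw training split
# A comment joins the "toxic" super-category if any sub-label is set.
df["label"] = (df[SUBCATEGORIES].max(axis=1) > 0).astype(int)
```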

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
The model performed better than both GPT-4 and a human annotator who had labeled the test-set comments as toxic. We arrived at this conclusion because (1) the predictions were manually checked, and (2) GPT-4 refused to generate toxic sentences outright, yet when shown test-set texts that our model had flagged as non-toxic but the annotator had marked as toxic, GPT-4 translated them and judged them toxic while conceding they were not toxic enough to be blocked or flagged. Hence, our model is near-perfect in this regard, though the limitations and risks above should still be taken into account.

### Testing Data, Factors & Metrics
1. Tested on human annotations
2. Tested on GPT-4-generated texts

#### Testing Data

<!-- This should link to a Data Card if possible. -->
The test dataset is available in the GitHub repository linked above.

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->
Top-1 accuracy, since our data contains multiple languages.

### Results
1. Tested on human annotations &rarr; 100% on non-toxic sentences; better than the human annotator, as discussed under Evaluation.
2. Tested on GPT-4-generated texts &rarr; 100%

#### Summary
Our model is well suited to flagging or blocking users who post severely toxic comments, such as swear words or abusive slang. It fits this purpose because it flags only severe toxicity and is 100% accurate on non-toxic comments. All of the considerations above should be weighed before use. It supports every language covered by the XLM-R aligned vector space; the full list appears at the top of this card.