This model is a text classification model trained on a large dataset of comments.
This model is intended to automatically detect and flag potentially biased language in user-generated comments on various online platforms. It can also serve as a component in a larger pipeline for text classification, sentiment analysis, or bias detection; a sketch of such a pipeline follows the usage example below.

You can use the model directly with the `transformers` library:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("shainaraza/toxity_classify_debiaser")
model = AutoModelForSequenceClassification.from_pretrained("shainaraza/toxity_classify_debiaser")

# Test the model with a sample comment
comment = "you are a dumb person."
inputs = tokenizer(comment, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=1).item()

print(f"Comment: {comment}")
print(f"Prediction: {'biased' if prediction == 1 else 'not biased'}")
```
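
For use as a component in a larger moderation pipeline, the standard `transformers` pipeline API is often convenient. The sketch below is illustrative, not part of the original card: the exact label strings returned depend on this model's `id2label` config, so inspect them before keying any logic off a particular name.

```python
from transformers import pipeline

# Load the model into a text-classification pipeline, which handles
# tokenization, inference, and softmax internally.
classifier = pipeline("text-classification", model="shainaraza/toxity_classify_debiaser")

comments = [
    "you are a dumb person.",
    "thanks for the helpful explanation!",
]

# Classify a batch of comments; each result is a dict with a label and a score.
# NOTE: the label names come from the model's config (assumption, not verified).
for comment, result in zip(comments, classifier(comments)):
    print(f"{result['label']} ({result['score']:.2f}): {comment}")
```
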
## Training data

The model was trained on a labeled dataset of comments from various online platforms, annotated as toxic or non-toxic by human annotators. The data was cleaned and preprocessed before training, and data augmentation techniques were applied to expand the training set and improve the model's robustness to different types of bias.
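
The card does not say which augmentation techniques were used, so the sketch below is purely illustrative: it shows one simple, common text augmentation (random word swaps, in the style of "easy data augmentation") applied to a toy labeled comment set, as an example of how augmented copies can expand training data while preserving labels.

```python
import random

def random_swap(text, n_swaps=1, seed=None):
    """Return a copy of `text` with n_swaps random pairs of words swapped."""
    rng = random.Random(seed)
    words = text.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

# Toy labeled data: (comment, label) with 1 = toxic/biased, 0 = not.
labeled = [("you are a dumb person.", 1), ("have a great day!", 0)]

# Augmented copies keep the original label, doubling the toy training set.
augmented = labeled + [(random_swap(text, seed=0), label) for text, label in labeled]
for text, label in augmented:
    print(label, text)
```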