About the model
This model is designed for text classification, specifically for identifying offensive content in Turkish text. The model classifies text into five categories: INSULT, OTHER, PROFANITY, RACIST, and SEXIST.
Model Metrics
| | INSULT | OTHER | PROFANITY | RACIST | SEXIST |
|---|---|---|---|---|---|
| Precision | 0.901 | 0.924 | 0.978 | 1.000 | 0.980 |
| Recall | 0.920 | 0.980 | 0.900 | 0.980 | 1.000 |
| F1 Score | 0.910 | 0.951 | 0.937 | 0.989 | 0.990 |
- F-Score: 0.956
- Recall: 0.956
- Precision: 0.957
- Accuracy: 0.956
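Scores like these are typically computed from test-set predictions. As a minimal sketch of how such per-class and averaged metrics can be produced with scikit-learn (not part of the original card; the label arrays below are placeholders):

```python
from sklearn.metrics import classification_report

labels = ["INSULT", "OTHER", "PROFANITY", "RACIST", "SEXIST"]

# Placeholder arrays: in practice these are the test-set gold labels
# and the model's predicted class indices.
y_true = [0, 1, 2, 3, 4, 1]
y_pred = [0, 1, 2, 3, 4, 0]

# Per-class precision/recall/F1 plus macro and weighted averages,
# matching the kind of table shown above.
print(classification_report(y_true, y_pred, target_names=labels, digits=3))
```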
Training Information
- Device: macOS 14.5 23F79 arm64 | GPU: Apple M2 Max | Memory: 5840MiB / 32768MiB
- Training completed in 0:22:54 (hh:mm:ss)
- Optimizer: AdamW
- Learning rate: 2e-5
- Epsilon (eps): 1e-8
- Epochs: 10
- Batch size: 64
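For reference, a PyTorch sketch of this optimizer setup, assuming fine-tuning with Hugging Face transformers. The base checkpoint name is an assumption, since the card does not state which Turkish BERT the model was fine-tuned from:

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification

# Assumed base checkpoint; the card does not name the starting model.
model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-turkish-cased", num_labels=5
)

# Hyperparameters taken from the list above.
optimizer = AdamW(model.parameters(), lr=2e-5, eps=1e-8)
EPOCHS = 10
BATCH_SIZE = 64

# Use the Apple-silicon GPU (MPS) if available, matching the M2 Max device above.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model.to(device)
```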
Dependencies
```bash
pip install torch torchvision torchaudio
pip install tf-keras
pip install transformers
pip install tensorflow
```
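After installing, a quick sanity check that the stack is importable (a sketch, not part of the original card):

```python
import tensorflow as tf
import torch
import transformers

# Print versions to confirm the installs above succeeded.
print("torch:", torch.__version__)
print("tensorflow:", tf.__version__)
print("transformers:", transformers.__version__)
```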
Example
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, TextClassificationPipeline

# Load the tokenizer and model
model_name = "nanelimon/bert-base-turkish-offensive"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

# Create the pipeline; top_k=2 returns the two highest-scoring labels
# (return_all_scores is deprecated in recent transformers versions)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, top_k=2)

# Test the pipeline
print(pipe('Bu bir denemedir hadi sende dene!'))
```
Result:
```
[[{'label': 'OTHER', 'score': 1.000}, {'label': 'INSULT', 'score': 0.000}]]
```
- `label`: the class the model assigns to the given Turkish text.
- `score`: the model's confidence that the text belongs to that class.
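To get a score for every one of the five classes rather than only the top two, `top_k` can be set to `None`. A usage sketch building on the pipeline example above:

```python
# Return scores for all five labels instead of only the top two.
pipe_all = TextClassificationPipeline(model=model, tokenizer=tokenizer, top_k=None)

# The pipeline also accepts a list of texts for batch classification.
results = pipe_all(["Bu bir denemedir hadi sende dene!", "Harika bir gün!"])
for scores in results:
    print(scores)
```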
Authors
- Seyma SARIGIL: [email protected]
License
gpl-3.0
Free Software, Hell Yeah!