Model Card for uvegesistvan/wildmann_german_proposal_2b_german_to_english
Model Overview
This model is a multi-class emotion classifier trained on German text that was machine-translated into English. It assigns each input to one of nine classes: eight emotions plus a "No emotion" class. The reported results therefore reflect the effect of training on machine-translated rather than original-language data.
Emotion Classes
The model predicts one of the following classes (label IDs in parentheses):
- Anger (0)
- Fear (1)
- Disgust (2)
- Sadness (3)
- Joy (4)
- Enthusiasm (5)
- Hope (6)
- Pride (7)
- No emotion (8)
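The label IDs above can be turned into a lookup table for decoding raw model outputs. The following is a minimal sketch, assuming the model returns one score per class in the ID order listed; the names `ID2LABEL`, `LABEL2ID`, and `decode` are illustrative and not part of the published model.

```python
# Class IDs and names as listed in this model card.
ID2LABEL = {
    0: "Anger",
    1: "Fear",
    2: "Disgust",
    3: "Sadness",
    4: "Joy",
    5: "Enthusiasm",
    6: "Hope",
    7: "Pride",
    8: "No emotion",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode(scores):
    """Return (class id, class name) for the highest of nine class scores.

    Assumption: `scores` holds one value per class, ordered by class ID.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, ID2LABEL[best]
```

For example, `decode([0.1, 0, 0, 0.7, 0, 0, 0, 0, 0.2])` selects class 3, "Sadness".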
Dataset and Preprocessing
The dataset consists of German text that has been machine-translated into English and annotated for emotional content. Preprocessing included normalization of translated text to reduce noise introduced by translation errors. Undersampling was applied to balance the most frequent classes ("Anger" and "No emotion") with less frequent ones to ensure equitable learning across all labels.
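Random undersampling of this kind can be sketched as follows. This is an illustration only, not the authors' preprocessing code: the per-class limit `cap`, the `seed`, and the `(text, label)` tuple format are assumptions made for the example.

```python
import random
from collections import defaultdict


def undersample(examples, cap, seed=0):
    """Keep at most `cap` randomly chosen examples per label.

    `examples` is a list of (text, label) tuples. Classes with `cap`
    or fewer examples are kept unchanged.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    balanced = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) > cap:
            # Draw `cap` examples without replacement from oversized classes.
            group = rng.sample(group, cap)
        balanced.extend(group)

    # Shuffle so that classes are not grouped together in the output.
    rng.shuffle(balanced)
    return balanced
```

Choosing `cap` close to the size of the rarer classes trades training data from the frequent classes for a flatter label distribution.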
Evaluation Metrics
The model was evaluated using precision, recall, F1-score, and accuracy metrics. Below are the detailed performance metrics:
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Anger (0) | 0.54 | 0.58 | 0.56 | 777 |
Fear (1) | 0.88 | 0.73 | 0.80 | 776 |
Disgust (2) | 0.93 | 0.94 | 0.94 | 776 |
Sadness (3) | 0.86 | 0.84 | 0.85 | 775 |
Joy (4) | 0.82 | 0.81 | 0.82 | 777 |
Enthusiasm (5) | 0.61 | 0.62 | 0.62 | 776 |
Hope (6) | 0.52 | 0.52 | 0.52 | 777 |
Pride (7) | 0.75 | 0.80 | 0.77 | 776 |
No emotion (8) | 0.64 | 0.65 | 0.65 | 1553 |
Overall Metrics
- Accuracy: 0.71
- Macro Average: Precision = 0.73, Recall = 0.72, F1-Score = 0.72
- Weighted Average: Precision = 0.72, Recall = 0.71, F1-Score = 0.72
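The two averages differ only in how classes are weighted: the macro average treats every class equally, while the weighted average scales each class by its support. The sketch below recomputes both from the per-class table; the function names are illustrative. Because the inputs are already rounded to two decimals, a recomputed figure can differ from the reported one in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the table above.
REPORT = {
    "Anger": (0.54, 0.58, 0.56, 777),
    "Fear": (0.88, 0.73, 0.80, 776),
    "Disgust": (0.93, 0.94, 0.94, 776),
    "Sadness": (0.86, 0.84, 0.85, 775),
    "Joy": (0.82, 0.81, 0.82, 777),
    "Enthusiasm": (0.61, 0.62, 0.62, 776),
    "Hope": (0.52, 0.52, 0.52, 777),
    "Pride": (0.75, 0.80, 0.77, 776),
    "No emotion": (0.64, 0.65, 0.65, 1553),
}
PRECISION, RECALL, F1, SUPPORT = 0, 1, 2, 3


def macro_average(report, column):
    """Unweighted mean of one metric over all classes."""
    values = [row[column] for row in report.values()]
    return sum(values) / len(values)


def weighted_average(report, column):
    """Mean of one metric, weighting each class by its support."""
    total = sum(row[SUPPORT] for row in report.values())
    return sum(row[column] * row[SUPPORT] for row in report.values()) / total
```

Since "No emotion" has roughly twice the support of any other class, its below-average scores pull the weighted averages slightly under the macro averages.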
Performance Insights
The model performs best on "Disgust" (F1 0.94), followed by "Sadness" (0.85), "Joy" (0.82), and "Fear" (0.80). "Hope" (0.52), "Anger" (0.56), and "Enthusiasm" (0.62) score markedly lower. For "Hope" and "Enthusiasm" this is likely due to inherent challenges in identifying subtle emotions within machine-translated text.
Model Usage
Applications
- Emotion analysis of German texts via machine-translated English representations.
- Detecting emotional tone in multilingual datasets where German-English translations are present.
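A typical way to apply the model is through the Hugging Face `transformers` text-classification pipeline. The sketch below assumes the checkpoint loads as a standard sequence-classification model and that the input has already been translated from German into English; the translation step is not shown. The helper names are illustrative.

```python
MODEL_ID = "uvegesistvan/wildmann_german_proposal_2b_german_to_english"


def classify_translated(english_texts, model_id=MODEL_ID):
    """Classify English texts that were machine-translated from German.

    Returns one {"label": ..., "score": ...} dict per input text.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return classifier(list(english_texts))


def pair_with_predictions(english_texts, predictions):
    """Attach each top prediction to its input text."""
    return [
        (text, pred["label"], round(pred["score"], 3))
        for text, pred in zip(english_texts, predictions)
    ]
```

Calling `pair_with_predictions(texts, classify_translated(texts))` downloads the model on first use. The label strings depend on the model's configuration: if they appear as generic names such as `LABEL_3`, the trailing number corresponds to the class IDs listed under Emotion Classes.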
Limitations
- Performance depends on the quality of the machine-translated text. Errors in translation could propagate and affect classification results.
- Subtle or ambiguous emotional states may be misclassified due to translation noise or lack of context.
Ethical Considerations
As the dataset is machine-translated, cultural and linguistic nuances might be lost, leading to potential biases or misinterpretations. Users should exercise caution when applying the model to sensitive domains such as mental health or social research.