---
language: cs
license: mit
tags:
- emotion-classification
- text-analysis
- machine-translation
metrics:
- precision
- recall
- f1-score
- accuracy
---

# Model Card for uvegesistvan/wildmann_german_proposal_2b_german_to_czech

## Model Overview

This model is a multi-class emotion classifier trained on German-to-Czech machine-translated text data. It identifies nine distinct emotional states in text and demonstrates how machine-translated datasets can support emotion classification tasks across different languages.

### Emotion Classes

The model classifies the following emotional states:

- **Anger (0)**
- **Fear (1)**
- **Disgust (2)**
- **Sadness (3)**
- **Joy (4)**
- **Enthusiasm (5)**
- **Hope (6)**
- **Pride (7)**
- **No emotion (8)**

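
The label scheme above can be written down as a lookup table. The following is a minimal sketch, not the author's code: `ID_TO_EMOTION` mirrors the ids listed in this card, while `decode_prediction` and the example scores are hypothetical names and values introduced only for illustration.

```python
# Label ids as listed in this model card.
ID_TO_EMOTION = {
    0: "Anger",
    1: "Fear",
    2: "Disgust",
    3: "Sadness",
    4: "Joy",
    5: "Enthusiasm",
    6: "Hope",
    7: "Pride",
    8: "No emotion",
}


def decode_prediction(scores):
    """Return the emotion name for the highest-scoring class.

    `scores` is a hypothetical sequence of nine per-class scores
    (for example logits or probabilities), ordered by label id.
    """
    if len(scores) != len(ID_TO_EMOTION):
        raise ValueError("expected one score per emotion class")
    best_id = max(range(len(scores)), key=lambda i: scores[i])
    return ID_TO_EMOTION[best_id]


# Made-up scores; the largest value sits at index 4.
example_scores = [0.1, 0.0, 0.0, 0.2, 2.3, 0.4, 0.1, 0.0, 0.5]
print(decode_prediction(example_scores))
```

Keeping the mapping in one place avoids scattering numeric ids across downstream code.
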
### Dataset and Preprocessing

The dataset includes German text machine-translated into Czech and annotated for emotional content. Both synthetic and original German sentences were translated to create a diverse corpus. Preprocessing steps included:

- Balancing classes through undersampling of overrepresented labels, such as "No emotion" and "Anger."
- Normalization of text to handle inconsistencies from the machine translation process.

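
The card does not specify the exact preprocessing procedure, so the following is only an illustrative sketch of the two steps named above. It assumes Unicode NFC normalization plus whitespace collapsing as the normalization step and a random per-class cap as the undersampling step; `normalize_text`, `undersample`, `max_per_class`, and the toy sentences are all hypothetical.

```python
import random
import unicodedata
from collections import defaultdict


def normalize_text(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())


def undersample(examples, max_per_class, seed=0):
    """Keep at most `max_per_class` randomly chosen examples per label.

    `examples` is a list of (text, label_id) pairs.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for label in sorted(by_label):
        items = by_label[label]
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)
        balanced.extend(items)
    return balanced


# Toy data: label 8 ("No emotion") is overrepresented.
toy = [(f"věta  {i}", 8) for i in range(6)] + [("mám   radost", 4), ("bojím se", 1)]
cleaned = [(normalize_text(text), label) for text, label in toy]
balanced = undersample(cleaned, max_per_class=2)
print(len(balanced))
```

Capping each class rather than matching the smallest class is one possible design choice; the support column in the evaluation table suggests that "No emotion" remained roughly twice as large as the other classes.
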
### Evaluation Metrics

The model's performance was evaluated using standard classification metrics. Results are summarized below:

| Class          | Precision | Recall | F1-Score | Support |
|----------------|-----------|--------|----------|---------|
| Anger (0)      | 0.50      | 0.63   | 0.56     | 777     |
| Fear (1)       | 0.84      | 0.74   | 0.79     | 776     |
| Disgust (2)    | 0.91      | 0.94   | 0.93     | 776     |
| Sadness (3)    | 0.87      | 0.83   | 0.85     | 775     |
| Joy (4)        | 0.83      | 0.81   | 0.82     | 777     |
| Enthusiasm (5) | 0.61      | 0.61   | 0.61     | 776     |
| Hope (6)       | 0.54      | 0.46   | 0.50     | 777     |
| Pride (7)      | 0.75      | 0.81   | 0.78     | 776     |
| No emotion (8) | 0.66      | 0.64   | 0.65     | 1553    |

### Overall Metrics

- **Accuracy**: 0.71
- **Macro Average**: Precision = 0.72, Recall = 0.72, F1-Score = 0.72
- **Weighted Average**: Precision = 0.72, Recall = 0.71, F1-Score = 0.71

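
As a sanity check, the macro and weighted averages can be recomputed from the per-class table. The sketch below copies the rounded table values into a hypothetical `REPORT` dictionary; because the inputs are already rounded to two decimals, the results are approximations of the reported figures, not the original evaluation output.

```python
# (precision, recall, f1, support) per class, copied from the table above.
REPORT = {
    "Anger": (0.50, 0.63, 0.56, 777),
    "Fear": (0.84, 0.74, 0.79, 776),
    "Disgust": (0.91, 0.94, 0.93, 776),
    "Sadness": (0.87, 0.83, 0.85, 775),
    "Joy": (0.83, 0.81, 0.82, 777),
    "Enthusiasm": (0.61, 0.61, 0.61, 776),
    "Hope": (0.54, 0.46, 0.50, 777),
    "Pride": (0.75, 0.81, 0.78, 776),
    "No emotion": (0.66, 0.64, 0.65, 1553),
}

SUPPORT = 3  # index of the support column in each tuple


def macro_average(report, column):
    """Unweighted mean of one metric over all classes."""
    values = [row[column] for row in report.values()]
    return sum(values) / len(values)


def weighted_average(report, column):
    """Mean of one metric, weighted by each class's support."""
    total = sum(row[SUPPORT] for row in report.values())
    return sum(row[column] * row[SUPPORT] for row in report.values()) / total


for name, column in (("precision", 0), ("recall", 1), ("f1", 2)):
    macro = round(macro_average(REPORT, column), 2)
    weighted = round(weighted_average(REPORT, column), 2)
    print(f"{name}: macro={macro} weighted={weighted}")
```

For single-label classification, support-weighted recall equals accuracy, which is consistent with both being reported as 0.71.
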
### Performance Insights

The model performs best on "Disgust" (F1 0.93), "Sadness" (0.85), "Joy" (0.82), and "Fear" (0.79). "Hope" (0.50) and "Anger" (0.56) have the lowest F1-scores, potentially due to translation noise or subtle emotional cues being lost in machine translation.

## Model Usage

### Applications

- Emotion analysis of German texts translated into Czech.
- Sentiment tracking in Czech-language customer feedback derived from German text.
- Research on cross-linguistic emotion classification in multilingual datasets.

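
For these applications, one possible way to query the model is the `transformers` text-classification pipeline. This is a hedged sketch under two assumptions that the card does not confirm: that the checkpoint loads as a sequence-classification model, and that its outputs use the default `LABEL_<id>` names. `load_classifier`, `readable_label`, and the Czech example sentence are hypothetical.

```python
import re

MODEL_ID = "uvegesistvan/wildmann_german_proposal_2b_german_to_czech"

# Emotion names ordered by the label ids listed in this card.
EMOTIONS = [
    "Anger", "Fear", "Disgust", "Sadness", "Joy",
    "Enthusiasm", "Hope", "Pride", "No emotion",
]


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (requires `transformers`).

    Assumption: the checkpoint can be loaded as a sequence-classification model.
    """
    from transformers import pipeline  # imported lazily; optional dependency

    return pipeline("text-classification", model=model_id)


def readable_label(raw_label):
    """Map a default `LABEL_<id>` name to an emotion name.

    Any other label string is returned unchanged, in case the model
    configuration already carries human-readable names.
    """
    match = re.fullmatch(r"LABEL_(\d+)", raw_label)
    if match and int(match.group(1)) < len(EMOTIONS):
        return EMOTIONS[int(match.group(1))]
    return raw_label


# Example call (downloads the model, so it is left commented out):
# classifier = load_classifier()
# result = classifier("Mám z toho velkou radost.")[0]
# print(readable_label(result["label"]), result["score"])
```

Input is expected to be Czech text, since the model was trained on German-to-Czech translations.
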
### Limitations

- The model's performance is influenced by the quality of the machine-translated text, which may introduce biases or inaccuracies.
- Subtle emotional states like "Hope" may be harder to classify due to translation inconsistencies.

### Ethical Considerations

The reliance on machine-translated datasets means that cultural and linguistic nuances may be lost, potentially impacting classification accuracy. Users should carefully evaluate the model before applying it in sensitive areas, such as mental health or customer sentiment analysis.

### Citation

For further information, visit: [uvegesistvan/wildmann_german_proposal_2b_german_to_czech](#)