---
datasets:
- Helsinki-NLP/kde4
language:
- en
- fr
metrics:
- sacrebleu
base_model:
- Helsinki-NLP/opus-mt-en-fr
pipeline_tag: translation
library_name: transformers
---

# Marian Fine-tuned English-French Translation Model

## Model Description

This model is a fine-tuned version of `Helsinki-NLP/opus-mt-en-fr` for English-to-French translation. The base model was further trained on the `KDE4` dataset to improve translation quality for technical and software-related content.

## Model Training Details

### Training Dataset

- **Dataset**: KDE4 dataset (English-French parallel corpus)
- **Split Distribution**:
  - Training set: 189,155 examples (90%)
  - Test set: 21,018 examples (10%)

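The two split sizes are consistent with a 90/10 partition of the 210,173 sentence pairs they add up to. The snippet below is only an arithmetic check of the reported counts. How the split was actually produced (function, random seed) is not stated in this card, so nothing about that is assumed here.

```python
# Split sizes reported in this card.
train_examples = 189_155
test_examples = 21_018

total_examples = train_examples + test_examples

# Integer arithmetic avoids floating-point rounding:
# 90% of the total, rounded down to a whole example.
expected_train = total_examples * 9 // 10

print(total_examples)                             # 210173
print(expected_train == train_examples)           # True
print(round(train_examples / total_examples, 4))  # 0.9
```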
### Training Configuration

- **Base Model**: Helsinki-NLP/opus-mt-en-fr
- **Training Arguments**:
  - Learning rate: 2e-5
  - Batch size: 32 (training), 64 (evaluation)
  - Number of epochs: 10
  - Weight decay: 0.01
  - FP16 training enabled
  - Evaluation strategy: Before and after training
  - Checkpoint saving: Every epoch (maximum 3 saved)
  - Training device: GPU with mixed precision (fp16)

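The hyperparameters above map naturally onto keyword arguments of `Seq2SeqTrainingArguments` in `transformers`. The sketch below collects them in a plain dictionary and derives the resulting number of optimizer steps. The mapping to keyword names, the `training_kwargs` name, and the single-device, no-gradient-accumulation setup are assumptions, since this card does not state them.

```python
import math

# Hyperparameters from this card, keyed by the matching
# Seq2SeqTrainingArguments keyword names (assumed mapping).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "num_train_epochs": 10,
    "weight_decay": 0.01,
    "fp16": True,              # mixed precision; requires a GPU
    "save_strategy": "epoch",  # write a checkpoint every epoch
    "save_total_limit": 3,     # keep at most 3 checkpoints
}

train_examples = 189_155  # training split size from this card

# Assumption: one device and no gradient accumulation,
# so one optimizer step per batch of 32 examples.
steps_per_epoch = math.ceil(
    train_examples / training_kwargs["per_device_train_batch_size"]
)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]

print(steps_per_epoch)  # 5912
print(total_steps)      # 59120
```

With `transformers` installed, a dictionary like this could be unpacked into `Seq2SeqTrainingArguments` together with an `output_dir` of your choice. Evaluating only before and after training would then most likely be done through explicit `trainer.evaluate()` calls rather than a per-epoch schedule, though that detail is also an assumption.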
## Model Results

### Evaluation Metrics

The model was evaluated with the BLEU score (SacreBLEU). The results before and after training are summarized in the table below:

| **Stage**           | **Eval Loss** | **BLEU Score** |
|---------------------|---------------|----------------|
| **Before Training** | 1.700         | 38.97          |
| **After Training**  | 0.796         | 54.96          |

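To put the table in perspective, the following sketch derives the absolute BLEU gain, the relative reduction in evaluation loss, and the corresponding perplexities. It assumes the evaluation loss is the mean per-token cross-entropy in nats, which is the usual convention for `transformers` sequence-to-sequence models but is not stated in this card.

```python
import math

# Values copied from the evaluation table in this card.
before = {"eval_loss": 1.700, "bleu": 38.97}
after = {"eval_loss": 0.796, "bleu": 54.96}

# Absolute BLEU improvement and relative loss reduction.
bleu_gain = after["bleu"] - before["bleu"]
loss_reduction = (before["eval_loss"] - after["eval_loss"]) / before["eval_loss"]

# Assumption: eval loss is mean token-level cross-entropy in nats,
# so perplexity is exp(loss).
ppl_before = math.exp(before["eval_loss"])
ppl_after = math.exp(after["eval_loss"])

print(f"BLEU gain: {bleu_gain:.2f} points")           # 15.99 points
print(f"Eval loss reduction: {loss_reduction:.1%}")   # 53.2%
print(f"Perplexity: {ppl_before:.2f} -> {ppl_after:.2f}")  # 5.47 -> 2.22
```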
### Training Loss

The training loss decreased over the 10 epochs, indicating that the model was learning effectively. The final training loss was approximately 0.710.

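## Model Usage

The model can be loaded with the `transformers` translation pipeline:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model_checkpoint = "Prikshit7766/marian-finetuned-kde4-en-to-fr"
translator = pipeline("translation", model=model_checkpoint)

# Translate a sample string and print the result.
print(translator("Default to expanded threads"))
```
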
### Example Output

```plaintext
[{'translation_text': 'Par défaut, développer les fils de discussion'}]
```
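
As the example output shows, the pipeline returns a list with one dictionary per input, each holding a `translation_text` entry. A small helper can flatten that structure into plain strings. The name `extract_translations` is hypothetical, and the sample is copied from the example output so the snippet runs without downloading the model.

```python
def extract_translations(outputs):
    """Return the translated strings from translation-pipeline output."""
    return [item["translation_text"] for item in outputs]


# Sample copied from the example output in this card.
sample_output = [
    {"translation_text": "Par défaut, développer les fils de discussion"}
]

translations = extract_translations(sample_output)
print(translations[0])  # Par défaut, développer les fils de discussion
```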