---
license: mit
base_model: facebook/m2m100_418M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_418M-finetuned-en-to-hi
  results: []
---

# m2m100_418M-finetuned-en-to-hi

This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on an unspecified dataset (the trainer recorded the dataset name as `None`).
It achieves the following results on the evaluation set:
- Loss: 1.0453
- Bleu: 17.4993
- Gen Len: 6.7284

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.4274        | 0.16  | 500   | 2.1152          | 4.4935  | 6.8813  |
| 2.1915        | 0.33  | 1000  | 1.9722          | 5.8486  | 6.9727  |
| 2.1187        | 0.49  | 1500  | 1.8575          | 5.5802  | 6.9993  |
| 2.0151        | 0.66  | 2000  | 1.7686          | 8.8892  | 6.8233  |
| 1.9709        | 0.82  | 2500  | 1.6948          | 8.4082  | 6.8809  |
| 1.9376        | 0.99  | 3000  | 1.6341          | 10.0801 | 6.85    |
| 1.761         | 1.15  | 3500  | 1.5788          | 8.1916  | 6.8816  |
| 1.7269        | 1.32  | 4000  | 1.5380          | 10.2779 | 6.9447  |
| 1.7231        | 1.48  | 4500  | 1.4946          | 6.9244  | 6.9402  |
| 1.6925        | 1.65  | 5000  | 1.4456          | 13.7246 | 6.9018  |
| 1.6658        | 1.81  | 5500  | 1.4146          | 9.1181  | 6.9104  |
| 1.6673        | 1.98  | 6000  | 1.3727          | 8.6535  | 6.8682  |
| 1.5165        | 2.14  | 6500  | 1.3441          | 14.8146 | 6.9804  |
| 1.5111        | 2.31  | 7000  | 1.3101          | 11.192  | 6.92    |
| 1.4889        | 2.47  | 7500  | 1.2814          | 11.8364 | 6.9509  |
| 1.4903        | 2.64  | 8000  | 1.2510          | 16.8035 | 6.9316  |
| 1.4871        | 2.8   | 8500  | 1.2298          | 14.5766 | 6.9053  |
| 1.4854        | 2.97  | 9000  | 1.2051          | 14.2822 | 6.8438  |
| 1.3719        | 3.13  | 9500  | 1.1758          | 16.1779 | 6.8918  |
| 1.3481        | 3.3   | 10000 | 1.1612          | 20.1789 | 6.8138  |
| 1.3585        | 3.46  | 10500 | 1.1410          | 15.6937 | 6.8613  |
| 1.35          | 3.63  | 11000 | 1.1261          | 20.0808 | 6.832   |
| 1.3557        | 3.79  | 11500 | 1.1069          | 19.588  | 6.8242  |
| 1.3329        | 3.96  | 12000 | 1.0924          | 19.9913 | 6.796   |
| 1.2792        | 4.12  | 12500 | 1.0791          | 18.8275 | 6.7616  |
| 1.2568        | 4.29  | 13000 | 1.0701          | 16.7189 | 6.7676  |
| 1.2558        | 4.45  | 13500 | 1.0605          | 18.7687 | 6.7464  |
| 1.2533        | 4.62  | 14000 | 1.0541          | 19.1818 | 6.7693  |
| 1.2559        | 4.78  | 14500 | 1.0475          | 19.0462 | 6.738   |
| 1.2513        | 4.95  | 15000 | 1.0453          | 17.4993 | 6.7284  |

### Framework versions

- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
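
The `lr_scheduler_type: linear` and `learning_rate: 2e-05` entries in the training hyperparameters describe a learning rate that decays linearly toward zero over the run. The following is a minimal plain-Python sketch of that decay, not the training script itself. It rests on two assumptions that the card does not state: zero warmup steps, and a total step count estimated from the last logged row (step 15000 at epoch 4.95, extrapolated to 5 epochs). The names `linear_lr` and `ESTIMATED_TOTAL_STEPS` are hypothetical and exist only for illustration.

```python
# Values copied from the "Training hyperparameters" list.
BASE_LR = 2e-05
NUM_EPOCHS = 5

# ASSUMPTION: the card does not report the total number of optimizer steps.
# The last logged row is step 15000 at epoch 4.95, so extrapolate to 5 epochs.
ESTIMATED_TOTAL_STEPS = round(15000 / 4.95 * NUM_EPOCHS)


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay.

    ASSUMPTION: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the approximate learning rate at a few logged steps.
    for step in (500, 5000, 10000, 15000):
        print(step, linear_lr(step, ESTIMATED_TOTAL_STEPS))
```

Under these assumptions the learning rate is already close to zero by the final logged step, which is consistent with the small changes in validation loss over the last few rows of the results table.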