metadata
license: apache-2.0
base_model: eslamxm/mt5-base-finetuned-arur
tags:
  - generated_from_trainer
model-index:
  - name: T6
    results: []

T6

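This model is a fine-tuned version of eslamxm/mt5-base-finetuned-arur. The training dataset was not recorded (the auto-generated card lists it as "None"). At the end of training (epoch 64, step 2368), it achieves the following results on the evaluation set:

  • Loss: 0.5941

This is the final-epoch value, not the best one: in the training results, validation loss is lowest at epoch 3 (0.2448) and rises for most of the remaining epochs while training loss keeps falling.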

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 64
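As a rough sketch, the settings above could be expressed as keyword arguments for `transformers.Seq2SeqTrainingArguments`. Several things here are assumptions rather than facts from this card: that `train_batch_size` is a per-device value, that no warmup was used (the card lists none), and the `output_dir` name, which is purely illustrative. The `linear_lr` helper mirrors the usual linear warmup-then-decay rule so the schedule can be inspected without running a trainer; the total of 2368 steps comes from the training results table (64 epochs of 37 steps).

```python
# Hyperparameters copied from the list above. Keyword names follow
# transformers.Seq2SeqTrainingArguments.
HPARAMS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 8,  # assumption: the card's value is per device
    "per_device_eval_batch_size": 8,   # assumption: same as above
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 64,
}


def linear_lr(step, total_steps, base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir="T6-output"):
    """Build training arguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the schedule helper works without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)


if __name__ == "__main__":
    total_steps = 64 * 37  # 2368, the last value in the Step column
    for step in (0, 1184, 2368):
        print(step, linear_lr(step, total_steps))
```

Under these assumptions the learning rate starts at 1e-4, reaches half that value at step 1184, and decays to zero at step 2368.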

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.2591 | 1.0 | 37 | 0.2616 |
| 0.1639 | 2.0 | 74 | 0.2497 |
| 0.1771 | 3.0 | 111 | 0.2448 |
| 0.1465 | 4.0 | 148 | 0.2486 |
| 0.1294 | 5.0 | 185 | 0.2499 |
| 0.118 | 6.0 | 222 | 0.2520 |
| 0.1014 | 7.0 | 259 | 0.2582 |
| 0.0986 | 8.0 | 296 | 0.2631 |
| 0.1021 | 9.0 | 333 | 0.2775 |
| 0.0783 | 10.0 | 370 | 0.2867 |
| 0.0699 | 11.0 | 407 | 0.2906 |
| 0.062 | 12.0 | 444 | 0.3010 |
| 0.059 | 13.0 | 481 | 0.3144 |
| 0.0592 | 14.0 | 518 | 0.3265 |
| 0.0513 | 15.0 | 555 | 0.3365 |
| 0.0404 | 16.0 | 592 | 0.3550 |
| 0.0417 | 17.0 | 629 | 0.3552 |
| 0.0385 | 18.0 | 666 | 0.3682 |
| 0.0303 | 19.0 | 703 | 0.3728 |
| 0.0355 | 20.0 | 740 | 0.3947 |
| 0.0232 | 21.0 | 777 | 0.4208 |
| 0.024 | 22.0 | 814 | 0.4080 |
| 0.023 | 23.0 | 851 | 0.4265 |
| 0.0169 | 24.0 | 888 | 0.4233 |
| 0.0185 | 25.0 | 925 | 0.4450 |
| 0.0214 | 26.0 | 962 | 0.4528 |
| 0.0159 | 27.0 | 999 | 0.4486 |
| 0.0156 | 28.0 | 1036 | 0.4926 |
| 0.017 | 29.0 | 1073 | 0.4927 |
| 0.0137 | 30.0 | 1110 | 0.4886 |
| 0.0139 | 31.0 | 1147 | 0.5205 |
| 0.0108 | 32.0 | 1184 | 0.4953 |
| 0.0136 | 33.0 | 1221 | 0.4925 |
| 0.0129 | 34.0 | 1258 | 0.5081 |
| 0.0099 | 35.0 | 1295 | 0.5252 |
| 0.0116 | 36.0 | 1332 | 0.5241 |
| 0.0134 | 37.0 | 1369 | 0.5352 |
| 0.0111 | 38.0 | 1406 | 0.5469 |
| 0.0089 | 39.0 | 1443 | 0.5618 |
| 0.0103 | 40.0 | 1480 | 0.5781 |
| 0.0083 | 41.0 | 1517 | 0.5896 |
| 0.0091 | 42.0 | 1554 | 0.5287 |
| 0.0115 | 43.0 | 1591 | 0.5556 |
| 0.0069 | 44.0 | 1628 | 0.5497 |
| 0.0069 | 45.0 | 1665 | 0.5896 |
| 0.0089 | 46.0 | 1702 | 0.5799 |
| 0.0056 | 47.0 | 1739 | 0.5654 |
| 0.0072 | 48.0 | 1776 | 0.5683 |
| 0.0097 | 49.0 | 1813 | 0.5642 |
| 0.0065 | 50.0 | 1850 | 0.5623 |
| 0.0073 | 51.0 | 1887 | 0.5906 |
| 0.0078 | 52.0 | 1924 | 0.5932 |
| 0.0068 | 53.0 | 1961 | 0.5923 |
| 0.006 | 54.0 | 1998 | 0.5978 |
| 0.005 | 55.0 | 2035 | 0.5846 |
| 0.0082 | 56.0 | 2072 | 0.5886 |
| 0.0081 | 57.0 | 2109 | 0.5844 |
| 0.0056 | 58.0 | 2146 | 0.5878 |
| 0.0069 | 59.0 | 2183 | 0.5890 |
| 0.0075 | 60.0 | 2220 | 0.5946 |
| 0.0077 | 61.0 | 2257 | 0.5897 |
| 0.0064 | 62.0 | 2294 | 0.5908 |
| 0.0049 | 63.0 | 2331 | 0.5934 |
| 0.005 | 64.0 | 2368 | 0.5941 |
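The validation losses above can be checked mechanically. The following is a minimal sketch, not part of the original training run: it locates the epoch with the lowest validation loss, simulates where a simple patience-based early-stopping rule would have halted, and bounds the training set size from the step counts. The helper names and the patience value of 3 are illustrative choices, and the size estimate assumes a single device, no gradient accumulation, and that the last partial batch is kept.

```python
# Validation loss per epoch (epochs 1..64), copied from the table above.
VAL_LOSS = [
    0.2616, 0.2497, 0.2448, 0.2486, 0.2499, 0.2520, 0.2582, 0.2631,
    0.2775, 0.2867, 0.2906, 0.3010, 0.3144, 0.3265, 0.3365, 0.3550,
    0.3552, 0.3682, 0.3728, 0.3947, 0.4208, 0.4080, 0.4265, 0.4233,
    0.4450, 0.4528, 0.4486, 0.4926, 0.4927, 0.4886, 0.5205, 0.4953,
    0.4925, 0.5081, 0.5252, 0.5241, 0.5352, 0.5469, 0.5618, 0.5781,
    0.5896, 0.5287, 0.5556, 0.5497, 0.5896, 0.5799, 0.5654, 0.5683,
    0.5642, 0.5623, 0.5906, 0.5932, 0.5923, 0.5978, 0.5846, 0.5886,
    0.5844, 0.5878, 0.5890, 0.5946, 0.5897, 0.5908, 0.5934, 0.5941,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def early_stop_epoch(losses, patience):
    """Epoch at which training stops after `patience` epochs without improvement."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(losses)


def dataset_size_range(steps_per_epoch, batch_size):
    """Smallest and largest example counts that yield `steps_per_epoch` batches."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    print(best_epoch(VAL_LOSS))            # (3, 0.2448)
    print(early_stop_epoch(VAL_LOSS, 3))   # 6
    print(dataset_size_range(37, 8))       # (289, 296)
```

Read this way, the table suggests the epoch 3 checkpoint generalizes best by validation loss, and that a patience of 3 would have ended training at epoch 6 instead of epoch 64. With 37 steps per epoch at batch size 8, the training set would hold between 289 and 296 examples under the stated assumptions.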

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1