war_tl_model

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the list below):

  • Loss: 0.0083
  • Bleu: 95.2691
  • Gen Len: 5.3275
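As a quick way to try the model, here is a minimal usage sketch. It assumes the checkpoint is published on the Hub as youdiniplays/war_tl_model and follows the standard T5 text-to-text interface; since the training data is not documented, the input format (and whether a task prefix is needed) is an assumption.

```python
# Minimal usage sketch: the repo id and the absence of a task prefix are
# assumptions; adjust both to match how the model was actually trained.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "youdiniplays/war_tl_model"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("your input text here", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```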

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding Seq2SeqTrainingArguments follows the list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
  • mixed_precision_training: Native AMP
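For reference, the list above maps onto Seq2SeqTrainingArguments roughly as sketched below. output_dir, predict_with_generate, and evaluation_strategy are assumptions not stated in the card, and Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's default optimizer settings, so no explicit optimizer argument is needed.

```python
# Rough mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir, predict_with_generate, and evaluation_strategy are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="war_tl_model",          # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    fp16=True,                          # Native AMP mixed precision
    predict_with_generate=True,         # assumed; needed to report Bleu/Gen Len
    evaluation_strategy="epoch",        # assumed from the per-epoch results below
)
```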

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 54 | 2.8093 | 2.5958 | 6.0523 |
| No log | 2.0 | 108 | 2.4043 | 3.1846 | 6.1382 |
| No log | 3.0 | 162 | 1.9327 | 6.8308 | 6.4901 |
| No log | 4.0 | 216 | 1.5969 | 13.8714 | 5.6562 |
| No log | 5.0 | 270 | 1.2099 | 20.5562 | 5.9721 |
| No log | 6.0 | 324 | 0.9304 | 31.1495 | 5.7038 |
| No log | 7.0 | 378 | 0.7074 | 43.6407 | 5.7619 |
| No log | 8.0 | 432 | 0.5408 | 49.2356 | 5.5772 |
| No log | 9.0 | 486 | 0.3822 | 63.1038 | 5.5528 |
| 1.9648 | 10.0 | 540 | 0.2888 | 67.2835 | 5.5041 |
| 1.9648 | 11.0 | 594 | 0.1852 | 72.4324 | 5.3449 |
| 1.9648 | 12.0 | 648 | 0.1235 | 84.0315 | 5.36 |
| 1.9648 | 13.0 | 702 | 0.0831 | 88.3721 | 5.374 |
| 1.9648 | 14.0 | 756 | 0.0629 | 87.43 | 5.3531 |
| 1.9648 | 15.0 | 810 | 0.0515 | 88.0698 | 5.3577 |
| 1.9648 | 16.0 | 864 | 0.0526 | 89.6299 | 5.3902 |
| 1.9648 | 17.0 | 918 | 0.0454 | 89.7151 | 5.3879 |
| 1.9648 | 18.0 | 972 | 0.0434 | 88.0326 | 5.3879 |
| 0.4211 | 19.0 | 1026 | 0.0375 | 89.9125 | 5.3229 |
| 0.4211 | 20.0 | 1080 | 0.0295 | 91.976 | 5.3554 |
| 0.4211 | 21.0 | 1134 | 0.0441 | 91.7403 | 5.3693 |
| 0.4211 | 22.0 | 1188 | 0.0290 | 92.0153 | 5.3461 |
| 0.4211 | 23.0 | 1242 | 0.0318 | 90.8522 | 5.3391 |
| 0.4211 | 24.0 | 1296 | 0.0343 | 91.9239 | 5.3856 |
| 0.4211 | 25.0 | 1350 | 0.0260 | 87.7878 | 5.3519 |
| 0.4211 | 26.0 | 1404 | 0.0332 | 90.3633 | 5.3751 |
| 0.4211 | 27.0 | 1458 | 0.0269 | 92.1404 | 5.3717 |
| 0.1559 | 28.0 | 1512 | 0.0323 | 93.0887 | 5.36 |
| 0.1559 | 29.0 | 1566 | 0.0326 | 94.5354 | 5.3566 |
| 0.1559 | 30.0 | 1620 | 0.0314 | 93.4507 | 5.374 |
| 0.1559 | 31.0 | 1674 | 0.0297 | 94.7939 | 5.3357 |
| 0.1559 | 32.0 | 1728 | 0.0282 | 92.2858 | 5.3531 |
| 0.1559 | 33.0 | 1782 | 0.0258 | 92.4661 | 5.3508 |
| 0.1559 | 34.0 | 1836 | 0.0252 | 91.6147 | 5.3577 |
| 0.1559 | 35.0 | 1890 | 0.0240 | 93.2291 | 5.3728 |
| 0.1559 | 36.0 | 1944 | 0.0157 | 93.4177 | 5.3844 |
| 0.1559 | 37.0 | 1998 | 0.0212 | 94.0209 | 5.3589 |
| 0.093 | 38.0 | 2052 | 0.0199 | 93.1765 | 5.3728 |
| 0.093 | 39.0 | 2106 | 0.0257 | 93.9608 | 5.3624 |
| 0.093 | 40.0 | 2160 | 0.0232 | 93.9594 | 5.3717 |
| 0.093 | 41.0 | 2214 | 0.0198 | 93.5332 | 5.3519 |
| 0.093 | 42.0 | 2268 | 0.0150 | 93.9354 | 5.3682 |
| 0.093 | 43.0 | 2322 | 0.0156 | 94.5189 | 5.3566 |
| 0.093 | 44.0 | 2376 | 0.0170 | 92.767 | 5.36 |
| 0.093 | 45.0 | 2430 | 0.0178 | 95.2076 | 5.3519 |
| 0.093 | 46.0 | 2484 | 0.0217 | 93.4226 | 5.3995 |
| 0.0655 | 47.0 | 2538 | 0.0181 | 93.0419 | 5.3612 |
| 0.0655 | 48.0 | 2592 | 0.0185 | 94.4578 | 5.3589 |
| 0.0655 | 49.0 | 2646 | 0.0210 | 93.3838 | 5.3577 |
| 0.0655 | 50.0 | 2700 | 0.0152 | 93.883 | 5.331 |
| 0.0655 | 51.0 | 2754 | 0.0182 | 93.8614 | 5.3914 |
| 0.0655 | 52.0 | 2808 | 0.0160 | 94.1816 | 5.3426 |
| 0.0655 | 53.0 | 2862 | 0.0158 | 94.2294 | 5.3484 |
| 0.0655 | 54.0 | 2916 | 0.0135 | 94.4382 | 5.3508 |
| 0.0655 | 55.0 | 2970 | 0.0151 | 93.8986 | 5.3612 |
| 0.0517 | 56.0 | 3024 | 0.0113 | 95.2691 | 5.3484 |
| 0.0517 | 57.0 | 3078 | 0.0130 | 95.0307 | 5.3519 |
| 0.0517 | 58.0 | 3132 | 0.0137 | 95.2281 | 5.3705 |
| 0.0517 | 59.0 | 3186 | 0.0115 | 95.2281 | 5.3786 |
| 0.0517 | 60.0 | 3240 | 0.0130 | 95.2486 | 5.3589 |
| 0.0517 | 61.0 | 3294 | 0.0119 | 95.2486 | 5.3635 |
| 0.0517 | 62.0 | 3348 | 0.0134 | 95.2486 | 5.3473 |
| 0.0517 | 63.0 | 3402 | 0.0151 | 95.1871 | 5.3798 |
| 0.0517 | 64.0 | 3456 | 0.0141 | 95.2076 | 5.3566 |
| 0.0357 | 65.0 | 3510 | 0.0139 | 94.6668 | 5.3566 |
| 0.0357 | 66.0 | 3564 | 0.0122 | 95.2281 | 5.3403 |
| 0.0357 | 67.0 | 3618 | 0.0172 | 95.2076 | 5.3484 |
| 0.0357 | 68.0 | 3672 | 0.0162 | 94.7725 | 5.3403 |
| 0.0357 | 69.0 | 3726 | 0.0121 | 95.2281 | 5.3473 |
| 0.0357 | 70.0 | 3780 | 0.0163 | 94.6668 | 5.3624 |
| 0.0357 | 71.0 | 3834 | 0.0117 | 95.2486 | 5.3473 |
| 0.0357 | 72.0 | 3888 | 0.0151 | 95.2486 | 5.3566 |
| 0.0357 | 73.0 | 3942 | 0.0104 | 95.2691 | 5.3554 |
| 0.0357 | 74.0 | 3996 | 0.0098 | 95.2691 | 5.3415 |
| 0.0342 | 75.0 | 4050 | 0.0117 | 95.2486 | 5.3438 |
| 0.0342 | 76.0 | 4104 | 0.0125 | 94.6872 | 5.367 |
| 0.0342 | 77.0 | 4158 | 0.0103 | 95.2486 | 5.3461 |
| 0.0342 | 78.0 | 4212 | 0.0113 | 95.2281 | 5.3635 |
| 0.0342 | 79.0 | 4266 | 0.0119 | 95.2691 | 5.374 |
| 0.0342 | 80.0 | 4320 | 0.0132 | 93.4378 | 5.3577 |
| 0.0342 | 81.0 | 4374 | 0.0102 | 94.728 | 5.3496 |
| 0.0342 | 82.0 | 4428 | 0.0156 | 94.6872 | 5.3821 |
| 0.0342 | 83.0 | 4482 | 0.0097 | 94.728 | 5.3357 |
| 0.0292 | 84.0 | 4536 | 0.0096 | 95.2486 | 5.3693 |
| 0.0292 | 85.0 | 4590 | 0.0104 | 95.2691 | 5.3647 |
| 0.0292 | 86.0 | 4644 | 0.0110 | 94.7064 | 5.3612 |
| 0.0292 | 87.0 | 4698 | 0.0094 | 94.7268 | 5.3496 |
| 0.0292 | 88.0 | 4752 | 0.0115 | 95.2486 | 5.36 |
| 0.0292 | 89.0 | 4806 | 0.0098 | 95.2691 | 5.36 |
| 0.0292 | 90.0 | 4860 | 0.0104 | 94.5404 | 5.3461 |
| 0.0292 | 91.0 | 4914 | 0.0103 | 94.6538 | 5.36 |
| 0.0292 | 92.0 | 4968 | 0.0096 | 95.2691 | 5.3624 |
| 0.0243 | 93.0 | 5022 | 0.0092 | 95.2486 | 5.3647 |
| 0.0243 | 94.0 | 5076 | 0.0095 | 95.2691 | 5.3461 |
| 0.0243 | 95.0 | 5130 | 0.0105 | 95.0189 | 5.3508 |
| 0.0243 | 96.0 | 5184 | 0.0111 | 95.1994 | 5.3763 |
| 0.0243 | 97.0 | 5238 | 0.0099 | 95.2691 | 5.3717 |
| 0.0243 | 98.0 | 5292 | 0.0102 | 95.2691 | 5.3484 |
| 0.0243 | 99.0 | 5346 | 0.0101 | 95.2691 | 5.374 |
| 0.0243 | 100.0 | 5400 | 0.0097 | 95.2486 | 5.3426 |
| 0.0243 | 101.0 | 5454 | 0.0095 | 95.2691 | 5.3508 |
| 0.0233 | 102.0 | 5508 | 0.0098 | 95.2691 | 5.3531 |
| 0.0233 | 103.0 | 5562 | 0.0095 | 95.2691 | 5.3624 |
| 0.0233 | 104.0 | 5616 | 0.0091 | 95.2691 | 5.3461 |
| 0.0233 | 105.0 | 5670 | 0.0105 | 95.2691 | 5.36 |
| 0.0233 | 106.0 | 5724 | 0.0137 | 95.2486 | 5.3554 |
| 0.0233 | 107.0 | 5778 | 0.0108 | 95.2691 | 5.3577 |
| 0.0233 | 108.0 | 5832 | 0.0094 | 95.2691 | 5.3717 |
| 0.0233 | 109.0 | 5886 | 0.0095 | 95.2691 | 5.3531 |
| 0.0233 | 110.0 | 5940 | 0.0096 | 95.2691 | 5.3415 |
| 0.0233 | 111.0 | 5994 | 0.0094 | 95.2486 | 5.3589 |
| 0.02 | 112.0 | 6048 | 0.0092 | 95.2486 | 5.3519 |
| 0.02 | 113.0 | 6102 | 0.0091 | 94.905 | 5.3635 |
| 0.02 | 114.0 | 6156 | 0.0091 | 95.2691 | 5.3624 |
| 0.02 | 115.0 | 6210 | 0.0090 | 95.2691 | 5.3368 |
| 0.02 | 116.0 | 6264 | 0.0094 | 95.2486 | 5.3542 |
| 0.02 | 117.0 | 6318 | 0.0133 | 95.2486 | 5.3519 |
| 0.02 | 118.0 | 6372 | 0.0112 | 95.2691 | 5.3531 |
| 0.02 | 119.0 | 6426 | 0.0115 | 95.2486 | 5.3496 |
| 0.02 | 120.0 | 6480 | 0.0091 | 95.2691 | 5.3391 |
| 0.0181 | 121.0 | 6534 | 0.0089 | 95.2691 | 5.3368 |
| 0.0181 | 122.0 | 6588 | 0.0090 | 95.2691 | 5.3647 |
| 0.0181 | 123.0 | 6642 | 0.0096 | 95.2691 | 5.3786 |
| 0.0181 | 124.0 | 6696 | 0.0091 | 95.2691 | 5.381 |
| 0.0181 | 125.0 | 6750 | 0.0093 | 95.2691 | 5.3531 |
| 0.0181 | 126.0 | 6804 | 0.0098 | 95.2691 | 5.3554 |
| 0.0181 | 127.0 | 6858 | 0.0093 | 95.2691 | 5.3624 |
| 0.0181 | 128.0 | 6912 | 0.0089 | 95.2691 | 5.3693 |
| 0.0181 | 129.0 | 6966 | 0.0088 | 95.2691 | 5.374 |
| 0.0155 | 130.0 | 7020 | 0.0094 | 95.2691 | 5.36 |
| 0.0155 | 131.0 | 7074 | 0.0091 | 95.2691 | 5.3415 |
| 0.0155 | 132.0 | 7128 | 0.0088 | 95.2691 | 5.3484 |
| 0.0155 | 133.0 | 7182 | 0.0090 | 95.2691 | 5.3624 |
| 0.0155 | 134.0 | 7236 | 0.0088 | 95.2691 | 5.3554 |
| 0.0155 | 135.0 | 7290 | 0.0089 | 95.2691 | 5.3693 |
| 0.0155 | 136.0 | 7344 | 0.0090 | 95.2691 | 5.3577 |
| 0.0155 | 137.0 | 7398 | 0.0094 | 95.2486 | 5.3357 |
| 0.0155 | 138.0 | 7452 | 0.0092 | 95.2691 | 5.3368 |
| 0.0147 | 139.0 | 7506 | 0.0090 | 95.2691 | 5.3508 |
| 0.0147 | 140.0 | 7560 | 0.0089 | 95.2691 | 5.3647 |
| 0.0147 | 141.0 | 7614 | 0.0090 | 95.2691 | 5.3577 |
| 0.0147 | 142.0 | 7668 | 0.0089 | 95.2691 | 5.3531 |
| 0.0147 | 143.0 | 7722 | 0.0090 | 95.2691 | 5.3484 |
| 0.0147 | 144.0 | 7776 | 0.0096 | 94.112 | 5.3519 |
| 0.0147 | 145.0 | 7830 | 0.0090 | 95.2691 | 5.3624 |
| 0.0147 | 146.0 | 7884 | 0.0090 | 95.2691 | 5.3647 |
| 0.0147 | 147.0 | 7938 | 0.0090 | 95.2691 | 5.36 |
| 0.0147 | 148.0 | 7992 | 0.0090 | 95.2691 | 5.3647 |
| 0.0146 | 149.0 | 8046 | 0.0093 | 95.2691 | 5.3624 |
| 0.0146 | 150.0 | 8100 | 0.0090 | 95.2691 | 5.367 |
| 0.0146 | 151.0 | 8154 | 0.0087 | 95.2691 | 5.3531 |
| 0.0146 | 152.0 | 8208 | 0.0090 | 95.2691 | 5.3484 |
| 0.0146 | 153.0 | 8262 | 0.0088 | 95.2691 | 5.3554 |
| 0.0146 | 154.0 | 8316 | 0.0088 | 94.728 | 5.3612 |
| 0.0146 | 155.0 | 8370 | 0.0086 | 95.2691 | 5.3554 |
| 0.0146 | 156.0 | 8424 | 0.0085 | 95.2691 | 5.3461 |
| 0.0146 | 157.0 | 8478 | 0.0085 | 95.2691 | 5.3415 |
| 0.0125 | 158.0 | 8532 | 0.0084 | 95.2691 | 5.3484 |
| 0.0125 | 159.0 | 8586 | 0.0086 | 95.2691 | 5.3647 |
| 0.0125 | 160.0 | 8640 | 0.0088 | 95.2691 | 5.3368 |
| 0.0125 | 161.0 | 8694 | 0.0086 | 95.2691 | 5.3415 |
| 0.0125 | 162.0 | 8748 | 0.0086 | 95.2691 | 5.3508 |
| 0.0125 | 163.0 | 8802 | 0.0087 | 95.2691 | 5.3647 |
| 0.0125 | 164.0 | 8856 | 0.0086 | 95.2691 | 5.3531 |
| 0.0125 | 165.0 | 8910 | 0.0086 | 95.2691 | 5.3461 |
| 0.0125 | 166.0 | 8964 | 0.0086 | 95.2691 | 5.3508 |
| 0.012 | 167.0 | 9018 | 0.0087 | 95.2691 | 5.3415 |
| 0.012 | 168.0 | 9072 | 0.0087 | 95.2691 | 5.3577 |
| 0.012 | 169.0 | 9126 | 0.0087 | 95.2691 | 5.3508 |
| 0.012 | 170.0 | 9180 | 0.0086 | 95.2691 | 5.36 |
| 0.012 | 171.0 | 9234 | 0.0086 | 95.2691 | 5.3577 |
| 0.012 | 172.0 | 9288 | 0.0086 | 95.2691 | 5.3717 |
| 0.012 | 173.0 | 9342 | 0.0084 | 95.2691 | 5.3624 |
| 0.012 | 174.0 | 9396 | 0.0085 | 95.2691 | 5.3647 |
| 0.012 | 175.0 | 9450 | 0.0084 | 95.2691 | 5.3577 |
| 0.0116 | 176.0 | 9504 | 0.0084 | 95.2691 | 5.3554 |
| 0.0116 | 177.0 | 9558 | 0.0083 | 95.2691 | 5.3438 |
| 0.0116 | 178.0 | 9612 | 0.0084 | 95.2691 | 5.36 |
| 0.0116 | 179.0 | 9666 | 0.0084 | 95.2691 | 5.36 |
| 0.0116 | 180.0 | 9720 | 0.0085 | 95.2691 | 5.3415 |
| 0.0116 | 181.0 | 9774 | 0.0084 | 95.2691 | 5.3484 |
| 0.0116 | 182.0 | 9828 | 0.0084 | 95.2691 | 5.3484 |
| 0.0116 | 183.0 | 9882 | 0.0084 | 95.2691 | 5.3461 |
| 0.0116 | 184.0 | 9936 | 0.0084 | 95.2691 | 5.3508 |
| 0.0116 | 185.0 | 9990 | 0.0083 | 95.2691 | 5.3438 |
| 0.0103 | 186.0 | 10044 | 0.0082 | 95.2691 | 5.3438 |
| 0.0103 | 187.0 | 10098 | 0.0083 | 95.2691 | 5.3484 |
| 0.0103 | 188.0 | 10152 | 0.0083 | 95.2691 | 5.3368 |
| 0.0103 | 189.0 | 10206 | 0.0083 | 95.2691 | 5.3415 |
| 0.0103 | 190.0 | 10260 | 0.0083 | 95.2691 | 5.3298 |
| 0.0103 | 191.0 | 10314 | 0.0083 | 95.2691 | 5.3275 |
| 0.0103 | 192.0 | 10368 | 0.0083 | 95.2691 | 5.3275 |
| 0.0103 | 193.0 | 10422 | 0.0083 | 95.2691 | 5.3252 |
| 0.0103 | 194.0 | 10476 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 195.0 | 10530 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 196.0 | 10584 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 197.0 | 10638 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 198.0 | 10692 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 199.0 | 10746 | 0.0083 | 95.2691 | 5.3275 |
| 0.0105 | 200.0 | 10800 | 0.0083 | 95.2691 | 5.3275 |
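Since the metric code is not included in the card, the following is a minimal sketch of a compute_metrics hook that could produce Bleu and Gen Len values like the columns above, assuming the sacrebleu metric from the evaluate library and a t5-small tokenizer; the actual setup used for training is not documented.

```python
# Minimal sketch (assumed, not the author's code): a compute_metrics hook
# yielding "bleu" and "gen_len" via the sacrebleu metric from `evaluate`.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Labels are padded with -100 during training; restore pad ids before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generated sequences.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```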

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 60.5M parameters (F32, Safetensors)