# t5-summarization-zero-shot-headers-and-better-prompt-enriched
This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3132
- Rouge: {'rouge1': 0.426, 'rouge2': 0.195, 'rougeL': 0.2024, 'rougeLsum': 0.2024}
- Bert Score: 0.877
- Bleurt 20: -0.8149
- Gen Len: 13.66
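The ROUGE-L score reported above measures longest-common-subsequence overlap between generated and reference summaries. As a rough illustration of what that metric computes (a minimal pure-Python sketch; the `rouge_l_f1` helper is hypothetical, and real evaluation should use a library such as `rouge_score`, which also applies tokenization and stemming):

```python
def lcs_len(a, b):
    # dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    # F1 over LCS length, with whitespace tokenization for simplicity
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```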
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 7
- eval_batch_size: 7
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
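With `lr_scheduler_type: linear` and `lr_scheduler_warmup_ratio: 0.1`, the learning rate ramps up linearly over the first 10% of steps and then decays linearly to zero. A minimal sketch of that schedule, assuming the standard behavior of the `transformers` linear scheduler (the `linear_warmup_lr` helper is illustrative, not part of the training code; total steps here would be 20 epochs × 172 steps/epoch = 3440):

```python
def linear_warmup_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    # linear warmup for the first warmup_ratio of training,
    # then linear decay from base_lr down to zero
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

For example, at step 172 (half way through warmup) the learning rate is half of `base_lr`, and it reaches the full `learning_rate: 0.0001` at step 344 before decaying.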
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge | Bert Score | Bleurt 20 | Gen Len |
|---|---|---|---|---|---|---|---|
| 2.8785 | 1.0 | 172 | 2.6476 | {'rouge1': 0.462, 'rouge2': 0.1848, 'rougeL': 0.1845, 'rougeLsum': 0.1845} | 0.8707 | -0.8319 | 15.17 |
| 2.6366 | 2.0 | 344 | 2.4685 | {'rouge1': 0.4501, 'rouge2': 0.1849, 'rougeL': 0.1933, 'rougeLsum': 0.1933} | 0.872 | -0.8531 | 14.545 |
| 2.3822 | 3.0 | 516 | 2.3766 | {'rouge1': 0.4217, 'rouge2': 0.1759, 'rougeL': 0.1867, 'rougeLsum': 0.1867} | 0.8719 | -0.8998 | 13.675 |
| 2.2235 | 4.0 | 688 | 2.3262 | {'rouge1': 0.4396, 'rouge2': 0.1832, 'rougeL': 0.1867, 'rougeLsum': 0.1867} | 0.8715 | -0.8847 | 14.38 |
| 2.0765 | 5.0 | 860 | 2.3122 | {'rouge1': 0.4143, 'rouge2': 0.1769, 'rougeL': 0.1907, 'rougeLsum': 0.1907} | 0.875 | -0.9206 | 13.37 |
| 2.0141 | 6.0 | 1032 | 2.2993 | {'rouge1': 0.4257, 'rouge2': 0.1867, 'rougeL': 0.1943, 'rougeLsum': 0.1943} | 0.8773 | -0.8751 | 13.555 |
| 1.9087 | 7.0 | 1204 | 2.2855 | {'rouge1': 0.4236, 'rouge2': 0.1858, 'rougeL': 0.1895, 'rougeLsum': 0.1895} | 0.8774 | -0.87 | 13.255 |
| 1.868 | 8.0 | 1376 | 2.2795 | {'rouge1': 0.4298, 'rouge2': 0.1896, 'rougeL': 0.1956, 'rougeLsum': 0.1956} | 0.877 | -0.8837 | 13.65 |
| 1.8063 | 9.0 | 1548 | 2.2802 | {'rouge1': 0.4427, 'rouge2': 0.1965, 'rougeL': 0.2011, 'rougeLsum': 0.2011} | 0.8779 | -0.8358 | 13.965 |
| 1.7161 | 10.0 | 1720 | 2.2685 | {'rouge1': 0.4146, 'rouge2': 0.1828, 'rougeL': 0.1918, 'rougeLsum': 0.1918} | 0.8795 | -0.8725 | 13.155 |
| 1.7027 | 11.0 | 1892 | 2.2824 | {'rouge1': 0.423, 'rouge2': 0.1871, 'rougeL': 0.1958, 'rougeLsum': 0.1958} | 0.8781 | -0.8476 | 13.49 |
| 1.6575 | 12.0 | 2064 | 2.2888 | {'rouge1': 0.4231, 'rouge2': 0.1847, 'rougeL': 0.1939, 'rougeLsum': 0.1939} | 0.878 | -0.8648 | 13.3 |
| 1.6046 | 13.0 | 2236 | 2.2946 | {'rouge1': 0.4387, 'rouge2': 0.1942, 'rougeL': 0.1987, 'rougeLsum': 0.1987} | 0.8771 | -0.8336 | 13.835 |
| 1.5638 | 14.0 | 2408 | 2.2961 | {'rouge1': 0.4225, 'rouge2': 0.1864, 'rougeL': 0.1973, 'rougeLsum': 0.1973} | 0.8774 | -0.8456 | 13.345 |
| 1.6015 | 15.0 | 2580 | 2.2937 | {'rouge1': 0.429, 'rouge2': 0.1947, 'rougeL': 0.2007, 'rougeLsum': 0.2007} | 0.8777 | -0.8402 | 13.655 |
| 1.5146 | 16.0 | 2752 | 2.3077 | {'rouge1': 0.4208, 'rouge2': 0.1869, 'rougeL': 0.1978, 'rougeLsum': 0.1978} | 0.8751 | -0.8221 | 13.695 |
| 1.5421 | 17.0 | 2924 | 2.3094 | {'rouge1': 0.4263, 'rouge2': 0.1938, 'rougeL': 0.202, 'rougeLsum': 0.202} | 0.8759 | -0.8207 | 13.67 |
| 1.5328 | 18.0 | 3096 | 2.3114 | {'rouge1': 0.4306, 'rouge2': 0.1927, 'rougeL': 0.2006, 'rougeLsum': 0.2006} | 0.8758 | -0.8284 | 13.755 |
| 1.5181 | 19.0 | 3268 | 2.3128 | {'rouge1': 0.4298, 'rouge2': 0.196, 'rougeL': 0.1997, 'rougeLsum': 0.1997} | 0.8764 | -0.8211 | 13.77 |
| 1.4926 | 20.0 | 3440 | 2.3132 | {'rouge1': 0.426, 'rouge2': 0.195, 'rougeL': 0.2024, 'rougeLsum': 0.2024} | 0.877 | -0.8149 | 13.66 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0