# t5-summarization-one-shot-better-prompt-enriched
This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.6565
- Rouge1: 41.182
- Rouge2: 19.5265
- RougeL: 18.933
- RougeLsum: 18.933
- BERTScore: 0.8721
- BLEURT-20: -0.8432
- Gen Len: 13.495
## Model description
More information needed
## Intended uses & limitations
More information needed
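Pending a fuller description, here is a minimal inference sketch. It assumes the standard `transformers` summarization pipeline and the Hub id `sarahahtee/t5-summarization-one-shot-better-prompt-enriched`; the input text and generation length below are illustrative, since the exact one-shot prompt format used for fine-tuning is not documented in this card.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
summarizer = pipeline(
    "summarization",
    model="sarahahtee/t5-summarization-one-shot-better-prompt-enriched",
)

text = "Replace this with the document you want to summarize."

# Average generated length on the eval set is ~13.5 tokens,
# so a short max_new_tokens budget is reasonable.
result = summarizer(text, max_new_tokens=32)
print(result[0]["summary_text"])
```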
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 7
- eval_batch_size: 7
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
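The card does not include the training script, but the hyperparameters above map directly onto `Seq2SeqTrainingArguments`. The sketch below is an assumption about how the run could be configured; the output directory and the evaluation/generation settings are illustrative, not taken from the card.

```python
from transformers import Seq2SeqTrainingArguments

# Mirror of the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-one-shot-better-prompt-enriched",
    learning_rate=1e-4,
    per_device_train_batch_size=7,
    per_device_eval_batch_size=7,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,   # assumption: needed for ROUGE/BERTScore during eval
    evaluation_strategy="epoch",  # assumption: the results table reports one eval per epoch
)
```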
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | BERTScore | BLEURT-20 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.0658 | 1.0 | 172 | 2.5439 | 42.7473 | 18.8594 | 19.0931 | 19.0931 | 0.875 | -0.8246 | 13.475 |
| 1.0299 | 2.0 | 344 | 2.6081 | 41.8654 | 18.8191 | 19.2938 | 19.2938 | 0.8749 | -0.8044 | 13.59 |
| 0.9696 | 3.0 | 516 | 2.6143 | 42.028 | 19.0578 | 19.2662 | 19.2662 | 0.8742 | -0.8329 | 13.41 |
| 1.2399 | 4.0 | 688 | 2.4862 | 45.2998 | 20.0367 | 19.5339 | 19.5339 | 0.8767 | -0.8051 | 14.13 |
| 1.1463 | 5.0 | 860 | 2.5147 | 42.185 | 18.7982 | 19.3514 | 19.3514 | 0.8748 | -0.8158 | 13.9 |
| 1.1535 | 6.0 | 1032 | 2.5190 | 42.0433 | 18.2183 | 19.096 | 19.096 | 0.8736 | -0.8365 | 13.245 |
| 1.105 | 7.0 | 1204 | 2.5545 | 42.6564 | 18.934 | 19.1676 | 19.1676 | 0.8741 | -0.8367 | 13.82 |
| 1.0948 | 8.0 | 1376 | 2.5909 | 44.3364 | 19.3218 | 20.0526 | 20.0526 | 0.8756 | -0.8175 | 14.03 |
| 1.073 | 9.0 | 1548 | 2.5995 | 43.7072 | 19.3837 | 19.3786 | 19.3786 | 0.8744 | -0.8409 | 14.035 |
| 1.0301 | 10.0 | 1720 | 2.5730 | 42.6338 | 19.083 | 19.0249 | 19.0249 | 0.8737 | -0.8464 | 13.695 |
| 1.0127 | 11.0 | 1892 | 2.6209 | 41.7565 | 18.5013 | 18.7625 | 18.7625 | 0.8728 | -0.8639 | 13.55 |
| 1.0267 | 12.0 | 2064 | 2.6467 | 43.4656 | 19.6808 | 19.491 | 19.491 | 0.8736 | -0.8258 | 13.985 |
| 0.9901 | 13.0 | 2236 | 2.6401 | 42.9025 | 20.2914 | 19.571 | 19.571 | 0.8738 | -0.8341 | 13.91 |
| 0.9766 | 14.0 | 2408 | 2.6614 | 42.9328 | 19.4599 | 19.6136 | 19.6136 | 0.8745 | -0.8085 | 13.855 |
| 1.0146 | 15.0 | 2580 | 2.6511 | 42.2846 | 19.2036 | 19.0654 | 19.0654 | 0.8741 | -0.8262 | 13.565 |
| 0.9757 | 16.0 | 2752 | 2.6493 | 42.1794 | 19.3274 | 18.711 | 18.711 | 0.8717 | -0.8531 | 13.785 |
| 1.0131 | 17.0 | 2924 | 2.6542 | 42.8968 | 19.6167 | 19.3472 | 19.3472 | 0.8731 | -0.8309 | 13.895 |
| 1.0183 | 18.0 | 3096 | 2.6541 | 42.2663 | 19.5557 | 19.3909 | 19.3909 | 0.8726 | -0.8318 | 13.66 |
| 1.0028 | 19.0 | 3268 | 2.6581 | 41.5487 | 19.7115 | 19.279 | 19.279 | 0.8727 | -0.8381 | 13.56 |
| 1.0046 | 20.0 | 3440 | 2.6565 | 41.182 | 19.5265 | 18.933 | 18.933 | 0.8721 | -0.8432 | 13.495 |
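For reference, the ROUGE, BERTScore, and BLEURT-20 columns can be computed with the `evaluate` library. The sketch below shows how such metrics are typically obtained; it is not the exact evaluation code behind the table, and scaling ROUGE to percentages, using English BERTScore defaults, and averaging per-example scores are assumptions.

```python
import evaluate

predictions = ["model generated summary"]          # replace with decoded model outputs
references = ["reference summary for the input"]   # replace with gold summaries

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleurt = evaluate.load("bleurt", "BLEURT-20")  # assumes the BLEURT-20 checkpoint, matching the column name

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")
bleurt_scores = bleurt.compute(predictions=predictions, references=references)

print({k: round(v * 100, 4) for k, v in rouge_scores.items()})      # ROUGE reported as percentages
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))              # mean BERTScore F1
print(sum(bleurt_scores["scores"]) / len(bleurt_scores["scores"]))  # mean BLEURT-20
```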
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0