# t5-summarization-one-shot-better-prompt-enriched
This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.6565
- Rouge1: 41.182
- Rouge2: 19.5265
- RougeL: 18.933
- RougeLsum: 18.933
- BERTScore: 0.8721
- BLEURT-20: -0.8432
- Gen Len: 13.495
## Model description
More information needed
## Intended uses & limitations
More information needed
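Pending a fuller description, here is a minimal inference sketch. It assumes the standard `transformers` summarization pipeline and the Hub id `sarahahtee/t5-summarization-one-shot-better-prompt-enriched`; the input text and generation length below are illustrative, since the exact one-shot prompt format used for fine-tuning is not documented in this card.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
summarizer = pipeline(
    "summarization",
    model="sarahahtee/t5-summarization-one-shot-better-prompt-enriched",
)

text = "Replace this with the document you want to summarize."

# Average generated length on the eval set is ~13.5 tokens,
# so a short max_new_tokens budget is reasonable.
result = summarizer(text, max_new_tokens=32)
print(result[0]["summary_text"])
```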
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 7
- eval_batch_size: 7
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
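The card does not include the training script, but the hyperparameters above map directly onto `Seq2SeqTrainingArguments`. The sketch below is an assumption about how the run could be configured; the output directory and the evaluation/generation settings are illustrative, not taken from the card.

```python
from transformers import Seq2SeqTrainingArguments

# Mirror of the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-one-shot-better-prompt-enriched",
    learning_rate=1e-4,
    per_device_train_batch_size=7,
    per_device_eval_batch_size=7,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,   # assumption: needed for ROUGE/BERTScore during eval
    evaluation_strategy="epoch",  # assumption: the results table reports one eval per epoch
)
```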
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | BERTScore | BLEURT-20 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.0658 | 1.0 | 172 | 2.5439 | 42.7473 | 18.8594 | 19.0931 | 19.0931 | 0.875 | -0.8246 | 13.475 |
| 1.0299 | 2.0 | 344 | 2.6081 | 41.8654 | 18.8191 | 19.2938 | 19.2938 | 0.8749 | -0.8044 | 13.59 |
| 0.9696 | 3.0 | 516 | 2.6143 | 42.028 | 19.0578 | 19.2662 | 19.2662 | 0.8742 | -0.8329 | 13.41 |
| 1.2399 | 4.0 | 688 | 2.4862 | 45.2998 | 20.0367 | 19.5339 | 19.5339 | 0.8767 | -0.8051 | 14.13 |
| 1.1463 | 5.0 | 860 | 2.5147 | 42.185 | 18.7982 | 19.3514 | 19.3514 | 0.8748 | -0.8158 | 13.9 |
| 1.1535 | 6.0 | 1032 | 2.5190 | 42.0433 | 18.2183 | 19.096 | 19.096 | 0.8736 | -0.8365 | 13.245 |
| 1.105 | 7.0 | 1204 | 2.5545 | 42.6564 | 18.934 | 19.1676 | 19.1676 | 0.8741 | -0.8367 | 13.82 |
| 1.0948 | 8.0 | 1376 | 2.5909 | 44.3364 | 19.3218 | 20.0526 | 20.0526 | 0.8756 | -0.8175 | 14.03 |
| 1.073 | 9.0 | 1548 | 2.5995 | 43.7072 | 19.3837 | 19.3786 | 19.3786 | 0.8744 | -0.8409 | 14.035 |
| 1.0301 | 10.0 | 1720 | 2.5730 | 42.6338 | 19.083 | 19.0249 | 19.0249 | 0.8737 | -0.8464 | 13.695 |
| 1.0127 | 11.0 | 1892 | 2.6209 | 41.7565 | 18.5013 | 18.7625 | 18.7625 | 0.8728 | -0.8639 | 13.55 |
| 1.0267 | 12.0 | 2064 | 2.6467 | 43.4656 | 19.6808 | 19.491 | 19.491 | 0.8736 | -0.8258 | 13.985 |
| 0.9901 | 13.0 | 2236 | 2.6401 | 42.9025 | 20.2914 | 19.571 | 19.571 | 0.8738 | -0.8341 | 13.91 |
| 0.9766 | 14.0 | 2408 | 2.6614 | 42.9328 | 19.4599 | 19.6136 | 19.6136 | 0.8745 | -0.8085 | 13.855 |
| 1.0146 | 15.0 | 2580 | 2.6511 | 42.2846 | 19.2036 | 19.0654 | 19.0654 | 0.8741 | -0.8262 | 13.565 |
| 0.9757 | 16.0 | 2752 | 2.6493 | 42.1794 | 19.3274 | 18.711 | 18.711 | 0.8717 | -0.8531 | 13.785 |
| 1.0131 | 17.0 | 2924 | 2.6542 | 42.8968 | 19.6167 | 19.3472 | 19.3472 | 0.8731 | -0.8309 | 13.895 |
| 1.0183 | 18.0 | 3096 | 2.6541 | 42.2663 | 19.5557 | 19.3909 | 19.3909 | 0.8726 | -0.8318 | 13.66 |
| 1.0028 | 19.0 | 3268 | 2.6581 | 41.5487 | 19.7115 | 19.279 | 19.279 | 0.8727 | -0.8381 | 13.56 |
| 1.0046 | 20.0 | 3440 | 2.6565 | 41.182 | 19.5265 | 18.933 | 18.933 | 0.8721 | -0.8432 | 13.495 |
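For reference, the ROUGE, BERTScore, and BLEURT-20 columns can be computed with the `evaluate` library. The sketch below shows how such metrics are typically obtained; it is not the exact evaluation code behind the table, and scaling ROUGE to percentages, using English BERTScore defaults, and averaging per-example scores are assumptions.

```python
import evaluate

predictions = ["model generated summary"]          # replace with decoded model outputs
references = ["reference summary for the input"]   # replace with gold summaries

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleurt = evaluate.load("bleurt", "BLEURT-20")  # assumes the BLEURT-20 checkpoint, matching the column name

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")
bleurt_scores = bleurt.compute(predictions=predictions, references=references)

print({k: round(v * 100, 4) for k, v in rouge_scores.items()})      # ROUGE reported as percentages
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))              # mean BERTScore F1
print(sum(bleurt_scores["scores"]) / len(bleurt_scores["scores"]))  # mean BLEURT-20
```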
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0