flan-t5-rouge-squad-qg-120d
This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (these match the final checkpoint, epoch 120, in the training results table):
- Loss: 0.1223
- Rouge1: 0.5056
- Rouge2: 0.2117
- Rougel: 0.4942
- Rougelsum: 0.4985
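To make the ROUGE figures above easier to read, here is a minimal sketch of a ROUGE-1 F1 score as unigram overlap between a generated text and a reference. It is a simplification for illustration only: the card does not say which ROUGE implementation or tokenization produced the reported numbers, the whitespace tokenization below is an assumption, and the function name `rouge1_f1` and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap, whitespace tokens, no stemming."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated question vs. hypothetical reference question.
score = rouge1_f1(
    "what is the capital of france",
    "what is the capital city of france",
)
print(round(score, 4))
```

Under this reading, a Rouge1 of about 0.5 means that roughly half of the unigrams are shared between generated and reference texts once precision and recall are balanced.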
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 120
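As a rough illustration of what the linear scheduler listed above implies, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup value is listed in this card) and takes the total of 360 steps from the training results table; the function name `linear_lr` is hypothetical and this is not the training code itself.

```python
def linear_lr(
    step: int,
    base_lr: float = 3e-05,
    total_steps: int = 360,
    warmup_steps: int = 0,  # assumption: the card lists no warmup setting
) -> float:
    """Learning rate after `step` updates with linear warmup then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start, midpoint (epoch 60), and end (epoch 120) of training.
for step in (0, 180, 360):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at 3e-05, is halved by the midpoint of training, and reaches zero at step 360.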
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
---|---|---|---|---|---|---|---|
32.2863 | 1.0 | 3 | 42.3780 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
32.7873 | 2.0 | 6 | 38.0894 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
28.1341 | 3.0 | 9 | 34.0193 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
25.7833 | 4.0 | 12 | 30.4463 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
22.2089 | 5.0 | 15 | 27.4840 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
22.6191 | 6.0 | 18 | 24.9990 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
19.4952 | 7.0 | 21 | 22.7142 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
17.1759 | 8.0 | 24 | 20.4355 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
16.223 | 9.0 | 27 | 18.0152 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
15.1072 | 10.0 | 30 | 15.2946 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
12.9231 | 11.0 | 33 | 12.3366 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
9.6196 | 12.0 | 36 | 8.8958 | 0.2032 | 0.1182 | 0.2023 | 0.2032 |
6.9818 | 13.0 | 39 | 5.8776 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
5.2532 | 14.0 | 42 | 4.8059 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.8358 | 15.0 | 45 | 4.5587 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.7146 | 16.0 | 48 | 4.4341 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.5485 | 17.0 | 51 | 4.3365 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.4082 | 18.0 | 54 | 4.2401 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.3961 | 19.0 | 57 | 4.1345 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.9847 | 20.0 | 60 | 4.0156 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.0299 | 21.0 | 63 | 3.8792 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.9243 | 22.0 | 66 | 3.7194 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
4.0614 | 23.0 | 69 | 3.5281 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.6469 | 24.0 | 72 | 3.2865 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.5482 | 25.0 | 75 | 2.9752 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.8188 | 26.0 | 78 | 2.6041 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
3.2782 | 27.0 | 81 | 2.2994 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.9194 | 28.0 | 84 | 2.2774 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.9811 | 29.0 | 87 | 2.3933 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.5988 | 30.0 | 90 | 2.3804 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.8451 | 31.0 | 93 | 2.1948 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.5427 | 32.0 | 96 | 1.8549 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.3422 | 33.0 | 99 | 1.5327 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.6946 | 34.0 | 102 | 1.3433 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.1959 | 35.0 | 105 | 1.2307 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.08 | 36.0 | 108 | 1.1113 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
2.0729 | 37.0 | 111 | 1.0123 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.9172 | 38.0 | 114 | 0.9357 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.9083 | 39.0 | 117 | 0.8667 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.7726 | 40.0 | 120 | 0.8094 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.5694 | 41.0 | 123 | 0.7332 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.4749 | 42.0 | 126 | 0.6772 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.2945 | 43.0 | 129 | 0.6321 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.5025 | 44.0 | 132 | 0.5897 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.0424 | 45.0 | 135 | 0.5535 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.2167 | 46.0 | 138 | 0.5190 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.3 | 47.0 | 141 | 0.4821 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.996 | 48.0 | 144 | 0.4602 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.1782 | 49.0 | 147 | 0.4337 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.9097 | 50.0 | 150 | 0.4124 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.9224 | 51.0 | 153 | 0.3861 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.8868 | 52.0 | 156 | 0.3642 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
1.1489 | 53.0 | 159 | 0.3529 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.8377 | 54.0 | 162 | 0.3435 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6534 | 55.0 | 165 | 0.3315 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6776 | 56.0 | 168 | 0.3173 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.8189 | 57.0 | 171 | 0.3078 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.732 | 58.0 | 174 | 0.2974 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6816 | 59.0 | 177 | 0.2839 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6722 | 60.0 | 180 | 0.2704 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.5579 | 61.0 | 183 | 0.2587 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6278 | 62.0 | 186 | 0.2478 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.5919 | 63.0 | 189 | 0.2366 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6178 | 64.0 | 192 | 0.2298 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.5226 | 65.0 | 195 | 0.2203 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.6231 | 66.0 | 198 | 0.2137 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.8007 | 67.0 | 201 | 0.2095 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.695 | 68.0 | 204 | 0.2042 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.3889 | 69.0 | 207 | 0.2000 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.566 | 70.0 | 210 | 0.1960 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.4185 | 71.0 | 213 | 0.1909 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.3276 | 72.0 | 216 | 0.1865 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.4881 | 73.0 | 219 | 0.1844 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.4183 | 74.0 | 222 | 0.1826 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.3544 | 75.0 | 225 | 0.1794 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.401 | 76.0 | 228 | 0.1740 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.3084 | 77.0 | 231 | 0.1677 | 0.2915 | 0.2206 | 0.2908 | 0.2912 |
0.4357 | 78.0 | 234 | 0.1621 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.444 | 79.0 | 237 | 0.1574 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.2813 | 80.0 | 240 | 0.1535 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.4731 | 81.0 | 243 | 0.1500 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.3088 | 82.0 | 246 | 0.1458 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
15.4329 | 83.0 | 249 | 0.1417 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.2968 | 84.0 | 252 | 0.1393 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.4953 | 85.0 | 255 | 0.1375 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.613 | 86.0 | 258 | 0.1366 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.2699 | 87.0 | 261 | 0.1362 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.6348 | 88.0 | 264 | 0.1359 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.2458 | 89.0 | 267 | 0.1355 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.3118 | 90.0 | 270 | 0.1346 | 0.5438 | 0.3059 | 0.5384 | 0.5377 |
0.226 | 91.0 | 273 | 0.1338 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2703 | 92.0 | 276 | 0.1331 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2133 | 93.0 | 279 | 0.1326 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2071 | 94.0 | 282 | 0.1319 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2706 | 95.0 | 285 | 0.1311 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2638 | 96.0 | 288 | 0.1301 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.1659 | 97.0 | 291 | 0.1292 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3739 | 98.0 | 294 | 0.1282 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3095 | 99.0 | 297 | 0.1272 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.5029 | 100.0 | 300 | 0.1263 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3424 | 101.0 | 303 | 0.1257 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2365 | 102.0 | 306 | 0.1254 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3384 | 103.0 | 309 | 0.1251 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3175 | 104.0 | 312 | 0.1249 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.1915 | 105.0 | 315 | 0.1246 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.307 | 106.0 | 318 | 0.1243 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3335 | 107.0 | 321 | 0.1239 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2385 | 108.0 | 324 | 0.1236 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2547 | 109.0 | 327 | 0.1233 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.415 | 110.0 | 330 | 0.1231 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.1968 | 111.0 | 333 | 0.1229 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3445 | 112.0 | 336 | 0.1228 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.1515 | 113.0 | 339 | 0.1226 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.1373 | 114.0 | 342 | 0.1226 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2897 | 115.0 | 345 | 0.1226 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2362 | 116.0 | 348 | 0.1226 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2475 | 117.0 | 351 | 0.1225 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2224 | 118.0 | 354 | 0.1224 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.2224 | 119.0 | 357 | 0.1223 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
0.3743 | 120.0 | 360 | 0.1223 | 0.5056 | 0.2117 | 0.4942 | 0.4985 |
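The Step column above advances by 3 per epoch (360 steps over 120 epochs), which, combined with the train batch size of 24, bounds the size of the training set. The sketch below works out that bound. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these settings is stated in the card, and the helper name `dataset_size_bounds` is hypothetical.

```python
from typing import Tuple


def dataset_size_bounds(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest example counts that yield `steps_per_epoch` batches.

    Assumes steps_per_epoch == ceil(num_examples / batch_size).
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


steps_per_epoch = 360 // 120  # total steps / epochs, from the table
low, high = dataset_size_bounds(steps_per_epoch, batch_size=24)
print(f"between {low} and {high} training examples")
```

Under those assumptions the training set holds between 49 and 72 examples, so the metrics in this card come from a very small dataset.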
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
Model tree for devagonal/flan-t5-rouge-squad-qg-120d
Base model
google/flan-t5-base