# flan-t5-rouge-squad-qg-120

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2047
- Rouge1: 0.3778
- Rouge2: 0.1232
- Rougel: 0.3460
- Rougelsum: 0.3447
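As a quick usage sketch, the model loads with the standard `transformers` seq2seq API. Note that the prompt format used for fine-tuning is not documented, so the `generate question:` prefix and the example context below are assumptions based on common SQuAD-style question-generation setups:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "devagonal/flan-t5-rouge-squad-qg-120"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# NOTE: the training prompt format is undocumented; this prefix is an
# assumption based on common SQuAD-style question-generation fine-tunes.
context = (
    "The Amazon rainforest covers much of the Amazon basin of South America, "
    "spanning an area of roughly 5.5 million square kilometres."
)
inputs = tokenizer(f"generate question: {context}", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```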
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 120
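For reference, a minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The training data and preprocessing are undocumented, so the one-example toy dataset and the `generate question:` prompt below are placeholders, and `output_dir` is arbitrary:

```python
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_id = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder data: the original dataset is unknown.
raw = Dataset.from_dict({
    "context": ["Paris is the capital of France."],
    "question": ["What is the capital of France?"],
})

def preprocess(batch):
    # Assumed prompt format; the collator handles padding.
    model_inputs = tokenizer(
        [f"generate question: {c}" for c in batch["context"]],
        truncation=True,
    )
    labels = tokenizer(text_target=batch["question"], truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

# Hyperparameters taken from the list above.
args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-120",
    learning_rate=3e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=120,
    eval_strategy="epoch",
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,  # placeholder
    eval_dataset=tokenized,   # placeholder
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```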
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
21.4853 | 1.0 | 3 | 20.7247 | 0.2045 | 0.1177 | 0.2042 | 0.2036 |
10.7565 | 2.0 | 6 | 5.6408 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
4.4318 | 3.0 | 9 | 4.3101 | 0.2527 | 0.1164 | 0.2023 | 0.2050 |
4.0278 | 4.0 | 12 | 3.8106 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
3.3203 | 5.0 | 15 | 2.8678 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
2.2918 | 6.0 | 18 | 1.2277 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
1.5238 | 7.0 | 21 | 0.7362 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
1.0426 | 8.0 | 24 | 0.3962 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
0.593 | 9.0 | 27 | 0.2734 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
0.4585 | 10.0 | 30 | 0.2091 | 0.2934 | 0.2204 | 0.2927 | 0.2929 |
0.3537 | 11.0 | 33 | 0.1685 | 0.2605 | 0.1185 | 0.2016 | 0.2043 |
0.1138 | 12.0 | 36 | 0.1311 | 0.4951 | 0.1967 | 0.4704 | 0.4876 |
0.1047 | 13.0 | 39 | 0.1180 | 0.4712 | 0.1555 | 0.4095 | 0.4480 |
0.2469 | 14.0 | 42 | 0.1190 | 0.4951 | 0.1967 | 0.4704 | 0.4876 |
0.0782 | 15.0 | 45 | 0.1245 | 0.4405 | 0.1357 | 0.4204 | 0.4368 |
0.2755 | 16.0 | 48 | 0.1303 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.1193 | 17.0 | 51 | 0.1342 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0934 | 18.0 | 54 | 0.1348 | 0.4524 | 0.0987 | 0.3327 | 0.4066 |
0.0928 | 19.0 | 57 | 0.1352 | 0.4524 | 0.0987 | 0.3327 | 0.4066 |
0.0683 | 20.0 | 60 | 0.1395 | 0.4024 | 0.0945 | 0.3315 | 0.3585 |
0.0776 | 21.0 | 63 | 0.1469 | 0.4712 | 0.1555 | 0.4095 | 0.4480 |
0.1416 | 22.0 | 66 | 0.1507 | 0.2259 | 0.0923 | 0.2173 | 0.2230 |
0.152 | 23.0 | 69 | 0.1488 | 0.4712 | 0.1555 | 0.4095 | 0.4480 |
0.0531 | 24.0 | 72 | 0.1473 | 0.3297 | 0.0912 | 0.2907 | 0.3103 |
0.0311 | 25.0 | 75 | 0.1488 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0782 | 26.0 | 78 | 0.1527 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0339 | 27.0 | 81 | 0.1591 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0287 | 28.0 | 84 | 0.1695 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0149 | 29.0 | 87 | 0.1806 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0117 | 30.0 | 90 | 0.1886 | 0.4315 | 0.0735 | 0.3177 | 0.3818 |
0.0179 | 31.0 | 93 | 0.1934 | 0.2847 | 0.0914 | 0.2804 | 0.2809 |
0.0533 | 32.0 | 96 | 0.1932 | 0.2847 | 0.0914 | 0.2804 | 0.2809 |
0.0283 | 33.0 | 99 | 0.1883 | 0.2847 | 0.0914 | 0.2804 | 0.2809 |
0.0177 | 34.0 | 102 | 0.1875 | 0.2847 | 0.0914 | 0.2804 | 0.2809 |
0.027 | 35.0 | 105 | 0.1902 | 0.2847 | 0.0914 | 0.2804 | 0.2809 |
0.0377 | 36.0 | 108 | 0.1938 | 0.2613 | 0.0667 | 0.2499 | 0.2578 |
0.0409 | 37.0 | 111 | 0.1960 | 0.2613 | 0.0667 | 0.2499 | 0.2578 |
0.026 | 38.0 | 114 | 0.1957 | 0.2921 | 0.0667 | 0.2752 | 0.2881 |
0.0274 | 39.0 | 117 | 0.1959 | 0.2921 | 0.0667 | 0.2752 | 0.2881 |
0.0329 | 40.0 | 120 | 0.1972 | 0.2921 | 0.0667 | 0.2752 | 0.2881 |
0.0316 | 41.0 | 123 | 0.1917 | 0.2921 | 0.0667 | 0.2752 | 0.2881 |
0.0297 | 42.0 | 126 | 0.1855 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0182 | 43.0 | 129 | 0.1845 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.027 | 44.0 | 132 | 0.1842 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0256 | 45.0 | 135 | 0.1851 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0615 | 46.0 | 138 | 0.1856 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0276 | 47.0 | 141 | 0.1865 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0208 | 48.0 | 144 | 0.1884 | 0.2207 | 0.0899 | 0.2124 | 0.2174 |
0.0333 | 49.0 | 147 | 0.1906 | 0.3588 | 0.0677 | 0.3030 | 0.3287 |
0.0224 | 50.0 | 150 | 0.1920 | 0.3817 | 0.0899 | 0.3278 | 0.3652 |
0.0228 | 51.0 | 153 | 0.1930 | 0.3817 | 0.0899 | 0.3278 | 0.3652 |
0.0196 | 52.0 | 156 | 0.1967 | 0.2694 | 0.0981 | 0.2638 | 0.2652 |
0.0329 | 53.0 | 159 | 0.2023 | 0.2694 | 0.0981 | 0.2638 | 0.2652 |
0.0366 | 54.0 | 162 | 0.2037 | 0.2694 | 0.0981 | 0.2638 | 0.2652 |
0.0183 | 55.0 | 165 | 0.2009 | 0.2694 | 0.0981 | 0.2638 | 0.2652 |
0.0426 | 56.0 | 168 | 0.1989 | 0.2694 | 0.0981 | 0.2638 | 0.2652 |
0.0339 | 57.0 | 171 | 0.1964 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0184 | 58.0 | 174 | 0.1947 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0296 | 59.0 | 177 | 0.1935 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0314 | 60.0 | 180 | 0.1928 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0168 | 61.0 | 183 | 0.1924 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0174 | 62.0 | 186 | 0.1931 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0237 | 63.0 | 189 | 0.1944 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0153 | 64.0 | 192 | 0.1959 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0395 | 65.0 | 195 | 0.1952 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0329 | 66.0 | 198 | 0.1910 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0203 | 67.0 | 201 | 0.1879 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0305 | 68.0 | 204 | 0.1875 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0184 | 69.0 | 207 | 0.1878 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0178 | 70.0 | 210 | 0.1894 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0145 | 71.0 | 213 | 0.1921 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0158 | 72.0 | 216 | 0.1952 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0148 | 73.0 | 219 | 0.1970 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0171 | 74.0 | 222 | 0.1981 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0189 | 75.0 | 225 | 0.1990 | 0.3014 | 0.0638 | 0.2775 | 0.2777 |
0.0116 | 76.0 | 228 | 0.2000 | 0.3014 | 0.0638 | 0.2775 | 0.2777 |
0.0107 | 77.0 | 231 | 0.2016 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0166 | 78.0 | 234 | 0.2029 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.011 | 79.0 | 237 | 0.2042 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0128 | 80.0 | 240 | 0.2061 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0312 | 81.0 | 243 | 0.2064 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0156 | 82.0 | 246 | 0.2052 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0227 | 83.0 | 249 | 0.2049 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0221 | 84.0 | 252 | 0.2049 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0147 | 85.0 | 255 | 0.2054 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0161 | 86.0 | 258 | 0.2065 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0335 | 87.0 | 261 | 0.2077 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0177 | 88.0 | 264 | 0.2087 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0147 | 89.0 | 267 | 0.2096 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.02 | 90.0 | 270 | 0.2102 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0245 | 91.0 | 273 | 0.2112 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0163 | 92.0 | 276 | 0.2129 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0131 | 93.0 | 279 | 0.2144 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0167 | 94.0 | 282 | 0.2156 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0089 | 95.0 | 285 | 0.2166 | 0.4122 | 0.1584 | 0.3901 | 0.4045 |
0.0483 | 96.0 | 288 | 0.2162 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.024 | 97.0 | 291 | 0.2137 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0185 | 98.0 | 294 | 0.2112 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0126 | 99.0 | 297 | 0.2094 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0244 | 100.0 | 300 | 0.2082 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0157 | 101.0 | 303 | 0.2074 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0128 | 102.0 | 306 | 0.2067 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0287 | 103.0 | 309 | 0.2062 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0365 | 104.0 | 312 | 0.2052 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0116 | 105.0 | 315 | 0.2044 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0092 | 106.0 | 318 | 0.2038 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.023 | 107.0 | 321 | 0.2036 | 0.2709 | 0.0639 | 0.2565 | 0.2670 |
0.0316 | 108.0 | 324 | 0.2034 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0186 | 109.0 | 327 | 0.2031 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0225 | 110.0 | 330 | 0.2031 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0183 | 111.0 | 333 | 0.2032 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0122 | 112.0 | 336 | 0.2034 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0147 | 113.0 | 339 | 0.2037 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0176 | 114.0 | 342 | 0.2040 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0227 | 115.0 | 345 | 0.2042 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0144 | 116.0 | 348 | 0.2044 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0151 | 117.0 | 351 | 0.2045 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0172 | 118.0 | 354 | 0.2046 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0157 | 119.0 | 357 | 0.2046 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
0.0329 | 120.0 | 360 | 0.2047 | 0.3778 | 0.1232 | 0.3460 | 0.3447 |
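The ROUGE columns above are presumably F-measures from the `evaluate` library's `rouge` metric (the exact evaluation code is not documented); a hedged sketch of that computation on decoded predictions:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy strings standing in for decoded model outputs and reference questions.
predictions = ["what is the capital of france"]
references = ["what is the capital of france ?"]

# Returns rouge1, rouge2, rougeL, and rougeLsum scores in [0, 1],
# matching the Rouge1/Rouge2/Rougel/Rougelsum columns above.
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```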
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0