---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: tl-bic-model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# tl-bic-model

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
After the final epoch (epoch 200, step 2200), it achieves the following results on the evaluation set:
- Loss: 0.0048
- Bleu: 9.1518
- Gen Len: 9.681

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
- mixed_precision_training: Native AMP

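As a rough illustration of what `lr_scheduler_type: linear` means for this run, the sketch below computes the learning rate after a given number of optimizer steps. It assumes no warmup (the list above mentions none) and takes the 2200 total steps from the results table; the function name `linear_lr` and its `warmup_steps` parameter are illustrative, not part of the original training script.

```python
def linear_lr(step, base_lr=1e-3, total_steps=2200, warmup_steps=0):
    """Learning rate after `step` optimizer updates under a linear schedule.

    base_lr comes from the hyperparameter list; total_steps is the final
    step in the results table. warmup_steps=0 is an assumption.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 550, 1100, 1650, 2200):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, falls to about half that at step 1100 (epoch 100), and reaches 0 at step 2200.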
### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 11   | 2.9595          | 0.2068 | 9.7301  |
| No log        | 2.0   | 22   | 2.5919          | 0.4412 | 10.0736 |
| No log        | 3.0   | 33   | 2.2077          | 0.9166 | 9.6626  |
| No log        | 4.0   | 44   | 1.9446          | 0.7991 | 9.8037  |
| No log        | 5.0   | 55   | 1.6666          | 0.8674 | 9.8221  |
| No log        | 6.0   | 66   | 1.4209          | 1.0262 | 10.0613 |
| No log        | 7.0   | 77   | 1.1828          | 1.573  | 9.9693  |
| No log        | 8.0   | 88   | 0.9715          | 1.6163 | 9.9509  |
| No log        | 9.0   | 99   | 0.8203          | 2.1844 | 9.7362  |
| No log        | 10.0  | 110  | 0.6698          | 2.193  | 9.6687  |
| No log        | 11.0  | 121  | 0.5533          | 3.1733 | 9.816   |
| No log        | 12.0  | 132  | 0.4650          | 3.0054 | 9.6687  |
| No log        | 13.0  | 143  | 0.3783          | 3.5488 | 9.6012  |
| No log        | 14.0  | 154  | 0.3130          | 4.1709 | 9.7362  |
| No log        | 15.0  | 165  | 0.2620          | 4.9365 | 9.6442  |
| No log        | 16.0  | 176  | 0.2351          | 5.5276 | 9.546   |
| No log        | 17.0  | 187  | 0.1953          | 5.6558 | 9.6074  |
| No log        | 18.0  | 198  | 0.1524          | 6.4656 | 9.6503  |
| No log        | 19.0  | 209  | 0.1226          | 6.9583 | 9.5828  |
| No log        | 20.0  | 220  | 0.0953          | 7.7977 | 9.5951  |
| No log        | 21.0  | 231  | 0.0766          | 7.7172 | 9.638   |
| No log        | 22.0  | 242  | 0.0633          | 8.2632 | 9.6135  |
| No log        | 23.0  | 253  | 0.0581          | 8.3314 | 9.6135  |
| No log        | 24.0  | 264  | 0.0478          | 8.6339 | 9.6564  |
| No log        | 25.0  | 275  | 0.0379          | 8.4599 | 9.681   |
| No log        | 26.0  | 286  | 0.0349          | 8.8518 | 9.681   |
| No log        | 27.0  | 297  | 0.0284          | 8.6561 | 9.6994  |
| No log        | 28.0  | 308  | 0.0215          | 8.8647 | 9.6748  |
| No log        | 29.0  | 319  | 0.0189          | 8.8318 | 9.681   |
| No log        | 30.0  | 330  | 0.0211          | 8.7839 | 9.681   |
| No log        | 31.0  | 341  | 0.0223          | 9.0581 | 9.6687  |
| No log        | 32.0  | 352  | 0.0172          | 9.0431 | 9.6687  |
| No log        | 33.0  | 363  | 0.0131          | 9.0838 | 9.681   |
| No log        | 34.0  | 374  | 0.0152          | 8.9549 | 9.681   |
| No log        | 35.0  | 385  | 0.0121          | 9.0402 | 9.681   |
| No log        | 36.0  | 396  | 0.0178          | 9.1416 | 9.6442  |
| No log        | 37.0  | 407  | 0.0161          | 9.0402 | 9.6564  |
| No log        | 38.0  | 418  | 0.0139          | 9.1518 | 9.681   |
| No log        | 39.0  | 429  | 0.0162          | 9.0741 | 9.681   |
| No log        | 40.0  | 440  | 0.0126          | 9.1518 | 9.681   |
| No log        | 41.0  | 451  | 0.0108          | 9.0897 | 9.681   |
| No log        | 42.0  | 462  | 0.0144          | 9.0836 | 9.6933  |
| No log        | 43.0  | 473  | 0.0238          | 9.1129 | 9.6871  |
| No log        | 44.0  | 484  | 0.0075          | 9.1518 | 9.681   |
| No log        | 45.0  | 495  | 0.0108          | 8.9628 | 9.681   |
| 0.7724        | 46.0  | 506  | 0.0071          | 8.9863 | 9.681   |
| 0.7724        | 47.0  | 517  | 0.0087          | 9.1518 | 9.681   |
| 0.7724        | 48.0  | 528  | 0.0082          | 9.1518 | 9.681   |
| 0.7724        | 49.0  | 539  | 0.0064          | 9.1518 | 9.681   |
| 0.7724        | 50.0  | 550  | 0.0095          | 9.1518 | 9.681   |
| 0.7724        | 51.0  | 561  | 0.0090          | 9.1518 | 9.681   |
| 0.7724        | 52.0  | 572  | 0.0091          | 9.1801 | 9.681   |
| 0.7724        | 53.0  | 583  | 0.0105          | 9.1801 | 9.681   |
| 0.7724        | 54.0  | 594  | 0.0180          | 8.9309 | 9.681   |
| 0.7724        | 55.0  | 605  | 0.0123          | 9.1518 | 9.681   |
| 0.7724        | 56.0  | 616  | 0.0119          | 9.1518 | 9.681   |
| 0.7724        | 57.0  | 627  | 0.0061          | 9.1518 | 9.681   |
| 0.7724        | 58.0  | 638  | 0.0082          | 9.1518 | 9.681   |
| 0.7724        | 59.0  | 649  | 0.0059          | 9.1518 | 9.681   |
| 0.7724        | 60.0  | 660  | 0.0146          | 9.0639 | 9.681   |
| 0.7724        | 61.0  | 671  | 0.0123          | 9.0639 | 9.681   |
| 0.7724        | 62.0  | 682  | 0.0084          | 9.0639 | 9.681   |
| 0.7724        | 63.0  | 693  | 0.0122          | 9.0639 | 9.681   |
| 0.7724        | 64.0  | 704  | 0.0319          | 9.1518 | 9.681   |
| 0.7724        | 65.0  | 715  | 0.0142          | 9.1518 | 9.681   |
| 0.7724        | 66.0  | 726  | 0.0086          | 9.1518 | 9.681   |
| 0.7724        | 67.0  | 737  | 0.0078          | 9.0847 | 9.681   |
| 0.7724        | 68.0  | 748  | 0.0122          | 9.1518 | 9.681   |
| 0.7724        | 69.0  | 759  | 0.0092          | 9.1518 | 9.681   |
| 0.7724        | 70.0  | 770  | 0.0059          | 9.1518 | 9.681   |
| 0.7724        | 71.0  | 781  | 0.0090          | 9.0944 | 9.6871  |
| 0.7724        | 72.0  | 792  | 0.0127          | 9.0944 | 9.6871  |
| 0.7724        | 73.0  | 803  | 0.0108          | 9.0944 | 9.6871  |
| 0.7724        | 74.0  | 814  | 0.0091          | 9.1518 | 9.681   |
| 0.7724        | 75.0  | 825  | 0.0073          | 9.1518 | 9.681   |
| 0.7724        | 76.0  | 836  | 0.0112          | 9.1518 | 9.681   |
| 0.7724        | 77.0  | 847  | 0.0113          | 9.1518 | 9.681   |
| 0.7724        | 78.0  | 858  | 0.0093          | 9.1518 | 9.681   |
| 0.7724        | 79.0  | 869  | 0.0048          | 9.1518 | 9.681   |
| 0.7724        | 80.0  | 880  | 0.0064          | 9.1518 | 9.681   |
| 0.7724        | 81.0  | 891  | 0.0102          | 9.1518 | 9.681   |
| 0.7724        | 82.0  | 902  | 0.0110          | 9.1467 | 9.6748  |
| 0.7724        | 83.0  | 913  | 0.0104          | 9.1467 | 9.6748  |
| 0.7724        | 84.0  | 924  | 0.0089          | 9.1467 | 9.6748  |
| 0.7724        | 85.0  | 935  | 0.0078          | 9.1518 | 9.681   |
| 0.7724        | 86.0  | 946  | 0.0067          | 9.1518 | 9.681   |
| 0.7724        | 87.0  | 957  | 0.0047          | 9.1518 | 9.681   |
| 0.7724        | 88.0  | 968  | 0.0047          | 9.1518 | 9.681   |
| 0.7724        | 89.0  | 979  | 0.0058          | 9.1518 | 9.681   |
| 0.7724        | 90.0  | 990  | 0.0045          | 9.1518 | 9.681   |
| 0.0426        | 91.0  | 1001 | 0.0087          | 9.1518 | 9.681   |
| 0.0426        | 92.0  | 1012 | 0.0096          | 9.1518 | 9.681   |
| 0.0426        | 93.0  | 1023 | 0.0063          | 9.1518 | 9.681   |
| 0.0426        | 94.0  | 1034 | 0.0076          | 9.1518 | 9.681   |
| 0.0426        | 95.0  | 1045 | 0.0055          | 9.1518 | 9.681   |
| 0.0426        | 96.0  | 1056 | 0.0054          | 9.1518 | 9.681   |
| 0.0426        | 97.0  | 1067 | 0.0052          | 9.1518 | 9.681   |
| 0.0426        | 98.0  | 1078 | 0.0046          | 9.1518 | 9.681   |
| 0.0426        | 99.0  | 1089 | 0.0100          | 9.1518 | 9.681   |
| 0.0426        | 100.0 | 1100 | 0.0104          | 9.1518 | 9.681   |
| 0.0426        | 101.0 | 1111 | 0.0180          | 9.1518 | 9.681   |
| 0.0426        | 102.0 | 1122 | 0.0208          | 9.1518 | 9.681   |
| 0.0426        | 103.0 | 1133 | 0.0159          | 9.1518 | 9.681   |
| 0.0426        | 104.0 | 1144 | 0.0139          | 9.1518 | 9.681   |
| 0.0426        | 105.0 | 1155 | 0.0122          | 9.1518 | 9.681   |
| 0.0426        | 106.0 | 1166 | 0.0080          | 9.1518 | 9.681   |
| 0.0426        | 107.0 | 1177 | 0.0063          | 9.1518 | 9.681   |
| 0.0426        | 108.0 | 1188 | 0.0149          | 9.1467 | 9.6687  |
| 0.0426        | 109.0 | 1199 | 0.0147          | 9.1518 | 9.681   |
| 0.0426        | 110.0 | 1210 | 0.0113          | 9.1518 | 9.681   |
| 0.0426        | 111.0 | 1221 | 0.0170          | 9.1518 | 9.681   |
| 0.0426        | 112.0 | 1232 | 0.0138          | 9.1518 | 9.681   |
| 0.0426        | 113.0 | 1243 | 0.0129          | 9.1518 | 9.681   |
| 0.0426        | 114.0 | 1254 | 0.0095          | 9.1518 | 9.681   |
| 0.0426        | 115.0 | 1265 | 0.0133          | 9.1518 | 9.681   |
| 0.0426        | 116.0 | 1276 | 0.0128          | 9.1518 | 9.681   |
| 0.0426        | 117.0 | 1287 | 0.0112          | 9.1518 | 9.681   |
| 0.0426        | 118.0 | 1298 | 0.0093          | 9.1518 | 9.681   |
| 0.0426        | 119.0 | 1309 | 0.0066          | 9.1518 | 9.681   |
| 0.0426        | 120.0 | 1320 | 0.0048          | 9.1518 | 9.681   |
| 0.0426        | 121.0 | 1331 | 0.0079          | 9.1518 | 9.681   |
| 0.0426        | 122.0 | 1342 | 0.0095          | 9.1518 | 9.681   |
| 0.0426        | 123.0 | 1353 | 0.0069          | 9.1518 | 9.681   |
| 0.0426        | 124.0 | 1364 | 0.0056          | 9.1518 | 9.681   |
| 0.0426        | 125.0 | 1375 | 0.0049          | 9.1518 | 9.681   |
| 0.0426        | 126.0 | 1386 | 0.0066          | 9.1518 | 9.681   |
| 0.0426        | 127.0 | 1397 | 0.0080          | 9.1518 | 9.681   |
| 0.0426        | 128.0 | 1408 | 0.0073          | 9.1467 | 9.6687  |
| 0.0426        | 129.0 | 1419 | 0.0063          | 9.1518 | 9.681   |
| 0.0426        | 130.0 | 1430 | 0.0063          | 9.1518 | 9.681   |
| 0.0426        | 131.0 | 1441 | 0.0051          | 9.1518 | 9.681   |
| 0.0426        | 132.0 | 1452 | 0.0045          | 9.1518 | 9.681   |
| 0.0426        | 133.0 | 1463 | 0.0061          | 9.1518 | 9.681   |
| 0.0426        | 134.0 | 1474 | 0.0061          | 9.1518 | 9.681   |
| 0.0426        | 135.0 | 1485 | 0.0042          | 9.1518 | 9.681   |
| 0.0426        | 136.0 | 1496 | 0.0043          | 9.1518 | 9.681   |
| 0.0153        | 137.0 | 1507 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 138.0 | 1518 | 0.0056          | 9.1518 | 9.681   |
| 0.0153        | 139.0 | 1529 | 0.0113          | 9.1518 | 9.681   |
| 0.0153        | 140.0 | 1540 | 0.0054          | 9.1518 | 9.681   |
| 0.0153        | 141.0 | 1551 | 0.0054          | 9.1518 | 9.681   |
| 0.0153        | 142.0 | 1562 | 0.0058          | 9.1518 | 9.681   |
| 0.0153        | 143.0 | 1573 | 0.0048          | 9.1518 | 9.681   |
| 0.0153        | 144.0 | 1584 | 0.0049          | 9.1518 | 9.681   |
| 0.0153        | 145.0 | 1595 | 0.0047          | 9.1518 | 9.681   |
| 0.0153        | 146.0 | 1606 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 147.0 | 1617 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 148.0 | 1628 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 149.0 | 1639 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 150.0 | 1650 | 0.0048          | 9.1518 | 9.681   |
| 0.0153        | 151.0 | 1661 | 0.0054          | 9.1518 | 9.681   |
| 0.0153        | 152.0 | 1672 | 0.0058          | 9.1518 | 9.681   |
| 0.0153        | 153.0 | 1683 | 0.0057          | 9.1518 | 9.681   |
| 0.0153        | 154.0 | 1694 | 0.0056          | 9.1518 | 9.681   |
| 0.0153        | 155.0 | 1705 | 0.0056          | 9.1518 | 9.681   |
| 0.0153        | 156.0 | 1716 | 0.0061          | 9.1518 | 9.681   |
| 0.0153        | 157.0 | 1727 | 0.0062          | 9.1518 | 9.681   |
| 0.0153        | 158.0 | 1738 | 0.0060          | 9.1518 | 9.681   |
| 0.0153        | 159.0 | 1749 | 0.0060          | 9.1518 | 9.681   |
| 0.0153        | 160.0 | 1760 | 0.0061          | 9.1518 | 9.681   |
| 0.0153        | 161.0 | 1771 | 0.0052          | 9.1518 | 9.681   |
| 0.0153        | 162.0 | 1782 | 0.0049          | 9.1518 | 9.681   |
| 0.0153        | 163.0 | 1793 | 0.0047          | 9.1518 | 9.681   |
| 0.0153        | 164.0 | 1804 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 165.0 | 1815 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 166.0 | 1826 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 167.0 | 1837 | 0.0048          | 9.1518 | 9.681   |
| 0.0153        | 168.0 | 1848 | 0.0052          | 9.1518 | 9.681   |
| 0.0153        | 169.0 | 1859 | 0.0051          | 9.1518 | 9.681   |
| 0.0153        | 170.0 | 1870 | 0.0055          | 9.1518 | 9.681   |
| 0.0153        | 171.0 | 1881 | 0.0056          | 9.1518 | 9.681   |
| 0.0153        | 172.0 | 1892 | 0.0051          | 9.1518 | 9.681   |
| 0.0153        | 173.0 | 1903 | 0.0050          | 9.1518 | 9.681   |
| 0.0153        | 174.0 | 1914 | 0.0048          | 9.1518 | 9.681   |
| 0.0153        | 175.0 | 1925 | 0.0048          | 9.1518 | 9.681   |
| 0.0153        | 176.0 | 1936 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 177.0 | 1947 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 178.0 | 1958 | 0.0045          | 9.1518 | 9.681   |
| 0.0153        | 179.0 | 1969 | 0.0044          | 9.1518 | 9.681   |
| 0.0153        | 180.0 | 1980 | 0.0046          | 9.1518 | 9.681   |
| 0.0153        | 181.0 | 1991 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 182.0 | 2002 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 183.0 | 2013 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 184.0 | 2024 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 185.0 | 2035 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 186.0 | 2046 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 187.0 | 2057 | 0.0046          | 9.1518 | 9.681   |
| 0.007         | 188.0 | 2068 | 0.0047          | 9.1518 | 9.681   |
| 0.007         | 189.0 | 2079 | 0.0047          | 9.1518 | 9.681   |
| 0.007         | 190.0 | 2090 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 191.0 | 2101 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 192.0 | 2112 | 0.0049          | 9.1518 | 9.681   |
| 0.007         | 193.0 | 2123 | 0.0049          | 9.1518 | 9.681   |
| 0.007         | 194.0 | 2134 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 195.0 | 2145 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 196.0 | 2156 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 197.0 | 2167 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 198.0 | 2178 | 0.0048          | 9.1518 | 9.681   |
| 0.007         | 199.0 | 2189 | 0.0049          | 9.1518 | 9.681   |
| 0.007         | 200.0 | 2200 | 0.0048          | 9.1518 | 9.681   |

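The step counts in the table can be cross-checked with a little arithmetic. The sketch below assumes a single device, no gradient accumulation, and a training-loss logging interval of 500 steps (the `Trainer` default; the card does not state it). The helper names are hypothetical.

```python
import math

TRAIN_BATCH_SIZE = 16  # from the hyperparameter list
STEPS_PER_EPOCH = 11   # step count at epoch 1.0 in the table
NUM_EPOCHS = 200       # from the hyperparameter list
LOGGING_STEPS = 500    # assumption: Trainer default, not stated in the card


def dataset_size_bounds(steps_per_epoch, batch_size):
    """Range of example counts N with ceil(N / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


def epochs_with_new_logged_loss(steps_per_epoch, logging_steps, num_epochs):
    """Epochs whose evaluation row is the first to show each logged training loss."""
    total = steps_per_epoch * num_epochs
    return [
        math.ceil(s / steps_per_epoch)
        for s in range(logging_steps, total + 1, logging_steps)
    ]


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

if __name__ == "__main__":
    print(total_steps)
    print(dataset_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
    print(epochs_with_new_logged_loss(STEPS_PER_EPOCH, LOGGING_STEPS, NUM_EPOCHS))
```

If those assumptions hold, the training set contains between 161 and 176 examples, and the training loss column would first show a value at epoch 46 and change again at epochs 91, 137, and 182, which is consistent with the "No log" rows and the four distinct training-loss values in the table.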

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0