ParitKansal committed on
Commit 6b4b641 · verified · 1 Parent(s): e69a1d2

Upload notebook.ipynb

Notebook showing how the model is trained.

Files changed (1)
  1. notebook.ipynb +2064 -0
notebook.ipynb ADDED
@@ -0,0 +1,2064 @@
1
+ {
2
+ "nbformat": 4,
3
+ "nbformat_minor": 0,
4
+ "metadata": {
5
+ "colab": {
6
+ "provenance": [],
7
+ "gpuType": "T4"
8
+ },
9
+ "kernelspec": {
10
+ "name": "python3",
11
+ "display_name": "Python 3"
12
+ },
13
+ "language_info": {
14
+ "name": "python"
15
+ },
16
+ "accelerator": "GPU",
17
+ "widgets": {
18
+ "application/vnd.jupyter.widget-state+json": {
19
+ "5479c909ae014cb4af686f98dfd896cb": {
20
+ "model_module": "@jupyter-widgets/controls",
21
+ "model_name": "HBoxModel",
22
+ "model_module_version": "1.5.0",
23
+ "state": {
24
+ "_dom_classes": [],
25
+ "_model_module": "@jupyter-widgets/controls",
26
+ "_model_module_version": "1.5.0",
27
+ "_model_name": "HBoxModel",
28
+ "_view_count": null,
29
+ "_view_module": "@jupyter-widgets/controls",
30
+ "_view_module_version": "1.5.0",
31
+ "_view_name": "HBoxView",
32
+ "box_style": "",
33
+ "children": [
34
+ "IPY_MODEL_77cd1a0126ea48f6bc358c52b5b22f3e",
35
+ "IPY_MODEL_452ca86ad5f44319a93031962f10fedf",
36
+ "IPY_MODEL_fa32534d4cf349d2a1efac6496277d2e"
37
+ ],
38
+ "layout": "IPY_MODEL_cb16cc0f8c6847cda083dc0b2f186083"
39
+ }
40
+ },
41
+ "77cd1a0126ea48f6bc358c52b5b22f3e": {
42
+ "model_module": "@jupyter-widgets/controls",
43
+ "model_name": "HTMLModel",
44
+ "model_module_version": "1.5.0",
45
+ "state": {
46
+ "_dom_classes": [],
47
+ "_model_module": "@jupyter-widgets/controls",
48
+ "_model_module_version": "1.5.0",
49
+ "_model_name": "HTMLModel",
50
+ "_view_count": null,
51
+ "_view_module": "@jupyter-widgets/controls",
52
+ "_view_module_version": "1.5.0",
53
+ "_view_name": "HTMLView",
54
+ "description": "",
55
+ "description_tooltip": null,
56
+ "layout": "IPY_MODEL_3b59793ca26745ea8b3780e07447a56c",
57
+ "placeholder": "​",
58
+ "style": "IPY_MODEL_bf13f411d2db420fb25d13d212ba257a",
59
+ "value": "Map: 100%"
60
+ }
61
+ },
62
+ "452ca86ad5f44319a93031962f10fedf": {
63
+ "model_module": "@jupyter-widgets/controls",
64
+ "model_name": "FloatProgressModel",
65
+ "model_module_version": "1.5.0",
66
+ "state": {
67
+ "_dom_classes": [],
68
+ "_model_module": "@jupyter-widgets/controls",
69
+ "_model_module_version": "1.5.0",
70
+ "_model_name": "FloatProgressModel",
71
+ "_view_count": null,
72
+ "_view_module": "@jupyter-widgets/controls",
73
+ "_view_module_version": "1.5.0",
74
+ "_view_name": "ProgressView",
75
+ "bar_style": "success",
76
+ "description": "",
77
+ "description_tooltip": null,
78
+ "layout": "IPY_MODEL_075f0860a2ca452c8b1438aeae8e3d18",
79
+ "max": 119,
80
+ "min": 0,
81
+ "orientation": "horizontal",
82
+ "style": "IPY_MODEL_555d876e65c9419b82633841914086d9",
83
+ "value": 119
84
+ }
85
+ },
86
+ "fa32534d4cf349d2a1efac6496277d2e": {
87
+ "model_module": "@jupyter-widgets/controls",
88
+ "model_name": "HTMLModel",
89
+ "model_module_version": "1.5.0",
90
+ "state": {
91
+ "_dom_classes": [],
92
+ "_model_module": "@jupyter-widgets/controls",
93
+ "_model_module_version": "1.5.0",
94
+ "_model_name": "HTMLModel",
95
+ "_view_count": null,
96
+ "_view_module": "@jupyter-widgets/controls",
97
+ "_view_module_version": "1.5.0",
98
+ "_view_name": "HTMLView",
99
+ "description": "",
100
+ "description_tooltip": null,
101
+ "layout": "IPY_MODEL_e07fb7d992e04473ad7bac9248e7aa75",
102
+ "placeholder": "​",
103
+ "style": "IPY_MODEL_4cf7dbf2cb6d4581a356970de1996ec6",
104
+ "value": " 119/119 [00:00<00:00, 1144.30 examples/s]"
105
+ }
106
+ },
107
+ "cb16cc0f8c6847cda083dc0b2f186083": {
108
+ "model_module": "@jupyter-widgets/base",
109
+ "model_name": "LayoutModel",
110
+ "model_module_version": "1.2.0",
111
+ "state": {
112
+ "_model_module": "@jupyter-widgets/base",
113
+ "_model_module_version": "1.2.0",
114
+ "_model_name": "LayoutModel",
115
+ "_view_count": null,
116
+ "_view_module": "@jupyter-widgets/base",
117
+ "_view_module_version": "1.2.0",
118
+ "_view_name": "LayoutView",
119
+ "align_content": null,
120
+ "align_items": null,
121
+ "align_self": null,
122
+ "border": null,
123
+ "bottom": null,
124
+ "display": null,
125
+ "flex": null,
126
+ "flex_flow": null,
127
+ "grid_area": null,
128
+ "grid_auto_columns": null,
129
+ "grid_auto_flow": null,
130
+ "grid_auto_rows": null,
131
+ "grid_column": null,
132
+ "grid_gap": null,
133
+ "grid_row": null,
134
+ "grid_template_areas": null,
135
+ "grid_template_columns": null,
136
+ "grid_template_rows": null,
137
+ "height": null,
138
+ "justify_content": null,
139
+ "justify_items": null,
140
+ "left": null,
141
+ "margin": null,
142
+ "max_height": null,
143
+ "max_width": null,
144
+ "min_height": null,
145
+ "min_width": null,
146
+ "object_fit": null,
147
+ "object_position": null,
148
+ "order": null,
149
+ "overflow": null,
150
+ "overflow_x": null,
151
+ "overflow_y": null,
152
+ "padding": null,
153
+ "right": null,
154
+ "top": null,
155
+ "visibility": null,
156
+ "width": null
157
+ }
158
+ },
159
+ "3b59793ca26745ea8b3780e07447a56c": {
160
+ "model_module": "@jupyter-widgets/base",
161
+ "model_name": "LayoutModel",
162
+ "model_module_version": "1.2.0",
163
+ "state": {
164
+ "_model_module": "@jupyter-widgets/base",
165
+ "_model_module_version": "1.2.0",
166
+ "_model_name": "LayoutModel",
167
+ "_view_count": null,
168
+ "_view_module": "@jupyter-widgets/base",
169
+ "_view_module_version": "1.2.0",
170
+ "_view_name": "LayoutView",
171
+ "align_content": null,
172
+ "align_items": null,
173
+ "align_self": null,
174
+ "border": null,
175
+ "bottom": null,
176
+ "display": null,
177
+ "flex": null,
178
+ "flex_flow": null,
179
+ "grid_area": null,
180
+ "grid_auto_columns": null,
181
+ "grid_auto_flow": null,
182
+ "grid_auto_rows": null,
183
+ "grid_column": null,
184
+ "grid_gap": null,
185
+ "grid_row": null,
186
+ "grid_template_areas": null,
187
+ "grid_template_columns": null,
188
+ "grid_template_rows": null,
189
+ "height": null,
190
+ "justify_content": null,
191
+ "justify_items": null,
192
+ "left": null,
193
+ "margin": null,
194
+ "max_height": null,
195
+ "max_width": null,
196
+ "min_height": null,
197
+ "min_width": null,
198
+ "object_fit": null,
199
+ "object_position": null,
200
+ "order": null,
201
+ "overflow": null,
202
+ "overflow_x": null,
203
+ "overflow_y": null,
204
+ "padding": null,
205
+ "right": null,
206
+ "top": null,
207
+ "visibility": null,
208
+ "width": null
209
+ }
210
+ },
211
+ "bf13f411d2db420fb25d13d212ba257a": {
212
+ "model_module": "@jupyter-widgets/controls",
213
+ "model_name": "DescriptionStyleModel",
214
+ "model_module_version": "1.5.0",
215
+ "state": {
216
+ "_model_module": "@jupyter-widgets/controls",
217
+ "_model_module_version": "1.5.0",
218
+ "_model_name": "DescriptionStyleModel",
219
+ "_view_count": null,
220
+ "_view_module": "@jupyter-widgets/base",
221
+ "_view_module_version": "1.2.0",
222
+ "_view_name": "StyleView",
223
+ "description_width": ""
224
+ }
225
+ },
226
+ "075f0860a2ca452c8b1438aeae8e3d18": {
227
+ "model_module": "@jupyter-widgets/base",
228
+ "model_name": "LayoutModel",
229
+ "model_module_version": "1.2.0",
230
+ "state": {
231
+ "_model_module": "@jupyter-widgets/base",
232
+ "_model_module_version": "1.2.0",
233
+ "_model_name": "LayoutModel",
234
+ "_view_count": null,
235
+ "_view_module": "@jupyter-widgets/base",
236
+ "_view_module_version": "1.2.0",
237
+ "_view_name": "LayoutView",
238
+ "align_content": null,
239
+ "align_items": null,
240
+ "align_self": null,
241
+ "border": null,
242
+ "bottom": null,
243
+ "display": null,
244
+ "flex": null,
245
+ "flex_flow": null,
246
+ "grid_area": null,
247
+ "grid_auto_columns": null,
248
+ "grid_auto_flow": null,
249
+ "grid_auto_rows": null,
250
+ "grid_column": null,
251
+ "grid_gap": null,
252
+ "grid_row": null,
253
+ "grid_template_areas": null,
254
+ "grid_template_columns": null,
255
+ "grid_template_rows": null,
256
+ "height": null,
257
+ "justify_content": null,
258
+ "justify_items": null,
259
+ "left": null,
260
+ "margin": null,
261
+ "max_height": null,
262
+ "max_width": null,
263
+ "min_height": null,
264
+ "min_width": null,
265
+ "object_fit": null,
266
+ "object_position": null,
267
+ "order": null,
268
+ "overflow": null,
269
+ "overflow_x": null,
270
+ "overflow_y": null,
271
+ "padding": null,
272
+ "right": null,
273
+ "top": null,
274
+ "visibility": null,
275
+ "width": null
276
+ }
277
+ },
278
+ "555d876e65c9419b82633841914086d9": {
279
+ "model_module": "@jupyter-widgets/controls",
280
+ "model_name": "ProgressStyleModel",
281
+ "model_module_version": "1.5.0",
282
+ "state": {
283
+ "_model_module": "@jupyter-widgets/controls",
284
+ "_model_module_version": "1.5.0",
285
+ "_model_name": "ProgressStyleModel",
286
+ "_view_count": null,
287
+ "_view_module": "@jupyter-widgets/base",
288
+ "_view_module_version": "1.2.0",
289
+ "_view_name": "StyleView",
290
+ "bar_color": null,
291
+ "description_width": ""
292
+ }
293
+ },
294
+ "e07fb7d992e04473ad7bac9248e7aa75": {
295
+ "model_module": "@jupyter-widgets/base",
296
+ "model_name": "LayoutModel",
297
+ "model_module_version": "1.2.0",
298
+ "state": {
299
+ "_model_module": "@jupyter-widgets/base",
300
+ "_model_module_version": "1.2.0",
301
+ "_model_name": "LayoutModel",
302
+ "_view_count": null,
303
+ "_view_module": "@jupyter-widgets/base",
304
+ "_view_module_version": "1.2.0",
305
+ "_view_name": "LayoutView",
306
+ "align_content": null,
307
+ "align_items": null,
308
+ "align_self": null,
309
+ "border": null,
310
+ "bottom": null,
311
+ "display": null,
312
+ "flex": null,
313
+ "flex_flow": null,
314
+ "grid_area": null,
315
+ "grid_auto_columns": null,
316
+ "grid_auto_flow": null,
317
+ "grid_auto_rows": null,
318
+ "grid_column": null,
319
+ "grid_gap": null,
320
+ "grid_row": null,
321
+ "grid_template_areas": null,
322
+ "grid_template_columns": null,
323
+ "grid_template_rows": null,
324
+ "height": null,
325
+ "justify_content": null,
326
+ "justify_items": null,
327
+ "left": null,
328
+ "margin": null,
329
+ "max_height": null,
330
+ "max_width": null,
331
+ "min_height": null,
332
+ "min_width": null,
333
+ "object_fit": null,
334
+ "object_position": null,
335
+ "order": null,
336
+ "overflow": null,
337
+ "overflow_x": null,
338
+ "overflow_y": null,
339
+ "padding": null,
340
+ "right": null,
341
+ "top": null,
342
+ "visibility": null,
343
+ "width": null
344
+ }
345
+ },
346
+ "4cf7dbf2cb6d4581a356970de1996ec6": {
347
+ "model_module": "@jupyter-widgets/controls",
348
+ "model_name": "DescriptionStyleModel",
349
+ "model_module_version": "1.5.0",
350
+ "state": {
351
+ "_model_module": "@jupyter-widgets/controls",
352
+ "_model_module_version": "1.5.0",
353
+ "_model_name": "DescriptionStyleModel",
354
+ "_view_count": null,
355
+ "_view_module": "@jupyter-widgets/base",
356
+ "_view_module_version": "1.2.0",
357
+ "_view_name": "StyleView",
358
+ "description_width": ""
359
+ }
360
+ },
361
+ "f13451f32d11428c97243b2bc5b5268c": {
362
+ "model_module": "@jupyter-widgets/controls",
363
+ "model_name": "HBoxModel",
364
+ "model_module_version": "1.5.0",
365
+ "state": {
366
+ "_dom_classes": [],
367
+ "_model_module": "@jupyter-widgets/controls",
368
+ "_model_module_version": "1.5.0",
369
+ "_model_name": "HBoxModel",
370
+ "_view_count": null,
371
+ "_view_module": "@jupyter-widgets/controls",
372
+ "_view_module_version": "1.5.0",
373
+ "_view_name": "HBoxView",
374
+ "box_style": "",
375
+ "children": [
376
+ "IPY_MODEL_02da0b8c8e564c05a91e5dd3b7e1c9cb",
377
+ "IPY_MODEL_43e70aa073504a4496e543fd923b7b0f",
378
+ "IPY_MODEL_7056a43eec3a4718b160360ce9493409"
379
+ ],
380
+ "layout": "IPY_MODEL_348abc5e3efe4fd69aee2329d91c17da"
381
+ }
382
+ },
383
+ "02da0b8c8e564c05a91e5dd3b7e1c9cb": {
384
+ "model_module": "@jupyter-widgets/controls",
385
+ "model_name": "HTMLModel",
386
+ "model_module_version": "1.5.0",
387
+ "state": {
388
+ "_dom_classes": [],
389
+ "_model_module": "@jupyter-widgets/controls",
390
+ "_model_module_version": "1.5.0",
391
+ "_model_name": "HTMLModel",
392
+ "_view_count": null,
393
+ "_view_module": "@jupyter-widgets/controls",
394
+ "_view_module_version": "1.5.0",
395
+ "_view_name": "HTMLView",
396
+ "description": "",
397
+ "description_tooltip": null,
398
+ "layout": "IPY_MODEL_4ededf6d88554f5495f76521f8be98d5",
399
+ "placeholder": "​",
400
+ "style": "IPY_MODEL_46012658060c41e2a2af258e4025499e",
401
+ "value": "model.safetensors: 100%"
402
+ }
403
+ },
404
+ "43e70aa073504a4496e543fd923b7b0f": {
405
+ "model_module": "@jupyter-widgets/controls",
406
+ "model_name": "FloatProgressModel",
407
+ "model_module_version": "1.5.0",
408
+ "state": {
409
+ "_dom_classes": [],
410
+ "_model_module": "@jupyter-widgets/controls",
411
+ "_model_module_version": "1.5.0",
412
+ "_model_name": "FloatProgressModel",
413
+ "_view_count": null,
414
+ "_view_module": "@jupyter-widgets/controls",
415
+ "_view_module_version": "1.5.0",
416
+ "_view_name": "ProgressView",
417
+ "bar_style": "success",
418
+ "description": "",
419
+ "description_tooltip": null,
420
+ "layout": "IPY_MODEL_7c1de009823e492fb9a1648b393b449c",
421
+ "max": 538090408,
422
+ "min": 0,
423
+ "orientation": "horizontal",
424
+ "style": "IPY_MODEL_c4af106fdac144d3b3010ceffc63b4d7",
425
+ "value": 538090408
426
+ }
427
+ },
428
+ "7056a43eec3a4718b160360ce9493409": {
429
+ "model_module": "@jupyter-widgets/controls",
430
+ "model_name": "HTMLModel",
431
+ "model_module_version": "1.5.0",
432
+ "state": {
433
+ "_dom_classes": [],
434
+ "_model_module": "@jupyter-widgets/controls",
435
+ "_model_module_version": "1.5.0",
436
+ "_model_name": "HTMLModel",
437
+ "_view_count": null,
438
+ "_view_module": "@jupyter-widgets/controls",
439
+ "_view_module_version": "1.5.0",
440
+ "_view_name": "HTMLView",
441
+ "description": "",
442
+ "description_tooltip": null,
443
+ "layout": "IPY_MODEL_b9b5a37d22644cacbc3e81a1682c5184",
444
+ "placeholder": "​",
445
+ "style": "IPY_MODEL_390ace0ab34e4727824de62d724767a5",
446
+ "value": " 538M/538M [00:11<00:00, 52.9MB/s]"
447
+ }
448
+ },
449
+ "348abc5e3efe4fd69aee2329d91c17da": {
450
+ "model_module": "@jupyter-widgets/base",
451
+ "model_name": "LayoutModel",
452
+ "model_module_version": "1.2.0",
453
+ "state": {
454
+ "_model_module": "@jupyter-widgets/base",
455
+ "_model_module_version": "1.2.0",
456
+ "_model_name": "LayoutModel",
457
+ "_view_count": null,
458
+ "_view_module": "@jupyter-widgets/base",
459
+ "_view_module_version": "1.2.0",
460
+ "_view_name": "LayoutView",
461
+ "align_content": null,
462
+ "align_items": null,
463
+ "align_self": null,
464
+ "border": null,
465
+ "bottom": null,
466
+ "display": null,
467
+ "flex": null,
468
+ "flex_flow": null,
469
+ "grid_area": null,
470
+ "grid_auto_columns": null,
471
+ "grid_auto_flow": null,
472
+ "grid_auto_rows": null,
473
+ "grid_column": null,
474
+ "grid_gap": null,
475
+ "grid_row": null,
476
+ "grid_template_areas": null,
477
+ "grid_template_columns": null,
478
+ "grid_template_rows": null,
479
+ "height": null,
480
+ "justify_content": null,
481
+ "justify_items": null,
482
+ "left": null,
483
+ "margin": null,
484
+ "max_height": null,
485
+ "max_width": null,
486
+ "min_height": null,
487
+ "min_width": null,
488
+ "object_fit": null,
489
+ "object_position": null,
490
+ "order": null,
491
+ "overflow": null,
492
+ "overflow_x": null,
493
+ "overflow_y": null,
494
+ "padding": null,
495
+ "right": null,
496
+ "top": null,
497
+ "visibility": null,
498
+ "width": null
499
+ }
500
+ },
501
+ "4ededf6d88554f5495f76521f8be98d5": {
502
+ "model_module": "@jupyter-widgets/base",
503
+ "model_name": "LayoutModel",
504
+ "model_module_version": "1.2.0",
505
+ "state": {
506
+ "_model_module": "@jupyter-widgets/base",
507
+ "_model_module_version": "1.2.0",
508
+ "_model_name": "LayoutModel",
509
+ "_view_count": null,
510
+ "_view_module": "@jupyter-widgets/base",
511
+ "_view_module_version": "1.2.0",
512
+ "_view_name": "LayoutView",
513
+ "align_content": null,
514
+ "align_items": null,
515
+ "align_self": null,
516
+ "border": null,
517
+ "bottom": null,
518
+ "display": null,
519
+ "flex": null,
520
+ "flex_flow": null,
521
+ "grid_area": null,
522
+ "grid_auto_columns": null,
523
+ "grid_auto_flow": null,
524
+ "grid_auto_rows": null,
525
+ "grid_column": null,
526
+ "grid_gap": null,
527
+ "grid_row": null,
528
+ "grid_template_areas": null,
529
+ "grid_template_columns": null,
530
+ "grid_template_rows": null,
531
+ "height": null,
532
+ "justify_content": null,
533
+ "justify_items": null,
534
+ "left": null,
535
+ "margin": null,
536
+ "max_height": null,
537
+ "max_width": null,
538
+ "min_height": null,
539
+ "min_width": null,
540
+ "object_fit": null,
541
+ "object_position": null,
542
+ "order": null,
543
+ "overflow": null,
544
+ "overflow_x": null,
545
+ "overflow_y": null,
546
+ "padding": null,
547
+ "right": null,
548
+ "top": null,
549
+ "visibility": null,
550
+ "width": null
551
+ }
552
+ },
553
+ "46012658060c41e2a2af258e4025499e": {
554
+ "model_module": "@jupyter-widgets/controls",
555
+ "model_name": "DescriptionStyleModel",
556
+ "model_module_version": "1.5.0",
557
+ "state": {
558
+ "_model_module": "@jupyter-widgets/controls",
559
+ "_model_module_version": "1.5.0",
560
+ "_model_name": "DescriptionStyleModel",
561
+ "_view_count": null,
562
+ "_view_module": "@jupyter-widgets/base",
563
+ "_view_module_version": "1.2.0",
564
+ "_view_name": "StyleView",
565
+ "description_width": ""
566
+ }
567
+ },
568
+ "7c1de009823e492fb9a1648b393b449c": {
569
+ "model_module": "@jupyter-widgets/base",
570
+ "model_name": "LayoutModel",
571
+ "model_module_version": "1.2.0",
572
+ "state": {
573
+ "_model_module": "@jupyter-widgets/base",
574
+ "_model_module_version": "1.2.0",
575
+ "_model_name": "LayoutModel",
576
+ "_view_count": null,
577
+ "_view_module": "@jupyter-widgets/base",
578
+ "_view_module_version": "1.2.0",
579
+ "_view_name": "LayoutView",
580
+ "align_content": null,
581
+ "align_items": null,
582
+ "align_self": null,
583
+ "border": null,
584
+ "bottom": null,
585
+ "display": null,
586
+ "flex": null,
587
+ "flex_flow": null,
588
+ "grid_area": null,
589
+ "grid_auto_columns": null,
590
+ "grid_auto_flow": null,
591
+ "grid_auto_rows": null,
592
+ "grid_column": null,
593
+ "grid_gap": null,
594
+ "grid_row": null,
595
+ "grid_template_areas": null,
596
+ "grid_template_columns": null,
597
+ "grid_template_rows": null,
598
+ "height": null,
599
+ "justify_content": null,
600
+ "justify_items": null,
601
+ "left": null,
602
+ "margin": null,
603
+ "max_height": null,
604
+ "max_width": null,
605
+ "min_height": null,
606
+ "min_width": null,
607
+ "object_fit": null,
608
+ "object_position": null,
609
+ "order": null,
610
+ "overflow": null,
611
+ "overflow_x": null,
612
+ "overflow_y": null,
613
+ "padding": null,
614
+ "right": null,
615
+ "top": null,
616
+ "visibility": null,
617
+ "width": null
618
+ }
619
+ },
620
+ "c4af106fdac144d3b3010ceffc63b4d7": {
621
+ "model_module": "@jupyter-widgets/controls",
622
+ "model_name": "ProgressStyleModel",
623
+ "model_module_version": "1.5.0",
624
+ "state": {
625
+ "_model_module": "@jupyter-widgets/controls",
626
+ "_model_module_version": "1.5.0",
627
+ "_model_name": "ProgressStyleModel",
628
+ "_view_count": null,
629
+ "_view_module": "@jupyter-widgets/base",
630
+ "_view_module_version": "1.2.0",
631
+ "_view_name": "StyleView",
632
+ "bar_color": null,
633
+ "description_width": ""
634
+ }
635
+ },
636
+ "b9b5a37d22644cacbc3e81a1682c5184": {
637
+ "model_module": "@jupyter-widgets/base",
638
+ "model_name": "LayoutModel",
639
+ "model_module_version": "1.2.0",
640
+ "state": {
641
+ "_model_module": "@jupyter-widgets/base",
642
+ "_model_module_version": "1.2.0",
643
+ "_model_name": "LayoutModel",
644
+ "_view_count": null,
645
+ "_view_module": "@jupyter-widgets/base",
646
+ "_view_module_version": "1.2.0",
647
+ "_view_name": "LayoutView",
648
+ "align_content": null,
649
+ "align_items": null,
650
+ "align_self": null,
651
+ "border": null,
652
+ "bottom": null,
653
+ "display": null,
654
+ "flex": null,
655
+ "flex_flow": null,
656
+ "grid_area": null,
657
+ "grid_auto_columns": null,
658
+ "grid_auto_flow": null,
659
+ "grid_auto_rows": null,
660
+ "grid_column": null,
661
+ "grid_gap": null,
662
+ "grid_row": null,
663
+ "grid_template_areas": null,
664
+ "grid_template_columns": null,
665
+ "grid_template_rows": null,
666
+ "height": null,
667
+ "justify_content": null,
668
+ "justify_items": null,
669
+ "left": null,
670
+ "margin": null,
671
+ "max_height": null,
672
+ "max_width": null,
673
+ "min_height": null,
674
+ "min_width": null,
675
+ "object_fit": null,
676
+ "object_position": null,
677
+ "order": null,
678
+ "overflow": null,
679
+ "overflow_x": null,
680
+ "overflow_y": null,
681
+ "padding": null,
682
+ "right": null,
683
+ "top": null,
684
+ "visibility": null,
685
+ "width": null
686
+ }
687
+ },
688
+ "390ace0ab34e4727824de62d724767a5": {
689
+ "model_module": "@jupyter-widgets/controls",
690
+ "model_name": "DescriptionStyleModel",
691
+ "model_module_version": "1.5.0",
692
+ "state": {
693
+ "_model_module": "@jupyter-widgets/controls",
694
+ "_model_module_version": "1.5.0",
695
+ "_model_name": "DescriptionStyleModel",
696
+ "_view_count": null,
697
+ "_view_module": "@jupyter-widgets/base",
698
+ "_view_module_version": "1.2.0",
699
+ "_view_name": "StyleView",
700
+ "description_width": ""
701
+ }
702
+ },
703
+ "d2eaa596146742e4937f7b1ec7320dbb": {
704
+ "model_module": "@jupyter-widgets/controls",
705
+ "model_name": "HBoxModel",
706
+ "model_module_version": "1.5.0",
707
+ "state": {
708
+ "_dom_classes": [],
709
+ "_model_module": "@jupyter-widgets/controls",
710
+ "_model_module_version": "1.5.0",
711
+ "_model_name": "HBoxModel",
712
+ "_view_count": null,
713
+ "_view_module": "@jupyter-widgets/controls",
714
+ "_view_module_version": "1.5.0",
715
+ "_view_name": "HBoxView",
716
+ "box_style": "",
717
+ "children": [
718
+ "IPY_MODEL_34f9fa16d57247829d00de248eb0e1ae",
719
+ "IPY_MODEL_d40e8b6c471e4f4da2f7521955155955",
720
+ "IPY_MODEL_82fd1f3b07f24c4c8a89454477ab6b97"
721
+ ],
722
+ "layout": "IPY_MODEL_b5b509b3f7bc4ad4b994c68317a07228"
723
+ }
724
+ },
725
+ "34f9fa16d57247829d00de248eb0e1ae": {
726
+ "model_module": "@jupyter-widgets/controls",
727
+ "model_name": "HTMLModel",
728
+ "model_module_version": "1.5.0",
729
+ "state": {
730
+ "_dom_classes": [],
731
+ "_model_module": "@jupyter-widgets/controls",
732
+ "_model_module_version": "1.5.0",
733
+ "_model_name": "HTMLModel",
734
+ "_view_count": null,
735
+ "_view_module": "@jupyter-widgets/controls",
736
+ "_view_module_version": "1.5.0",
737
+ "_view_name": "HTMLView",
738
+ "description": "",
739
+ "description_tooltip": null,
740
+ "layout": "IPY_MODEL_e27f78d5d126483cb7be7c9e69913954",
741
+ "placeholder": "​",
742
+ "style": "IPY_MODEL_9b2a21eb504d49df8cbfd2d6bff0ddbf",
743
+ "value": "Upload 2 LFS files: 100%"
744
+ }
745
+ },
746
+ "d40e8b6c471e4f4da2f7521955155955": {
747
+ "model_module": "@jupyter-widgets/controls",
748
+ "model_name": "FloatProgressModel",
749
+ "model_module_version": "1.5.0",
750
+ "state": {
751
+ "_dom_classes": [],
752
+ "_model_module": "@jupyter-widgets/controls",
753
+ "_model_module_version": "1.5.0",
754
+ "_model_name": "FloatProgressModel",
755
+ "_view_count": null,
756
+ "_view_module": "@jupyter-widgets/controls",
757
+ "_view_module_version": "1.5.0",
758
+ "_view_name": "ProgressView",
759
+ "bar_style": "success",
760
+ "description": "",
761
+ "description_tooltip": null,
762
+ "layout": "IPY_MODEL_43985ed44fc14c0dad5cf7cff9a47d40",
763
+ "max": 2,
764
+ "min": 0,
765
+ "orientation": "horizontal",
766
+ "style": "IPY_MODEL_59115af19cbe4420b6b1c12bf406f6ed",
767
+ "value": 2
768
+ }
769
+ },
770
+ "82fd1f3b07f24c4c8a89454477ab6b97": {
771
+ "model_module": "@jupyter-widgets/controls",
772
+ "model_name": "HTMLModel",
773
+ "model_module_version": "1.5.0",
774
+ "state": {
775
+ "_dom_classes": [],
776
+ "_model_module": "@jupyter-widgets/controls",
777
+ "_model_module_version": "1.5.0",
778
+ "_model_name": "HTMLModel",
779
+ "_view_count": null,
780
+ "_view_module": "@jupyter-widgets/controls",
781
+ "_view_module_version": "1.5.0",
782
+ "_view_name": "HTMLView",
783
+ "description": "",
784
+ "description_tooltip": null,
785
+ "layout": "IPY_MODEL_e5b320112ff346529e6a936eab29d95d",
786
+ "placeholder": "​",
787
+ "style": "IPY_MODEL_4a78c564e91c4f048778f65e4905b22a",
788
+ "value": " 2/2 [00:11<00:00, 11.77s/it]"
789
+ }
790
+ },
791
+ "b5b509b3f7bc4ad4b994c68317a07228": {
792
+ "model_module": "@jupyter-widgets/base",
793
+ "model_name": "LayoutModel",
794
+ "model_module_version": "1.2.0",
795
+ "state": {
796
+ "_model_module": "@jupyter-widgets/base",
797
+ "_model_module_version": "1.2.0",
798
+ "_model_name": "LayoutModel",
799
+ "_view_count": null,
800
+ "_view_module": "@jupyter-widgets/base",
801
+ "_view_module_version": "1.2.0",
802
+ "_view_name": "LayoutView",
803
+ "align_content": null,
804
+ "align_items": null,
805
+ "align_self": null,
806
+ "border": null,
807
+ "bottom": null,
808
+ "display": null,
809
+ "flex": null,
810
+ "flex_flow": null,
811
+ "grid_area": null,
812
+ "grid_auto_columns": null,
813
+ "grid_auto_flow": null,
814
+ "grid_auto_rows": null,
815
+ "grid_column": null,
816
+ "grid_gap": null,
817
+ "grid_row": null,
818
+ "grid_template_areas": null,
819
+ "grid_template_columns": null,
820
+ "grid_template_rows": null,
821
+ "height": null,
822
+ "justify_content": null,
823
+ "justify_items": null,
824
+ "left": null,
825
+ "margin": null,
826
+ "max_height": null,
827
+ "max_width": null,
828
+ "min_height": null,
829
+ "min_width": null,
830
+ "object_fit": null,
831
+ "object_position": null,
832
+ "order": null,
833
+ "overflow": null,
834
+ "overflow_x": null,
835
+ "overflow_y": null,
836
+ "padding": null,
837
+ "right": null,
838
+ "top": null,
839
+ "visibility": null,
840
+ "width": null
841
+ }
842
+ },
843
+ "e27f78d5d126483cb7be7c9e69913954": {
844
+ "model_module": "@jupyter-widgets/base",
845
+ "model_name": "LayoutModel",
846
+ "model_module_version": "1.2.0",
847
+ "state": {
848
+ "_model_module": "@jupyter-widgets/base",
849
+ "_model_module_version": "1.2.0",
850
+ "_model_name": "LayoutModel",
851
+ "_view_count": null,
852
+ "_view_module": "@jupyter-widgets/base",
853
+ "_view_module_version": "1.2.0",
854
+ "_view_name": "LayoutView",
855
+ "align_content": null,
856
+ "align_items": null,
857
+ "align_self": null,
858
+ "border": null,
859
+ "bottom": null,
860
+ "display": null,
861
+ "flex": null,
862
+ "flex_flow": null,
863
+ "grid_area": null,
864
+ "grid_auto_columns": null,
865
+ "grid_auto_flow": null,
866
+ "grid_auto_rows": null,
867
+ "grid_column": null,
868
+ "grid_gap": null,
869
+ "grid_row": null,
870
+ "grid_template_areas": null,
871
+ "grid_template_columns": null,
872
+ "grid_template_rows": null,
873
+ "height": null,
874
+ "justify_content": null,
875
+ "justify_items": null,
876
+ "left": null,
877
+ "margin": null,
878
+ "max_height": null,
879
+ "max_width": null,
880
+ "min_height": null,
881
+ "min_width": null,
882
+ "object_fit": null,
883
+ "object_position": null,
884
+ "order": null,
885
+ "overflow": null,
886
+ "overflow_x": null,
887
+ "overflow_y": null,
888
+ "padding": null,
889
+ "right": null,
890
+ "top": null,
891
+ "visibility": null,
892
+ "width": null
893
+ }
894
+ },
895
+ "9b2a21eb504d49df8cbfd2d6bff0ddbf": {
896
+ "model_module": "@jupyter-widgets/controls",
897
+ "model_name": "DescriptionStyleModel",
898
+ "model_module_version": "1.5.0",
899
+ "state": {
900
+ "_model_module": "@jupyter-widgets/controls",
901
+ "_model_module_version": "1.5.0",
902
+ "_model_name": "DescriptionStyleModel",
903
+ "_view_count": null,
904
+ "_view_module": "@jupyter-widgets/base",
905
+ "_view_module_version": "1.2.0",
906
+ "_view_name": "StyleView",
907
+ "description_width": ""
908
+ }
909
+ },
910
+ "43985ed44fc14c0dad5cf7cff9a47d40": {
911
+ "model_module": "@jupyter-widgets/base",
912
+ "model_name": "LayoutModel",
913
+ "model_module_version": "1.2.0",
914
+ "state": {
915
+ "_model_module": "@jupyter-widgets/base",
916
+ "_model_module_version": "1.2.0",
917
+ "_model_name": "LayoutModel",
918
+ "_view_count": null,
919
+ "_view_module": "@jupyter-widgets/base",
920
+ "_view_module_version": "1.2.0",
921
+ "_view_name": "LayoutView",
922
+ "align_content": null,
923
+ "align_items": null,
924
+ "align_self": null,
925
+ "border": null,
926
+ "bottom": null,
927
+ "display": null,
928
+ "flex": null,
929
+ "flex_flow": null,
930
+ "grid_area": null,
931
+ "grid_auto_columns": null,
932
+ "grid_auto_flow": null,
933
+ "grid_auto_rows": null,
934
+ "grid_column": null,
935
+ "grid_gap": null,
936
+ "grid_row": null,
937
+ "grid_template_areas": null,
938
+ "grid_template_columns": null,
939
+ "grid_template_rows": null,
940
+ "height": null,
941
+ "justify_content": null,
942
+ "justify_items": null,
943
+ "left": null,
944
+ "margin": null,
945
+ "max_height": null,
946
+ "max_width": null,
947
+ "min_height": null,
948
+ "min_width": null,
949
+ "object_fit": null,
950
+ "object_position": null,
951
+ "order": null,
952
+ "overflow": null,
953
+ "overflow_x": null,
954
+ "overflow_y": null,
955
+ "padding": null,
956
+ "right": null,
957
+ "top": null,
958
+ "visibility": null,
959
+ "width": null
960
+ }
961
+ },
962
+ "59115af19cbe4420b6b1c12bf406f6ed": {
963
+ "model_module": "@jupyter-widgets/controls",
964
+ "model_name": "ProgressStyleModel",
965
+ "model_module_version": "1.5.0",
966
+ "state": {
967
+ "_model_module": "@jupyter-widgets/controls",
968
+ "_model_module_version": "1.5.0",
969
+ "_model_name": "ProgressStyleModel",
970
+ "_view_count": null,
971
+ "_view_module": "@jupyter-widgets/base",
972
+ "_view_module_version": "1.2.0",
973
+ "_view_name": "StyleView",
974
+ "bar_color": null,
975
+ "description_width": ""
976
+ }
977
+ },
978
+ "e5b320112ff346529e6a936eab29d95d": {
979
+ "model_module": "@jupyter-widgets/base",
980
+ "model_name": "LayoutModel",
981
+ "model_module_version": "1.2.0",
982
+ "state": {
983
+ "_model_module": "@jupyter-widgets/base",
984
+ "_model_module_version": "1.2.0",
985
+ "_model_name": "LayoutModel",
986
+ "_view_count": null,
987
+ "_view_module": "@jupyter-widgets/base",
988
+ "_view_module_version": "1.2.0",
989
+ "_view_name": "LayoutView",
990
+ "align_content": null,
991
+ "align_items": null,
992
+ "align_self": null,
993
+ "border": null,
994
+ "bottom": null,
995
+ "display": null,
996
+ "flex": null,
997
+ "flex_flow": null,
998
+ "grid_area": null,
999
+ "grid_auto_columns": null,
1000
+ "grid_auto_flow": null,
1001
+ "grid_auto_rows": null,
1002
+ "grid_column": null,
1003
+ "grid_gap": null,
1004
+ "grid_row": null,
1005
+ "grid_template_areas": null,
1006
+ "grid_template_columns": null,
1007
+ "grid_template_rows": null,
1008
+ "height": null,
1009
+ "justify_content": null,
1010
+ "justify_items": null,
1011
+ "left": null,
1012
+ "margin": null,
1013
+ "max_height": null,
1014
+ "max_width": null,
1015
+ "min_height": null,
1016
+ "min_width": null,
1017
+ "object_fit": null,
1018
+ "object_position": null,
1019
+ "order": null,
1020
+ "overflow": null,
1021
+ "overflow_x": null,
1022
+ "overflow_y": null,
1023
+ "padding": null,
1024
+ "right": null,
1025
+ "top": null,
1026
+ "visibility": null,
1027
+ "width": null
1028
+ }
1029
+ },
1030
+ "4a78c564e91c4f048778f65e4905b22a": {
1031
+ "model_module": "@jupyter-widgets/controls",
1032
+ "model_name": "DescriptionStyleModel",
1033
+ "model_module_version": "1.5.0",
1034
+ "state": {
1035
+ "_model_module": "@jupyter-widgets/controls",
1036
+ "_model_module_version": "1.5.0",
1037
+ "_model_name": "DescriptionStyleModel",
1038
+ "_view_count": null,
1039
+ "_view_module": "@jupyter-widgets/base",
1040
+ "_view_module_version": "1.2.0",
1041
+ "_view_name": "StyleView",
1042
+ "description_width": ""
1043
+ }
1044
+ },
1045
+ "86b0050f0417457784489f96c1313b4a": {
1046
+ "model_module": "@jupyter-widgets/controls",
1047
+ "model_name": "HBoxModel",
1048
+ "model_module_version": "1.5.0",
1049
+ "state": {
1050
+ "_dom_classes": [],
1051
+ "_model_module": "@jupyter-widgets/controls",
1052
+ "_model_module_version": "1.5.0",
1053
+ "_model_name": "HBoxModel",
1054
+ "_view_count": null,
1055
+ "_view_module": "@jupyter-widgets/controls",
1056
+ "_view_module_version": "1.5.0",
1057
+ "_view_name": "HBoxView",
1058
+ "box_style": "",
1059
+ "children": [
1060
+ "IPY_MODEL_9196c063ec4741fcbc40b3c77eadc81e",
1061
+ "IPY_MODEL_b10f037ce64143cbbf713148a46c2a0d",
1062
+ "IPY_MODEL_338fcb1a304341759ce1d53093b48c8e"
1063
+ ],
1064
+ "layout": "IPY_MODEL_461f8d72aa934cb499b8c09031682a6d"
1065
+ }
1066
+ },
1067
+ "9196c063ec4741fcbc40b3c77eadc81e": {
1068
+ "model_module": "@jupyter-widgets/controls",
1069
+ "model_name": "HTMLModel",
1070
+ "model_module_version": "1.5.0",
1071
+ "state": {
1072
+ "_dom_classes": [],
1073
+ "_model_module": "@jupyter-widgets/controls",
1074
+ "_model_module_version": "1.5.0",
1075
+ "_model_name": "HTMLModel",
1076
+ "_view_count": null,
1077
+ "_view_module": "@jupyter-widgets/controls",
1078
+ "_view_module_version": "1.5.0",
1079
+ "_view_name": "HTMLView",
1080
+ "description": "",
1081
+ "description_tooltip": null,
1082
+ "layout": "IPY_MODEL_82e726ec9ba24dd79c3b02388634d08d",
1083
+ "placeholder": "​",
1084
+ "style": "IPY_MODEL_9a4d24decc8e4bf0a066bbd5a968c3b2",
1085
+ "value": "training_args.bin: 100%"
1086
+ }
1087
+ },
1088
+ "b10f037ce64143cbbf713148a46c2a0d": {
1089
+ "model_module": "@jupyter-widgets/controls",
1090
+ "model_name": "FloatProgressModel",
1091
+ "model_module_version": "1.5.0",
1092
+ "state": {
1093
+ "_dom_classes": [],
1094
+ "_model_module": "@jupyter-widgets/controls",
1095
+ "_model_module_version": "1.5.0",
1096
+ "_model_name": "FloatProgressModel",
1097
+ "_view_count": null,
1098
+ "_view_module": "@jupyter-widgets/controls",
1099
+ "_view_module_version": "1.5.0",
1100
+ "_view_name": "ProgressView",
1101
+ "bar_style": "success",
1102
+ "description": "",
1103
+ "description_tooltip": null,
1104
+ "layout": "IPY_MODEL_9c08523e49844618953ab8dd28b9f6fd",
1105
+ "max": 5624,
1106
+ "min": 0,
1107
+ "orientation": "horizontal",
1108
+ "style": "IPY_MODEL_d578d568fd4f4ac6aba0b5de3d1a2c1d",
1109
+ "value": 5624
1110
+ }
1111
+ },
1112
+ "338fcb1a304341759ce1d53093b48c8e": {
1113
+ "model_module": "@jupyter-widgets/controls",
1114
+ "model_name": "HTMLModel",
1115
+ "model_module_version": "1.5.0",
1116
+ "state": {
1117
+ "_dom_classes": [],
1118
+ "_model_module": "@jupyter-widgets/controls",
1119
+ "_model_module_version": "1.5.0",
1120
+ "_model_name": "HTMLModel",
1121
+ "_view_count": null,
1122
+ "_view_module": "@jupyter-widgets/controls",
1123
+ "_view_module_version": "1.5.0",
1124
+ "_view_name": "HTMLView",
1125
+ "description": "",
1126
+ "description_tooltip": null,
1127
+ "layout": "IPY_MODEL_60f3ec480a6549908acc753e67b9c3f4",
1128
+ "placeholder": "​",
1129
+ "style": "IPY_MODEL_67cd4b245e02485b82f422c187cec1ae",
1130
+ "value": " 5.62k/5.62k [00:00<00:00, 29.0kB/s]"
1131
+ }
1132
+ },
1133
+ "461f8d72aa934cb499b8c09031682a6d": {
1134
+ "model_module": "@jupyter-widgets/base",
1135
+ "model_name": "LayoutModel",
1136
+ "model_module_version": "1.2.0",
1137
+ "state": {
1138
+ "_model_module": "@jupyter-widgets/base",
1139
+ "_model_module_version": "1.2.0",
1140
+ "_model_name": "LayoutModel",
1141
+ "_view_count": null,
1142
+ "_view_module": "@jupyter-widgets/base",
1143
+ "_view_module_version": "1.2.0",
1144
+ "_view_name": "LayoutView",
1145
+ "align_content": null,
1146
+ "align_items": null,
1147
+ "align_self": null,
1148
+ "border": null,
1149
+ "bottom": null,
1150
+ "display": null,
1151
+ "flex": null,
1152
+ "flex_flow": null,
1153
+ "grid_area": null,
1154
+ "grid_auto_columns": null,
1155
+ "grid_auto_flow": null,
1156
+ "grid_auto_rows": null,
1157
+ "grid_column": null,
1158
+ "grid_gap": null,
1159
+ "grid_row": null,
1160
+ "grid_template_areas": null,
1161
+ "grid_template_columns": null,
1162
+ "grid_template_rows": null,
1163
+ "height": null,
1164
+ "justify_content": null,
1165
+ "justify_items": null,
1166
+ "left": null,
1167
+ "margin": null,
1168
+ "max_height": null,
1169
+ "max_width": null,
1170
+ "min_height": null,
1171
+ "min_width": null,
1172
+ "object_fit": null,
1173
+ "object_position": null,
1174
+ "order": null,
1175
+ "overflow": null,
1176
+ "overflow_x": null,
1177
+ "overflow_y": null,
1178
+ "padding": null,
1179
+ "right": null,
1180
+ "top": null,
1181
+ "visibility": null,
1182
+ "width": null
1183
+ }
1184
+ },
1185
+ "82e726ec9ba24dd79c3b02388634d08d": {
1186
+ "model_module": "@jupyter-widgets/base",
1187
+ "model_name": "LayoutModel",
1188
+ "model_module_version": "1.2.0",
1189
+ "state": {
1190
+ "_model_module": "@jupyter-widgets/base",
1191
+ "_model_module_version": "1.2.0",
1192
+ "_model_name": "LayoutModel",
1193
+ "_view_count": null,
1194
+ "_view_module": "@jupyter-widgets/base",
1195
+ "_view_module_version": "1.2.0",
1196
+ "_view_name": "LayoutView",
1197
+ "align_content": null,
1198
+ "align_items": null,
1199
+ "align_self": null,
1200
+ "border": null,
1201
+ "bottom": null,
1202
+ "display": null,
1203
+ "flex": null,
1204
+ "flex_flow": null,
1205
+ "grid_area": null,
1206
+ "grid_auto_columns": null,
1207
+ "grid_auto_flow": null,
1208
+ "grid_auto_rows": null,
1209
+ "grid_column": null,
1210
+ "grid_gap": null,
1211
+ "grid_row": null,
1212
+ "grid_template_areas": null,
1213
+ "grid_template_columns": null,
1214
+ "grid_template_rows": null,
1215
+ "height": null,
1216
+ "justify_content": null,
1217
+ "justify_items": null,
1218
+ "left": null,
1219
+ "margin": null,
1220
+ "max_height": null,
1221
+ "max_width": null,
1222
+ "min_height": null,
1223
+ "min_width": null,
1224
+ "object_fit": null,
1225
+ "object_position": null,
1226
+ "order": null,
1227
+ "overflow": null,
1228
+ "overflow_x": null,
1229
+ "overflow_y": null,
1230
+ "padding": null,
1231
+ "right": null,
1232
+ "top": null,
1233
+ "visibility": null,
1234
+ "width": null
1235
+ }
1236
+ },
1237
+ "9a4d24decc8e4bf0a066bbd5a968c3b2": {
1238
+ "model_module": "@jupyter-widgets/controls",
1239
+ "model_name": "DescriptionStyleModel",
1240
+ "model_module_version": "1.5.0",
1241
+ "state": {
1242
+ "_model_module": "@jupyter-widgets/controls",
1243
+ "_model_module_version": "1.5.0",
1244
+ "_model_name": "DescriptionStyleModel",
1245
+ "_view_count": null,
1246
+ "_view_module": "@jupyter-widgets/base",
1247
+ "_view_module_version": "1.2.0",
1248
+ "_view_name": "StyleView",
1249
+ "description_width": ""
1250
+ }
1251
+ },
1252
+ "9c08523e49844618953ab8dd28b9f6fd": {
1253
+ "model_module": "@jupyter-widgets/base",
1254
+ "model_name": "LayoutModel",
1255
+ "model_module_version": "1.2.0",
1256
+ "state": {
1257
+ "_model_module": "@jupyter-widgets/base",
1258
+ "_model_module_version": "1.2.0",
1259
+ "_model_name": "LayoutModel",
1260
+ "_view_count": null,
1261
+ "_view_module": "@jupyter-widgets/base",
1262
+ "_view_module_version": "1.2.0",
1263
+ "_view_name": "LayoutView",
1264
+ "align_content": null,
1265
+ "align_items": null,
1266
+ "align_self": null,
1267
+ "border": null,
1268
+ "bottom": null,
1269
+ "display": null,
1270
+ "flex": null,
1271
+ "flex_flow": null,
1272
+ "grid_area": null,
1273
+ "grid_auto_columns": null,
1274
+ "grid_auto_flow": null,
1275
+ "grid_auto_rows": null,
1276
+ "grid_column": null,
1277
+ "grid_gap": null,
1278
+ "grid_row": null,
1279
+ "grid_template_areas": null,
1280
+ "grid_template_columns": null,
1281
+ "grid_template_rows": null,
1282
+ "height": null,
1283
+ "justify_content": null,
1284
+ "justify_items": null,
1285
+ "left": null,
1286
+ "margin": null,
1287
+ "max_height": null,
1288
+ "max_width": null,
1289
+ "min_height": null,
1290
+ "min_width": null,
1291
+ "object_fit": null,
1292
+ "object_position": null,
1293
+ "order": null,
1294
+ "overflow": null,
1295
+ "overflow_x": null,
1296
+ "overflow_y": null,
1297
+ "padding": null,
1298
+ "right": null,
1299
+ "top": null,
1300
+ "visibility": null,
1301
+ "width": null
1302
+ }
1303
+ },
1304
+ "d578d568fd4f4ac6aba0b5de3d1a2c1d": {
1305
+ "model_module": "@jupyter-widgets/controls",
1306
+ "model_name": "ProgressStyleModel",
1307
+ "model_module_version": "1.5.0",
1308
+ "state": {
1309
+ "_model_module": "@jupyter-widgets/controls",
1310
+ "_model_module_version": "1.5.0",
1311
+ "_model_name": "ProgressStyleModel",
1312
+ "_view_count": null,
1313
+ "_view_module": "@jupyter-widgets/base",
1314
+ "_view_module_version": "1.2.0",
1315
+ "_view_name": "StyleView",
1316
+ "bar_color": null,
1317
+ "description_width": ""
1318
+ }
1319
+ },
1320
+ "60f3ec480a6549908acc753e67b9c3f4": {
1321
+ "model_module": "@jupyter-widgets/base",
1322
+ "model_name": "LayoutModel",
1323
+ "model_module_version": "1.2.0",
1324
+ "state": {
1325
+ "_model_module": "@jupyter-widgets/base",
1326
+ "_model_module_version": "1.2.0",
1327
+ "_model_name": "LayoutModel",
1328
+ "_view_count": null,
1329
+ "_view_module": "@jupyter-widgets/base",
1330
+ "_view_module_version": "1.2.0",
1331
+ "_view_name": "LayoutView",
1332
+ "align_content": null,
1333
+ "align_items": null,
1334
+ "align_self": null,
1335
+ "border": null,
1336
+ "bottom": null,
1337
+ "display": null,
1338
+ "flex": null,
1339
+ "flex_flow": null,
1340
+ "grid_area": null,
1341
+ "grid_auto_columns": null,
1342
+ "grid_auto_flow": null,
1343
+ "grid_auto_rows": null,
1344
+ "grid_column": null,
1345
+ "grid_gap": null,
1346
+ "grid_row": null,
1347
+ "grid_template_areas": null,
1348
+ "grid_template_columns": null,
1349
+ "grid_template_rows": null,
1350
+ "height": null,
1351
+ "justify_content": null,
1352
+ "justify_items": null,
1353
+ "left": null,
1354
+ "margin": null,
1355
+ "max_height": null,
1356
+ "max_width": null,
1357
+ "min_height": null,
1358
+ "min_width": null,
1359
+ "object_fit": null,
1360
+ "object_position": null,
1361
+ "order": null,
1362
+ "overflow": null,
1363
+ "overflow_x": null,
1364
+ "overflow_y": null,
1365
+ "padding": null,
1366
+ "right": null,
1367
+ "top": null,
1368
+ "visibility": null,
1369
+ "width": null
1370
+ }
1371
+ },
1372
+ "67cd4b245e02485b82f422c187cec1ae": {
1373
+ "model_module": "@jupyter-widgets/controls",
1374
+ "model_name": "DescriptionStyleModel",
1375
+ "model_module_version": "1.5.0",
1376
+ "state": {
1377
+ "_model_module": "@jupyter-widgets/controls",
1378
+ "_model_module_version": "1.5.0",
1379
+ "_model_name": "DescriptionStyleModel",
1380
+ "_view_count": null,
1381
+ "_view_module": "@jupyter-widgets/base",
1382
+ "_view_module_version": "1.2.0",
1383
+ "_view_name": "StyleView",
1384
+ "description_width": ""
1385
+ }
1386
+ }
1387
+ }
1388
+ }
1389
+ },
1390
+ "cells": [
1391
+ {
1392
+ "cell_type": "code",
1393
+ "execution_count": 1,
1394
+ "metadata": {
1395
+ "colab": {
1396
+ "base_uri": "https://localhost:8080/"
1397
+ },
1398
+ "id": "3W3Z2pWzCxpq",
1399
+ "outputId": "e8a3235d-3433-4882-ad07-ef438ee4b704"
1400
+ },
1401
+ "outputs": [
1402
+ {
1403
+ "output_type": "stream",
1404
+ "name": "stdout",
1405
+ "text": [
1406
+ "Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.47.1)\n",
1407
+ "Collecting datasets\n",
1408
+ " Downloading datasets-3.2.0-py3-none-any.whl.metadata (20 kB)\n",
1409
+ "Collecting trl\n",
1410
+ " Downloading trl-0.13.0-py3-none-any.whl.metadata (11 kB)\n",
1411
+ "Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.10/dist-packages (0.27.1)\n",
1412
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.16.1)\n",
1413
+ "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.26.4)\n",
1414
+ "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (24.2)\n",
1415
+ "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.2)\n",
1416
+ "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2024.11.6)\n",
1417
+ "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.32.3)\n",
1418
+ "Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.21.0)\n",
1419
+ "Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.5.1)\n",
1420
+ "Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.67.1)\n",
1421
+ "Requirement already satisfied: pyarrow>=15.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (17.0.0)\n",
1422
+ "Collecting dill<0.3.9,>=0.3.0 (from datasets)\n",
1423
+ " Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)\n",
1424
+ "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets) (2.2.2)\n",
1425
+ "Collecting xxhash (from datasets)\n",
1426
+ " Downloading xxhash-3.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)\n",
1427
+ "Collecting multiprocess<0.70.17 (from datasets)\n",
1428
+ " Downloading multiprocess-0.70.16-py310-none-any.whl.metadata (7.2 kB)\n",
1429
+ "Collecting fsspec<=2024.9.0,>=2023.1.0 (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets)\n",
1430
+ " Downloading fsspec-2024.9.0-py3-none-any.whl.metadata (11 kB)\n",
1431
+ "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets) (3.11.11)\n",
1432
+ "Requirement already satisfied: accelerate>=0.34.0 in /usr/local/lib/python3.10/dist-packages (from trl) (1.2.1)\n",
1433
+ "Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from trl) (13.9.4)\n",
1434
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub) (4.12.2)\n",
1435
+ "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.34.0->trl) (5.9.5)\n",
1436
+ "Requirement already satisfied: torch>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.34.0->trl) (2.5.1+cu121)\n",
1437
+ "Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (2.4.4)\n",
1438
+ "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.3.2)\n",
1439
+ "Requirement already satisfied: async-timeout<6.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (4.0.3)\n",
1440
+ "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (24.3.0)\n",
1441
+ "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.5.0)\n",
1442
+ "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (6.1.0)\n",
1443
+ "Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (0.2.1)\n",
1444
+ "Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.18.3)\n",
1445
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4.1)\n",
1446
+ "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.10)\n",
1447
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.3.0)\n",
1448
+ "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2024.12.14)\n",
1449
+ "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2.8.2)\n",
1450
+ "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2024.2)\n",
1451
+ "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2024.2)\n",
1452
+ "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl) (3.0.0)\n",
1453
+ "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl) (2.18.0)\n",
1454
+ "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->trl) (0.1.2)\n",
1455
+ "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->datasets) (1.17.0)\n",
1456
+ "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (3.4.2)\n",
1457
+ "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (3.1.5)\n",
1458
+ "Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (1.13.1)\n",
1459
+ "Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy==1.13.1->torch>=1.10.0->accelerate>=0.34.0->trl) (1.3.0)\n",
1460
+ "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.10.0->accelerate>=0.34.0->trl) (3.0.2)\n",
1461
+ "Downloading datasets-3.2.0-py3-none-any.whl (480 kB)\n",
1462
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m480.6/480.6 kB\u001b[0m \u001b[31m15.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1463
+ "\u001b[?25hDownloading trl-0.13.0-py3-none-any.whl (293 kB)\n",
1464
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m293.4/293.4 kB\u001b[0m \u001b[31m25.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1465
+ "\u001b[?25hDownloading dill-0.3.8-py3-none-any.whl (116 kB)\n",
1466
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m116.3/116.3 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1467
+ "\u001b[?25hDownloading fsspec-2024.9.0-py3-none-any.whl (179 kB)\n",
1468
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m179.3/179.3 kB\u001b[0m \u001b[31m17.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1469
+ "\u001b[?25hDownloading multiprocess-0.70.16-py310-none-any.whl (134 kB)\n",
1470
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m134.8/134.8 kB\u001b[0m \u001b[31m11.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1471
+ "\u001b[?25hDownloading xxhash-3.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)\n",
1472
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m194.1/194.1 kB\u001b[0m \u001b[31m18.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
1473
+ "\u001b[?25hInstalling collected packages: xxhash, fsspec, dill, multiprocess, datasets, trl\n",
1474
+ " Attempting uninstall: fsspec\n",
1475
+ " Found existing installation: fsspec 2024.10.0\n",
1476
+ " Uninstalling fsspec-2024.10.0:\n",
1477
+ " Successfully uninstalled fsspec-2024.10.0\n",
1478
+ "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
1479
+ "gcsfs 2024.10.0 requires fsspec==2024.10.0, but you have fsspec 2024.9.0 which is incompatible.\u001b[0m\u001b[31m\n",
1480
+ "\u001b[0mSuccessfully installed datasets-3.2.0 dill-0.3.8 fsspec-2024.9.0 multiprocess-0.70.16 trl-0.13.0 xxhash-3.5.0\n"
1481
+ ]
1482
+ }
1483
+ ],
1484
+ "source": [
1485
+ "# Install the requirements in Google Colab\n",
1486
+ "!pip install transformers datasets trl huggingface_hub"
1487
+ ]
1488
+ },
1489
+ {
1490
+ "cell_type": "code",
1491
+ "source": [
1492
+ "# Import necessary libraries\n",
1493
+ "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
1494
+ "from datasets import load_dataset\n",
1495
+ "from trl import SFTConfig, SFTTrainer, setup_chat_format, DataCollatorForCompletionOnlyLM\n",
1496
+ "import torch"
1497
+ ],
1498
+ "metadata": {
1499
+ "id": "7iQRJ-YHDCQu"
1500
+ },
1501
+ "execution_count": 2,
1502
+ "outputs": []
1503
+ },
1504
+ {
1505
+ "cell_type": "code",
1506
+ "source": [
1507
+ "device = (\n",
1508
+ " \"cuda\"\n",
1509
+ " if torch.cuda.is_available()\n",
1510
+ " else \"mps\" if torch.backends.mps.is_available() else \"cpu\"\n",
1511
+ ")"
1512
+ ],
1513
+ "metadata": {
1514
+ "id": "VYETUQkNDIPz"
1515
+ },
1516
+ "execution_count": 3,
1517
+ "outputs": []
1518
+ },
1519
+ {
1520
+ "cell_type": "code",
1521
+ "source": [
1522
+ "# Load the model and tokenizer\n",
1523
+ "model_name = \"HuggingFaceTB/SmolLM2-135M\"\n",
1524
+ "model = AutoModelForCausalLM.from_pretrained(\n",
1525
+ " pretrained_model_name_or_path=model_name\n",
1526
+ ").to(device)\n",
1527
+ "tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name)"
1528
+ ],
1529
+ "metadata": {
1530
+ "id": "olO-YsF-DMSh"
1531
+ },
1532
+ "execution_count": 37,
1533
+ "outputs": []
1534
+ },
1535
+ {
1536
+ "cell_type": "code",
1537
+ "source": [
1538
+ "# Set up the chat format\n",
1539
+ "model, tokenizer = setup_chat_format(model=model, tokenizer=tokenizer)"
1540
+ ],
1541
+ "metadata": {
1542
+ "id": "L2H0JHzBDTpm"
1543
+ },
1544
+ "execution_count": 38,
1545
+ "outputs": []
1546
+ },
1547
+ {
1548
+ "cell_type": "code",
1549
+ "source": [
1550
+ "# Name under which the fine-tuned model is saved and/or uploaded\n",
+ "finetune_name = \"SmolLM2-135M-SFT-smoltalk\"\n",
+ "finetune_tags = [\"smol-course\", \"sft_finetuning\"]"
1553
+ ],
1554
+ "metadata": {
1555
+ "id": "YBiQ7YZPDhKe"
1556
+ },
1557
+ "execution_count": 39,
1558
+ "outputs": []
1559
+ },
1560
+ {
1561
+ "cell_type": "code",
1562
+ "source": [
1563
+ "# Let's test the base model before training\n",
1564
+ "prompt = \"Write a haiku about programming\"\n",
1565
+ "\n",
1566
+ "# Format with template\n",
1567
+ "messages = [{\"role\": \"user\", \"content\": prompt}]\n",
1568
+ "formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False)\n",
1569
+ "\n",
1570
+ "# Generate response\n",
1571
+ "inputs = tokenizer(formatted_prompt, return_tensors=\"pt\").to(device)\n",
1572
+ "outputs = model.generate(**inputs, max_new_tokens=100)\n",
1573
+ "print(\"Before training:\")\n",
1574
+ "print(tokenizer.decode(outputs[0], skip_special_tokens=True))"
1575
+ ],
1576
+ "metadata": {
1577
+ "colab": {
1578
+ "base_uri": "https://localhost:8080/"
1579
+ },
1580
+ "id": "av7kKBd0DhkZ",
1581
+ "outputId": "27d49f1f-6f0b-4e77-d276-b5b482b8bfff"
1582
+ },
1583
+ "execution_count": 40,
1584
+ "outputs": [
1585
+ {
1586
+ "output_type": "stream",
1587
+ "name": "stdout",
1588
+ "text": [
1589
+ "Before training:\n",
1590
+ "user\n",
1591
+ "Write a haiku about programming\n",
1592
+ "Write a haiku about programming\n",
1593
+ "Write a haiku about programming\n",
1594
+ "Write a haiku about programming\n",
1595
+ "Write a haiku about programming\n",
1596
+ "Write a haiku about programming\n",
1597
+ "Write a haiku about programming\n",
1598
+ "Write a haiku about programming\n",
1599
+ "Write a haiku about programming\n",
1600
+ "Write a haiku about programming\n",
1601
+ "Write a haiku about programming\n",
1602
+ "Write a haiku about programming\n",
1603
+ "Write a haiku about programming\n",
1604
+ "Write a haiku about programming\n",
1605
+ "Write a haiku about programming\n",
1606
+ "Write a\n"
1607
+ ]
1608
+ }
1609
+ ]
1610
+ },
1611
+ {
1612
+ "cell_type": "code",
1613
+ "source": [
1614
+ "# Load a sample dataset\n",
1615
+ "from datasets import load_dataset\n",
1616
+ "ds = load_dataset(path=\"HuggingFaceTB/smoltalk\", name=\"everyday-conversations\")\n",
1617
+ "ds"
1618
+ ],
1619
+ "metadata": {
1620
+ "colab": {
1621
+ "base_uri": "https://localhost:8080/"
1622
+ },
1623
+ "id": "qyWuLJDjDmqK",
1624
+ "outputId": "0162aeae-94bf-4d73-ec2a-e53c35d96bb3"
1625
+ },
1626
+ "execution_count": 41,
1627
+ "outputs": [
1628
+ {
1629
+ "output_type": "execute_result",
1630
+ "data": {
1631
+ "text/plain": [
1632
+ "DatasetDict({\n",
1633
+ " train: Dataset({\n",
1634
+ " features: ['full_topic', 'messages'],\n",
1635
+ " num_rows: 2260\n",
1636
+ " })\n",
1637
+ " test: Dataset({\n",
1638
+ " features: ['full_topic', 'messages'],\n",
1639
+ " num_rows: 119\n",
1640
+ " })\n",
1641
+ "})"
1642
+ ]
1643
+ },
1644
+ "metadata": {},
1645
+ "execution_count": 41
1646
+ }
1647
+ ]
1648
+ },
1649
+ {
1650
+ "cell_type": "code",
1651
+ "source": [
1652
+ "ds['train'][0]"
1653
+ ],
1654
+ "metadata": {
1655
+ "colab": {
1656
+ "base_uri": "https://localhost:8080/"
1657
+ },
1658
+ "id": "_6D93RBmDxbk",
1659
+ "outputId": "1540bcb9-8191-44cd-b2ee-51abb6d0807a"
1660
+ },
1661
+ "execution_count": 42,
1662
+ "outputs": [
1663
+ {
1664
+ "output_type": "execute_result",
1665
+ "data": {
1666
+ "text/plain": [
1667
+ "{'full_topic': 'Travel/Vacation destinations/Beach resorts',\n",
1668
+ " 'messages': [{'content': 'Hi there', 'role': 'user'},\n",
1669
+ " {'content': 'Hello! How can I help you today?', 'role': 'assistant'},\n",
1670
+ " {'content': \"I'm looking for a beach resort for my next vacation. Can you recommend some popular ones?\",\n",
1671
+ " 'role': 'user'},\n",
1672
+ " {'content': \"Some popular beach resorts include Maui in Hawaii, the Maldives, and the Bahamas. They're known for their beautiful beaches and crystal-clear waters.\",\n",
1673
+ " 'role': 'assistant'},\n",
1674
+ " {'content': 'That sounds great. Are there any resorts in the Caribbean that are good for families?',\n",
1675
+ " 'role': 'user'},\n",
1676
+ " {'content': 'Yes, the Turks and Caicos Islands and Barbados are excellent choices for family-friendly resorts in the Caribbean. They offer a range of activities and amenities suitable for all ages.',\n",
1677
+ " 'role': 'assistant'},\n",
1678
+ " {'content': \"Okay, I'll look into those. Thanks for the recommendations!\",\n",
1679
+ " 'role': 'user'},\n",
1680
+ " {'content': \"You're welcome. I hope you find the perfect resort for your vacation.\",\n",
1681
+ " 'role': 'assistant'}]}"
1682
+ ]
1683
+ },
1684
+ "metadata": {},
1685
+ "execution_count": 42
1686
+ }
1687
+ ]
1688
+ },
1689
+ {
1690
+ "cell_type": "code",
1691
+ "source": [
1692
+ "def process_messages(samples):\n",
+ "    # Keep conversations that end on a user turn; otherwise drop the final message\n",
+ "    result = []\n",
+ "    for x in samples['messages']:\n",
+ "        if x[-1]['role'] == 'user':  # Last message is from the user: keep the conversation as is\n",
+ "            result.append(x)\n",
+ "        else:\n",
+ "            result.append(x[:-1])  # Otherwise remove the last message\n",
+ "    return {'messages': result}\n",
+ "\n",
+ "# Apply the function to every split in batches (the trainer below is still given `ds`, not `dataset`)\n",
+ "dataset = ds.map(process_messages, batched=True)"
1704
+ ],
1705
+ "metadata": {
1706
+ "id": "cSYoD4Y3FQdu"
1707
+ },
1708
+ "execution_count": 43,
1709
+ "outputs": []
1710
+ },
1711
+ {
1712
+ "cell_type": "code",
1713
+ "source": [
1714
+ "# Configure the SFTTrainer\n",
+ "sft_config = SFTConfig(\n",
+ "    output_dir=\"./sft_output\",\n",
+ "    max_steps=500,  # Adjust based on dataset size and desired training duration\n",
+ "    per_device_train_batch_size=16,  # Set according to your GPU memory capacity\n",
+ "    learning_rate=5e-5,  # Common starting point for fine-tuning\n",
+ "    logging_steps=50,  # Frequency of logging training metrics\n",
+ "    save_steps=50,  # Frequency of saving model checkpoints\n",
+ "    eval_strategy=\"steps\",  # Evaluate the model at regular intervals\n",
+ "    eval_steps=50,  # Frequency of evaluation\n",
+ "    use_mps_device=(device == \"mps\"),  # Use the Apple MPS backend when device is 'mps'\n",
+ "    hub_model_id=finetune_name,  # Set a unique name for your model\n",
+ "    report_to=[]\n",
+ ")\n",
+ "\n",
+ "# Initialize the SFTTrainer\n",
+ "trainer = SFTTrainer(\n",
+ "    model=model,\n",
+ "    args=sft_config,\n",
+ "    train_dataset=ds[\"train\"],\n",
+ "    processing_class=tokenizer,\n",
+ "    eval_dataset=ds[\"test\"],\n",
+ ")"
1739
+ ],
1740
+ "metadata": {
1741
+ "colab": {
1742
+ "base_uri": "https://localhost:8080/",
1743
+ "height": 49,
1744
+ "referenced_widgets": [
1745
+ "5479c909ae014cb4af686f98dfd896cb",
1746
+ "77cd1a0126ea48f6bc358c52b5b22f3e",
1747
+ "452ca86ad5f44319a93031962f10fedf",
1748
+ "fa32534d4cf349d2a1efac6496277d2e",
1749
+ "cb16cc0f8c6847cda083dc0b2f186083",
1750
+ "3b59793ca26745ea8b3780e07447a56c",
1751
+ "bf13f411d2db420fb25d13d212ba257a",
1752
+ "075f0860a2ca452c8b1438aeae8e3d18",
1753
+ "555d876e65c9419b82633841914086d9",
1754
+ "e07fb7d992e04473ad7bac9248e7aa75",
1755
+ "4cf7dbf2cb6d4581a356970de1996ec6"
1756
+ ]
1757
+ },
1758
+ "id": "nvtJ2H41JTrU",
1759
+ "outputId": "0a9eaf66-f20d-4a44-d54f-6f6428cab4f1"
1760
+ },
1761
+ "execution_count": 44,
1762
+ "outputs": [
1763
+ {
1764
+ "output_type": "display_data",
1765
+ "data": {
1766
+ "text/plain": [
1767
+ "Map: 0%| | 0/119 [00:00<?, ? examples/s]"
1768
+ ],
1769
+ "application/vnd.jupyter.widget-view+json": {
1770
+ "version_major": 2,
1771
+ "version_minor": 0,
1772
+ "model_id": "5479c909ae014cb4af686f98dfd896cb"
1773
+ }
1774
+ },
1775
+ "metadata": {}
1776
+ }
1777
+ ]
1778
+ },
1779
+ {
1780
+ "cell_type": "code",
1781
+ "source": [
1782
+ "# Train the model\n",
1783
+ "trainer.train()"
1784
+ ],
1785
+ "metadata": {
1786
+ "colab": {
1787
+ "base_uri": "https://localhost:8080/",
1788
+ "height": 441
1789
+ },
1790
+ "id": "qF2OxgBoJXKO",
1791
+ "outputId": "5a505e90-7213-4a80-a0b9-54338b83eadf"
1792
+ },
1793
+ "execution_count": 45,
1794
+ "outputs": [
1795
+ {
1796
+ "output_type": "display_data",
1797
+ "data": {
1798
+ "text/plain": [
1799
+ "<IPython.core.display.HTML object>"
1800
+ ],
1801
+ "text/html": [
1802
+ "\n",
1803
+ " <div>\n",
1804
+ " \n",
1805
+ " <progress value='500' max='500' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
1806
+ " [500/500 17:08, Epoch 3/4]\n",
1807
+ " </div>\n",
1808
+ " <table border=\"1\" class=\"dataframe\">\n",
1809
+ " <thead>\n",
1810
+ " <tr style=\"text-align: left;\">\n",
1811
+ " <th>Step</th>\n",
1812
+ " <th>Training Loss</th>\n",
1813
+ " <th>Validation Loss</th>\n",
1814
+ " </tr>\n",
1815
+ " </thead>\n",
1816
+ " <tbody>\n",
1817
+ " <tr>\n",
1818
+ " <td>50</td>\n",
1819
+ " <td>1.250500</td>\n",
1820
+ " <td>1.109389</td>\n",
1821
+ " </tr>\n",
1822
+ " <tr>\n",
1823
+ " <td>100</td>\n",
1824
+ " <td>1.065500</td>\n",
1825
+ " <td>1.067429</td>\n",
1826
+ " </tr>\n",
1827
+ " <tr>\n",
1828
+ " <td>150</td>\n",
1829
+ " <td>1.015600</td>\n",
1830
+ " <td>1.044449</td>\n",
1831
+ " </tr>\n",
1832
+ " <tr>\n",
1833
+ " <td>200</td>\n",
1834
+ " <td>0.895800</td>\n",
1835
+ " <td>1.034744</td>\n",
1836
+ " </tr>\n",
1837
+ " <tr>\n",
1838
+ " <td>250</td>\n",
1839
+ " <td>0.881400</td>\n",
1840
+ " <td>1.030149</td>\n",
1841
+ " </tr>\n",
1842
+ " <tr>\n",
1843
+ " <td>300</td>\n",
1844
+ " <td>0.862100</td>\n",
1845
+ " <td>1.029914</td>\n",
1846
+ " </tr>\n",
1847
+ " <tr>\n",
1848
+ " <td>350</td>\n",
1849
+ " <td>0.788500</td>\n",
1850
+ " <td>1.028845</td>\n",
1851
+ " </tr>\n",
1852
+ " <tr>\n",
1853
+ " <td>400</td>\n",
1854
+ " <td>0.789500</td>\n",
1855
+ " <td>1.027438</td>\n",
1856
+ " </tr>\n",
1857
+ " <tr>\n",
1858
+ " <td>450</td>\n",
1859
+ " <td>0.767000</td>\n",
1860
+ " <td>1.030825</td>\n",
1861
+ " </tr>\n",
1862
+ " <tr>\n",
1863
+ " <td>500</td>\n",
1864
+ " <td>0.741700</td>\n",
1865
+ " <td>1.031717</td>\n",
1866
+ " </tr>\n",
1867
+ " </tbody>\n",
1868
+ "</table><p>"
1869
+ ]
1870
+ },
1871
+ "metadata": {}
1872
+ },
1873
+ {
1874
+ "output_type": "execute_result",
1875
+ "data": {
1876
+ "text/plain": [
1877
+ "TrainOutput(global_step=500, training_loss=0.9057471542358398, metrics={'train_runtime': 1030.1888, 'train_samples_per_second': 7.766, 'train_steps_per_second': 0.485, 'total_flos': 1302438402256896.0, 'train_loss': 0.9057471542358398, 'epoch': 3.52112676056338})"
1878
+ ]
1879
+ },
1880
+ "metadata": {},
1881
+ "execution_count": 45
1882
+ }
1883
+ ]
1884
+ },
1885
+ {
1886
+ "cell_type": "code",
1887
+ "source": [
1888
+ "# Save the model\n",
1889
+ "trainer.save_model(f\"./{finetune_name}\")"
1890
+ ],
1891
+ "metadata": {
1892
+ "id": "NJAdU1QBJfXK"
1893
+ },
1894
+ "execution_count": 46,
1895
+ "outputs": []
1896
+ },
1897
+ {
1898
+ "cell_type": "code",
1899
+ "source": [
1900
+ "trainer.push_to_hub(tags=finetune_tags)"
1901
+ ],
1902
+ "metadata": {
1903
+ "colab": {
1904
+ "base_uri": "https://localhost:8080/",
1905
+ "height": 200,
1906
+ "referenced_widgets": [
1907
+ "f13451f32d11428c97243b2bc5b5268c",
1908
+ "02da0b8c8e564c05a91e5dd3b7e1c9cb",
1909
+ "43e70aa073504a4496e543fd923b7b0f",
1910
+ "7056a43eec3a4718b160360ce9493409",
1911
+ "348abc5e3efe4fd69aee2329d91c17da",
1912
+ "4ededf6d88554f5495f76521f8be98d5",
1913
+ "46012658060c41e2a2af258e4025499e",
1914
+ "7c1de009823e492fb9a1648b393b449c",
1915
+ "c4af106fdac144d3b3010ceffc63b4d7",
1916
+ "b9b5a37d22644cacbc3e81a1682c5184",
1917
+ "390ace0ab34e4727824de62d724767a5",
1918
+ "d2eaa596146742e4937f7b1ec7320dbb",
1919
+ "34f9fa16d57247829d00de248eb0e1ae",
1920
+ "d40e8b6c471e4f4da2f7521955155955",
1921
+ "82fd1f3b07f24c4c8a89454477ab6b97",
1922
+ "b5b509b3f7bc4ad4b994c68317a07228",
1923
+ "e27f78d5d126483cb7be7c9e69913954",
1924
+ "9b2a21eb504d49df8cbfd2d6bff0ddbf",
1925
+ "43985ed44fc14c0dad5cf7cff9a47d40",
1926
+ "59115af19cbe4420b6b1c12bf406f6ed",
1927
+ "e5b320112ff346529e6a936eab29d95d",
1928
+ "4a78c564e91c4f048778f65e4905b22a",
1929
+ "86b0050f0417457784489f96c1313b4a",
1930
+ "9196c063ec4741fcbc40b3c77eadc81e",
1931
+ "b10f037ce64143cbbf713148a46c2a0d",
1932
+ "338fcb1a304341759ce1d53093b48c8e",
1933
+ "461f8d72aa934cb499b8c09031682a6d",
1934
+ "82e726ec9ba24dd79c3b02388634d08d",
1935
+ "9a4d24decc8e4bf0a066bbd5a968c3b2",
1936
+ "9c08523e49844618953ab8dd28b9f6fd",
1937
+ "d578d568fd4f4ac6aba0b5de3d1a2c1d",
1938
+ "60f3ec480a6549908acc753e67b9c3f4",
1939
+ "67cd4b245e02485b82f422c187cec1ae"
1940
+ ]
1941
+ },
1942
+ "id": "yi9aab28Pk-C",
1943
+ "outputId": "6516478f-8af7-4a87-b52d-13128a0399fb"
1944
+ },
1945
+ "execution_count": 47,
1946
+ "outputs": [
1947
+ {
1948
+ "output_type": "display_data",
1949
+ "data": {
1950
+ "text/plain": [
1951
+ "model.safetensors: 0%| | 0.00/538M [00:00<?, ?B/s]"
1952
+ ],
1953
+ "application/vnd.jupyter.widget-view+json": {
1954
+ "version_major": 2,
1955
+ "version_minor": 0,
1956
+ "model_id": "f13451f32d11428c97243b2bc5b5268c"
1957
+ }
1958
+ },
1959
+ "metadata": {}
1960
+ },
1961
+ {
1962
+ "output_type": "display_data",
1963
+ "data": {
1964
+ "text/plain": [
1965
+ "Upload 2 LFS files: 0%| | 0/2 [00:00<?, ?it/s]"
1966
+ ],
1967
+ "application/vnd.jupyter.widget-view+json": {
1968
+ "version_major": 2,
1969
+ "version_minor": 0,
1970
+ "model_id": "d2eaa596146742e4937f7b1ec7320dbb"
1971
+ }
1972
+ },
1973
+ "metadata": {}
1974
+ },
1975
+ {
1976
+ "output_type": "display_data",
1977
+ "data": {
1978
+ "text/plain": [
1979
+ "training_args.bin: 0%| | 0.00/5.62k [00:00<?, ?B/s]"
1980
+ ],
1981
+ "application/vnd.jupyter.widget-view+json": {
1982
+ "version_major": 2,
1983
+ "version_minor": 0,
1984
+ "model_id": "86b0050f0417457784489f96c1313b4a"
1985
+ }
1986
+ },
1987
+ "metadata": {}
1988
+ },
1989
+ {
1990
+ "output_type": "execute_result",
1991
+ "data": {
1992
+ "text/plain": [
1993
+ "CommitInfo(commit_url='https://huggingface.co/ParitKansal/SmolLM2-135M-SFT-smoltalk/commit/97e8fed11e0a365f181dc40fc9b8ab4a87a98e99', commit_message='End of training', commit_description='', oid='97e8fed11e0a365f181dc40fc9b8ab4a87a98e99', pr_url=None, repo_url=RepoUrl('https://huggingface.co/ParitKansal/SmolLM2-135M-SFT-smoltalk', endpoint='https://huggingface.co', repo_type='model', repo_id='ParitKansal/SmolLM2-135M-SFT-smoltalk'), pr_revision=None, pr_num=None)"
1994
+ ],
1995
+ "application/vnd.google.colaboratory.intrinsic+json": {
1996
+ "type": "string"
1997
+ }
1998
+ },
1999
+ "metadata": {},
2000
+ "execution_count": 47
2001
+ }
2002
+ ]
2003
+ },
2004
+ {
2005
+ "cell_type": "code",
2006
+ "source": [
2007
+ "# Test the fine-tuned model on the same prompt used for the base model\n",
2010
+ "prompt = \"Write about a programming lang\"\n",
2011
+ "\n",
2012
+ "# Format with template\n",
2013
+ "messages = [{\"role\": \"user\", \"content\": prompt}]\n",
2014
+ "formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False)\n",
2015
+ "\n",
2016
+ "# Generate response\n",
2017
+ "inputs = tokenizer(formatted_prompt, return_tensors=\"pt\").to(device)\n",
2018
+ "\n",
2019
+ "# Generate a response with the fine-tuned model, just like with the base example.\n",
2020
+ "outputs = model.generate(**inputs, max_new_tokens=100)\n",
2021
+ "print(\"After training:\")\n",
2022
+ "print(tokenizer.decode(outputs[0], skip_special_tokens=True))"
2023
+ ],
2024
+ "metadata": {
2025
+ "colab": {
2026
+ "base_uri": "https://localhost:8080/"
2027
+ },
2028
+ "id": "7xNFWx2HPsK7",
2029
+ "outputId": "0e29193f-be97-48bd-ca71-28be0c7b10e4"
2030
+ },
2031
+ "execution_count": 49,
2032
+ "outputs": [
2033
+ {
2034
+ "output_type": "stream",
2035
+ "name": "stdout",
2036
+ "text": [
2037
+ "After training:\n",
2038
+ "user\n",
2039
+ "Write about a programming lang\n",
2040
+ "\n",
2041
+ "What is a programming language?\n",
2042
+ "\n",
2043
+ "A programming language is a set of instructions that a computer can understand and execute. It is a set of rules that tells the computer what to do. It is a language that is easy to learn and use.\n",
2044
+ "\n",
2045
+ "What is a programming language used for?\n",
2046
+ "\n",
2047
+ "A programming language is used to create software programs. It is a language that is used to create computer programs. It is a language that is easy to learn and use.\n",
2048
+ "\n",
2049
+ "What\n"
2050
+ ]
2051
+ }
2052
+ ]
2053
+ },
2054
+ {
2055
+ "cell_type": "code",
2056
+ "source": [],
2057
+ "metadata": {
2058
+ "id": "DEBtbcL_Vc88"
2059
+ },
2060
+ "execution_count": null,
2061
+ "outputs": []
2062
+ }
2063
+ ]
2064
+ }