---
license: apache-2.0
base_model: eslamxm/mt5-base-finetuned-arur
tags:
- generated_from_trainer
model-index:
- name: T6
  results: []
---


# T6

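This model is a fine-tuned version of [eslamxm/mt5-base-finetuned-arur](https://huggingface.co/eslamxm/mt5-base-finetuned-arur). The training dataset was not recorded by the Trainer.
At the end of training (epoch 64) it achieves the following result on the evaluation set:
- Loss: 0.5941 (the lowest validation loss in the training results table is 0.2448, at epoch 3)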

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 64
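
The schedule implied by these hyperparameters can be sketched in plain Python. This is an illustration, not the training code: it assumes no warmup steps (none are listed), a single device with no gradient accumulation, and a learning rate that decays linearly to zero over the 2368 steps reported in the results table. The names `linear_lr` and `example_count_range` are hypothetical.

```python
# Values taken from the hyperparameter list and the results table.
BASE_LR = 1e-4
TRAIN_BATCH_SIZE = 8
STEPS_PER_EPOCH = 37   # step count after epoch 1 in the results table
NUM_EPOCHS = 64
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 2368, matches the last table row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for a linear decay to zero (assumed: no warmup)."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def example_count_range(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Training-set sizes consistent with the step count, assuming one device,
    no gradient accumulation, and a kept (not dropped) final partial batch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch in (0, 16, 32, 48, 64):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
    print(example_count_range())
```

Under those assumptions the training set holds between 289 and 296 examples, which is small for 64 epochs of fine-tuning.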

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.2591        | 1.0   | 37   | 0.2616          |
| 0.1639        | 2.0   | 74   | 0.2497          |
| 0.1771        | 3.0   | 111  | 0.2448          |
| 0.1465        | 4.0   | 148  | 0.2486          |
| 0.1294        | 5.0   | 185  | 0.2499          |
| 0.118         | 6.0   | 222  | 0.2520          |
| 0.1014        | 7.0   | 259  | 0.2582          |
| 0.0986        | 8.0   | 296  | 0.2631          |
| 0.1021        | 9.0   | 333  | 0.2775          |
| 0.0783        | 10.0  | 370  | 0.2867          |
| 0.0699        | 11.0  | 407  | 0.2906          |
| 0.062         | 12.0  | 444  | 0.3010          |
| 0.059         | 13.0  | 481  | 0.3144          |
| 0.0592        | 14.0  | 518  | 0.3265          |
| 0.0513        | 15.0  | 555  | 0.3365          |
| 0.0404        | 16.0  | 592  | 0.3550          |
| 0.0417        | 17.0  | 629  | 0.3552          |
| 0.0385        | 18.0  | 666  | 0.3682          |
| 0.0303        | 19.0  | 703  | 0.3728          |
| 0.0355        | 20.0  | 740  | 0.3947          |
| 0.0232        | 21.0  | 777  | 0.4208          |
| 0.024         | 22.0  | 814  | 0.4080          |
| 0.023         | 23.0  | 851  | 0.4265          |
| 0.0169        | 24.0  | 888  | 0.4233          |
| 0.0185        | 25.0  | 925  | 0.4450          |
| 0.0214        | 26.0  | 962  | 0.4528          |
| 0.0159        | 27.0  | 999  | 0.4486          |
| 0.0156        | 28.0  | 1036 | 0.4926          |
| 0.017         | 29.0  | 1073 | 0.4927          |
| 0.0137        | 30.0  | 1110 | 0.4886          |
| 0.0139        | 31.0  | 1147 | 0.5205          |
| 0.0108        | 32.0  | 1184 | 0.4953          |
| 0.0136        | 33.0  | 1221 | 0.4925          |
| 0.0129        | 34.0  | 1258 | 0.5081          |
| 0.0099        | 35.0  | 1295 | 0.5252          |
| 0.0116        | 36.0  | 1332 | 0.5241          |
| 0.0134        | 37.0  | 1369 | 0.5352          |
| 0.0111        | 38.0  | 1406 | 0.5469          |
| 0.0089        | 39.0  | 1443 | 0.5618          |
| 0.0103        | 40.0  | 1480 | 0.5781          |
| 0.0083        | 41.0  | 1517 | 0.5896          |
| 0.0091        | 42.0  | 1554 | 0.5287          |
| 0.0115        | 43.0  | 1591 | 0.5556          |
| 0.0069        | 44.0  | 1628 | 0.5497          |
| 0.0069        | 45.0  | 1665 | 0.5896          |
| 0.0089        | 46.0  | 1702 | 0.5799          |
| 0.0056        | 47.0  | 1739 | 0.5654          |
| 0.0072        | 48.0  | 1776 | 0.5683          |
| 0.0097        | 49.0  | 1813 | 0.5642          |
| 0.0065        | 50.0  | 1850 | 0.5623          |
| 0.0073        | 51.0  | 1887 | 0.5906          |
| 0.0078        | 52.0  | 1924 | 0.5932          |
| 0.0068        | 53.0  | 1961 | 0.5923          |
| 0.006         | 54.0  | 1998 | 0.5978          |
| 0.005         | 55.0  | 2035 | 0.5846          |
| 0.0082        | 56.0  | 2072 | 0.5886          |
| 0.0081        | 57.0  | 2109 | 0.5844          |
| 0.0056        | 58.0  | 2146 | 0.5878          |
| 0.0069        | 59.0  | 2183 | 0.5890          |
| 0.0075        | 60.0  | 2220 | 0.5946          |
| 0.0077        | 61.0  | 2257 | 0.5897          |
| 0.0064        | 62.0  | 2294 | 0.5908          |
| 0.0049        | 63.0  | 2331 | 0.5934          |
| 0.005         | 64.0  | 2368 | 0.5941          |
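
Validation loss reaches its minimum at epoch 3 and then rises while training loss keeps falling, a pattern consistent with overfitting. The sketch below shows how the best epoch could be picked out of such a log. It uses only a subset of rows copied from the table above, and the helper names `best_epoch` and `epochs_since_best` are hypothetical.

```python
# (epoch, training_loss, validation_loss) rows copied from the table above.
# Only a subset is included to keep the example short.
rows = [
    (1, 0.2591, 0.2616),
    (2, 0.1639, 0.2497),
    (3, 0.1771, 0.2448),
    (4, 0.1465, 0.2486),
    (5, 0.1294, 0.2499),
    (16, 0.0404, 0.3550),
    (32, 0.0108, 0.4953),
    (64, 0.005, 0.5941),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def epochs_since_best(rows):
    """Number of epochs trained after the best validation loss was reached."""
    return rows[-1][0] - best_epoch(rows)[0]


if __name__ == "__main__":
    epoch, train_loss, val_loss = best_epoch(rows)
    print(f"best epoch: {epoch}, validation loss: {val_loss}")
    print(f"epochs trained past the best checkpoint: {epochs_since_best(rows)}")
```

If the run were repeated, options such as `load_best_model_at_end=True` in `TrainingArguments` or an `EarlyStoppingCallback` might be worth considering so that the saved weights correspond to the lowest validation loss rather than the final epoch.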


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1