kimsan0622 committed on
Commit
f1b1177
·
1 Parent(s): 8064bed

Create README.md

  1. README.md +89 -0
README.md ADDED
@@ -0,0 +1,89 @@
1
+ ---
2
+ tags:
3
+ - generated_from_trainer
4
+ datasets:
5
+ - jsonl_dataset_sum.py
6
+ metrics:
7
+ - rouge
8
+ model-index:
9
+ - name: summarization_all
10
+ results:
11
+ - task:
12
+ name: Summarization
13
+ type: summarization
14
+ dataset:
15
+ name: jsonl_dataset_sum.py
16
+ type: jsonl_dataset_sum.py
17
+ config: 'null'
18
+ split: None
19
+ metrics:
20
+ - name: Rouge1
21
+ type: rouge
22
+ value: 21.9857
23
+ ---
24
+
25
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
26
+ should probably proofread and complete it, then remove this comment. -->
27
+
28
+ # summarization_all
29
+
30
+ This model is a fine-tuned version of [KETI-AIR/long-ke-t5-base](https://huggingface.co/KETI-AIR/long-ke-t5-base) on the jsonl_dataset_sum.py dataset.
31
+ After the final (10th) training epoch, it achieves the following results on the evaluation set:
32
+ - Loss: 1.1442
33
+ - Rouge1: 21.9857
34
+ - Rouge2: 10.2876
35
+ - Rougel: 21.4026
36
+ - Rougelsum: 21.4278
37
+ - Gen Len: 86.2560
38
+
39
+ ## Model description
40
+
41
+ More information needed
42
+
43
+ ## Intended uses & limitations
44
+
45
+ More information needed
46
+
47
+ ## Training and evaluation data
48
+
49
+ More information needed
50
+
51
+ ## Training procedure
52
+
53
+ ### Training hyperparameters
54
+
55
+ The following hyperparameters were used during training:
56
+ - learning_rate: 0.001
57
+ - train_batch_size: 1
58
+ - eval_batch_size: 1
59
+ - seed: 42
60
+ - distributed_type: multi-GPU
61
+ - num_devices: 8
62
+ - total_train_batch_size: 8
63
+ - total_eval_batch_size: 8
64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
65
+ - lr_scheduler_type: linear
66
+ - num_epochs: 10.0
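
The listed values imply a few derived quantities, which the following minimal sketch works out. It assumes no gradient accumulation and no warmup (neither is listed in the card), and that the linear schedule decays to zero at the last step. The function and constant names are hypothetical and not part of the original training script; the step count per epoch is taken from the training results table.

```python
# Values copied from the hyperparameter list and the training results table.
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 1       # per device
NUM_DEVICES = 8
NUM_EPOCHS = 10
STEPS_PER_EPOCH = 184670   # step count reported at epoch 1.0


def total_batch_size(per_device, num_devices, grad_accum=1):
    """Effective batch size per optimizer step (grad_accum=1 is an assumption)."""
    return per_device * num_devices * grad_accum


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear decay from base_lr to 0, with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
effective_batch = total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES)

# Approximate number of training examples seen per epoch; the true dataset
# size may be slightly smaller if the last batch was padded across devices.
approx_examples_per_epoch = STEPS_PER_EPOCH * effective_batch

if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    print("total optimizer steps:", total_steps)
    print("approx. examples per epoch:", approx_examples_per_epoch)
    print("lr at midpoint:", linear_lr(total_steps // 2, total_steps, LEARNING_RATE))
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 8, and the total step count matches the final `Step` value in the results table.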
67
+
68
+ ### Training results
69
+
70
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
71
+ |:-------------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
72
+ | 1.2503 | 1.0 | 184670 | 1.2439 | 20.2525 | 9.1467 | 19.7454 | 19.771 | 87.1766 |
73
+ | 1.1629 | 2.0 | 369340 | 1.1773 | 21.0068 | 9.6691 | 20.4565 | 20.4888 | 89.6074 |
74
+ | 1.1087 | 3.0 | 554010 | 1.1431 | 21.0216 | 9.6545 | 20.489 | 20.5108 | 85.5895 |
75
+ | 1.056 | 4.0 | 738680 | 1.1247 | 21.6776 | 10.1424 | 21.09 | 21.1168 | 89.6576 |
76
+ | 1.0199 | 5.0 | 923350 | 1.1179 | 21.6563 | 10.0965 | 21.0814 | 21.1056 | 89.2454 |
77
+ | 0.9652 | 6.0 | 1108020 | 1.1122 | 21.6209 | 10.0725 | 21.0623 | 21.0864 | 86.7079 |
78
+ | 0.92 | 7.0 | 1292690 | 1.1136 | 21.9396 | 10.2734 | 21.3465 | 21.3745 | 86.5547 |
79
+ | 0.8804 | 8.0 | 1477360 | 1.1228 | 21.8457 | 10.1858 | 21.2552 | 21.278 | 87.6413 |
80
+ | 0.8447 | 9.0 | 1662030 | 1.1327 | 21.92 | 10.2635 | 21.3415 | 21.3633 | 86.4453 |
81
+ | 0.7678 | 10.0 | 1846700 | 1.1442 | 21.9857 | 10.2876 | 21.4026 | 21.4278 | 86.2560 |
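
The table can be used to compare checkpoints by different criteria. The sketch below copies three columns from the table (epoch, validation loss, Rouge1) and picks the best epoch under each criterion. The variable names are hypothetical, and whether per-epoch checkpoints were actually saved is not stated in the card.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 1.2439, 20.2525),
    (2, 1.1773, 21.0068),
    (3, 1.1431, 21.0216),
    (4, 1.1247, 21.6776),
    (5, 1.1179, 21.6563),
    (6, 1.1122, 21.6209),
    (7, 1.1136, 21.9396),
    (8, 1.1228, 21.8457),
    (9, 1.1327, 21.9200),
    (10, 1.1442, 21.9857),
]

# Lowest validation loss and highest Rouge1 do not have to coincide.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("best epoch by validation loss:", best_by_loss[0])
    print("best epoch by Rouge1:", best_by_rouge1[0])
```

In this table the validation loss is lowest at epoch 6 and rises afterwards, while Rouge1 peaks at epoch 10, which is the checkpoint the headline results describe.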
82
+
83
+
84
+ ### Framework versions
85
+
86
+ - Transformers 4.25.1
87
+ - PyTorch 1.12.0
88
+ - Datasets 2.8.0
89
+ - Tokenizers 0.13.2