asimokby committed on
Commit 4dbc7fd · verified · 1 Parent(s): 9016fa4

asimokby/nllb-finetuned-de-en

Files changed (5):
  1. README.md +20 -16
  2. generation_config.json +1 -1
  3. model.safetensors +1 -1
  4. tokenizer.json +2 -2
  5. training_args.bin +2 -2
README.md CHANGED
@@ -5,18 +5,18 @@ base_model: facebook/nllb-200-distilled-600M
 tags:
 - generated_from_trainer
 model-index:
-- name: german
+- name: nllb-finetuned-de-en
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# german
+# nllb-finetuned-de-en
 
-This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the [TEDx dataset](https://huggingface.co/datasets/IWSLT/ted_talks_iwslt).
+This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6058
+- Loss: 0.5475
 
 ## Model description
 
@@ -41,27 +41,31 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 10
+- num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.1885 | 1.0 | 503 | 0.9689 |
-| 0.9215 | 2.0 | 1006 | 0.8678 |
-| 0.8009 | 3.0 | 1509 | 0.7949 |
-| 0.7114 | 4.0 | 2012 | 0.7384 |
-| 0.6442 | 5.0 | 2515 | 0.6946 |
-| 0.5896 | 6.0 | 3018 | 0.6614 |
-| 0.5534 | 7.0 | 3521 | 0.6380 |
-| 0.5216 | 8.0 | 4024 | 0.6196 |
-| 0.5005 | 9.0 | 4527 | 0.6091 |
-| 0.4914 | 10.0 | 5030 | 0.6058 |
+| No log | 1.0 | 486 | 1.1181 |
+| 1.3516 | 2.0 | 972 | 0.9976 |
+| 1.0459 | 3.0 | 1458 | 0.9073 |
+| 0.9073 | 4.0 | 1944 | 0.8393 |
+| 0.7901 | 5.0 | 2430 | 0.7772 |
+| 0.7072 | 6.0 | 2916 | 0.7292 |
+| 0.6262 | 7.0 | 3402 | 0.6872 |
+| 0.5713 | 8.0 | 3888 | 0.6532 |
+| 0.5185 | 9.0 | 4374 | 0.6228 |
+| 0.48 | 10.0 | 4860 | 0.5997 |
+| 0.4424 | 11.0 | 5346 | 0.5795 |
+| 0.4242 | 12.0 | 5832 | 0.5646 |
+| 0.4049 | 13.0 | 6318 | 0.5558 |
+| 0.3822 | 14.0 | 6804 | 0.5492 |
+| 0.3792 | 15.0 | 7290 | 0.5475 |
 
 
 ### Framework versions
 
 - Transformers 4.44.2
 - Pytorch 2.5.0+cu121
-- Datasets 3.1.0
 - Tokenizers 0.19.1
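Neither side of the card includes a usage snippet. Below is a minimal inference sketch for the repository named in the commit header, following the standard NLLB recipe from the Transformers documentation (the `translate` helper and the example sentence are illustrative, not part of this commit; imports are deferred so the module loads without `transformers` installed):

```python
MODEL_ID = "asimokby/nllb-finetuned-de-en"   # repo updated by this commit
SRC_LANG, TGT_LANG = "deu_Latn", "eng_Latn"  # NLLB FLORES-200 codes for German / English


def translate(text: str, max_length: int = 200) -> str:
    """Translate German text to English with the fine-tuned checkpoint."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # Force English as the first decoded token -- the standard NLLB recipe.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_length=max_length,  # matches the updated generation_config.json
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


if __name__ == "__main__":
    print(translate("Das Modell wurde auf Deutsch-Englisch-Daten feinabgestimmt."))
```

Note that the step counts in the new table imply 486 optimizer steps per epoch (7290 / 15).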
generation_config.json CHANGED
@@ -3,7 +3,7 @@
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
   "eos_token_id": 2,
-  "max_length": 256,
+  "max_length": 200,
   "pad_token_id": 1,
   "transformers_version": "4.44.2"
 }
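The only change here is `max_length` (256 → 200), which caps decoded sequences at 200 tokens when `generate()` falls back to this config. A stdlib-only sanity check that the updated JSON parses as expected (the config text is copied from the diff above):

```python
import json

# The updated generation_config.json after this commit (copied from the diff).
UPDATED_CONFIG = """
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "max_length": 200,
  "pad_token_id": 1,
  "transformers_version": "4.44.2"
}
"""

cfg = json.loads(UPDATED_CONFIG)
# generate() uses these values as defaults, so output is now capped at 200 tokens.
print(cfg["max_length"])
```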
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a762d1f5e0e9942a6fed214df42d992c238a778e40f248c1f39ece102ac91cb2
+oid sha256:936fbefecea41f74c69ba5169ec9f47b6547fc9689d3f5dd3e1482253d0c42dd
 size 2460354912
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f654c0c3f1c29080eb57a95498a1a057870727df51a0ee3efd7b1da5c61f774
-size 17331359
+oid sha256:7c0c8fca562d89c5dd1fb89f61b88a6208eeb89d4f5d1223ec54a2e827631bc0
+size 17331261
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e118795a46bf271e60de2b6cc574a46ae76d93d8870a1ef8382fbcd14791550b
-size 5368
+oid sha256:25a0608b6e32e88792b36881729a76ce1e12d0e7ff42072dd4e7482237a332ba
+size 5432
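The `model.safetensors`, `tokenizer.json`, and `training_args.bin` diffs touch only Git LFS pointer files, not the binary payloads themselves; each pointer is three `key value` lines. A stdlib-only sketch of parsing that format, using the new `training_args.bin` pointer from the diff above:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (version / oid / size lines) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# New training_args.bin pointer, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:25a0608b6e32e88792b36881729a76ce1e12d0e7ff42072dd4e7482237a332ba
size 5432
"""
info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])
```

The `oid` is the SHA-256 of the real file content, which is why the `model.safetensors` diff changes the `oid` but not the `size`: the retrained weights differ byte-for-byte yet occupy the same 2 460 354 912 bytes.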