|
--- |
|
language: |
|
- de |
|
tags: |
|
- question-generation |
|
- german |
|
- text2text-generation |
|
- generated_from_trainer |
|
datasets: |
|
- lmqg/qg_dequad |
|
metrics: |
|
- bleu4 |
|
- f1 |
|
- rouge |
|
- exact_match |
|
model-index: |
|
- name: german-jeopardy-longt5-base |
|
results: |
|
- task: |
|
name: Sequence-to-sequence Language Modeling |
|
type: text2text-generation |
|
dataset: |
|
name: lmqg/qg_dequad |
|
type: default |
|
args: default |
|
metrics: |
|
- name: BLEU-4 |
|
type: bleu4 |
|
value: 10.80 |
|
- name: F1 |
|
type: f1 |
|
value: 34.41 |
|
- name: ROUGE-1 |
|
type: rouge1 |
|
value: 35.31 |
|
- name: ROUGE-2 |
|
type: rouge2 |
|
value: 16.35 |
|
- name: ROUGE-L |
|
type: rougel |
|
value: 33.91 |
|
- name: ROUGE-Lsum |
|
type: rougelsum |
|
value: 33.96 |
|
- name: Exact Match |
|
type: exact_match |
|
value: 1.36 |
|
--- |
|
|
|
|
|
|
|
|
# german-jeopardy-longt5-base |
|
|
|
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: 1.8533 |
|
- Brevity Penalty: 0.8910 |
|
- System Length: 18642 |
|
- Reference Length: 20793 |
|
- ROUGE-1: 35.31 |
|
- ROUGE-2: 16.35 |
|
- ROUGE-L: 33.91 |
|
- ROUGE-Lsum: 33.96 |
|
- Exact Match: 1.36 |
|
- BLEU: 10.80 |
|
- F1: 34.41 |
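
The brevity penalty above follows the standard BLEU definition and can be recomputed from the listed system and reference lengths. The helper below is an illustrative sketch, not part of the training code:

```python
import math

def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: outputs shorter than the
    reference are penalized, longer outputs are not."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1 - ref_len / sys_len)

# System/reference lengths from the evaluation set above.
bp = brevity_penalty(18642, 20793)  # ≈ 0.8910
```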
|
|
|
## Model description |
|
|
|
This model is based on the LongT5 architecture with transient-global attention; see [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) for details on the base model.

It was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
|
|
|
## Intended uses & limitations |
|
|
|
This model is intended for question generation on German text: given a German context, it generates a corresponding question. It was trained and evaluated only on [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad), so performance on other domains is untested, and the low exact-match score (1.36) indicates that generated questions rarely match the references verbatim.
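
A minimal input-construction sketch, assuming an answer-highlighting format as commonly used by lmqg-style question-generation models. The helper name, the `<hl>` token, and the `generate question:` prefix are illustrative assumptions; check the repository's preprocessing code for the format actually used during training:

```python
def build_qg_input(context: str, answer: str, hl_token: str = "<hl>") -> str:
    # Highlight the answer span inside the context; the resulting string
    # can be fed to a Hugging Face text2text-generation pipeline.
    start = context.index(answer)  # assumes the answer occurs verbatim
    end = start + len(answer)
    return (
        "generate question: "
        + context[:start]
        + f"{hl_token} {answer} {hl_token}"
        + context[end:]
    )

prompt = build_qg_input(
    "Berlin ist die Hauptstadt von Deutschland.", "Berlin"
)
```

The resulting prompt could then be passed to a `text2text-generation` pipeline loading this model.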
|
|
|
## Training and evaluation data |
|
|
|
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad), the German question-generation dataset used for both training and evaluation.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 0.0001 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 4 |
|
- seed: 7 |
|
- gradient_accumulation_steps: 8 |
|
- total_train_batch_size: 64 |
|
- optimizer: Adafactor |
|
- lr_scheduler_type: constant |
|
- num_epochs: 20 |
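
These settings map onto `transformers` training arguments roughly as follows. This is a sketch assuming the standard `Seq2SeqTrainingArguments` API; the `output_dir` value and the use of the Trainer's built-in Adafactor optimizer are assumptions, not taken from this repository:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the configuration above; note that the effective (total) train
# batch size is 8 (per device) × 8 (gradient accumulation steps) = 64.
args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-base",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=7,
    gradient_accumulation_steps=8,
    optim="adafactor",
    lr_scheduler_type="constant",
    num_train_epochs=20,
)
```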
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | BLEU | Brevity Penalty | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Exact Match | F1 | Gen Len | Validation Loss | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | System Length | Totals 1 | Totals 2 | Totals 3 | Totals 4 | |
|
|:-------------:|:-----:|:----:|:-------:|:---------------:|:--------:|:--------:|:--------:|:--------:|:-----------:|:------:|:-------:|:---------------:|:------------:|:------------:|:------------:|:------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-------------:|:--------:|:--------:|:--------:|:--------:| |
|
| 3.1671 | 1.0 | 145 | 5.9441 | 0.7156 | 6177 | 1669 | 604 | 179 | 0.0023 | 0.2528 | 12.0218 | 2.1902 | 38.7954 | 12.1665 | 5.2458 | 1.9227 | 21250 | 0.2595 | 0.1035 | 0.2491 | 0.2492 | 15922 | 15922 | 13718 | 11514 | 9310 | |
|
| 2.5597 | 2.0 | 291 | 7.7787 | 0.7556 | 6785 | 2044 | 804 | 293 | 0.0064 | 0.2864 | 12.6084 | 2.0164 | 40.876 | 14.1994 | 6.595 | 2.9338 | 21250 | 0.2931 | 0.1291 | 0.2817 | 0.2818 | 16599 | 16599 | 14395 | 12191 | 9987 | |
|
| 2.3464 | 2.99 | 436 | 9.2407 | 0.7935 | 7251 | 2326 | 969 | 400 | 0.0073 | 0.3114 | 13.2296 | 1.9138 | 42.0129 | 15.45 | 7.5403 | 3.7569 | 21250 | 0.3162 | 0.1456 | 0.3031 | 0.3031 | 17259 | 17259 | 15055 | 12851 | 10647 | |
|
| 2.1679 | 4.0 | 582 | 9.6363 | 0.7795 | 7382 | 2393 | 1006 | 434 | 0.0109 | 0.3226 | 13.1207 | 1.8524 | 43.3903 | 16.1591 | 7.981 | 4.1727 | 21250 | 0.3272 | 0.1504 | 0.3147 | 0.3149 | 17013 | 17013 | 14809 | 12605 | 10401 | |
|
| 2.0454 | 5.0 | 728 | 10.3812 | 0.7665 | 7581 | 2555 | 1111 | 482 | 0.0132 | 0.3357 | 12.9782 | 1.7997 | 45.1599 | 17.5204 | 8.9749 | 4.7371 | 21250 | 0.3401 | 0.1606 | 0.3278 | 0.3279 | 16787 | 16787 | 14583 | 12379 | 10175 | |
|
| 1.9502 | 5.99 | 873 | 10.7668 | 0.7992 | 7759 | 2618 | 1162 | 511 | 0.0127 | 0.3406 | 13.4841 | 1.7696 | 44.6973 | 17.2748 | 8.9723 | 4.7548 | 21250 | 0.3452 | 0.1631 | 0.3321 | 0.3319 | 17359 | 17359 | 15155 | 12951 | 10747 | |
|
| 1.8414 | 7.0 | 1019 | 11.3408 | 0.7721 | 7791 | 2693 | 1236 | 570 | 0.015 | 0.347 | 13.0563 | 1.7472 | 46.147 | 18.3459 | 9.9078 | 5.5496 | 21250 | 0.3513 | 0.1679 | 0.3391 | 0.3391 | 16883 | 16883 | 14679 | 12475 | 10271 | |
|
| 1.7614 | 8.0 | 1165 | 11.8447 | 0.8198 | 8024 | 2799 | 1296 | 610 | 0.0145 | 0.352 | 13.515 | 1.7203 | 45.2643 | 18.0313 | 9.7305 | 5.4881 | 21250 | 0.3565 | 0.1711 | 0.3422 | 0.3423 | 17727 | 17727 | 15523 | 13319 | 11115 | |
|
| 1.6997 | 9.0 | 1310 | 11.9689 | 0.8027 | 8046 | 2835 | 1314 | 615 | 0.0168 | 0.3568 | 13.4306 | 1.7167 | 46.183 | 18.6293 | 10.0968 | 5.6892 | 21250 | 0.3613 | 0.1746 | 0.3466 | 0.3466 | 17422 | 17422 | 15218 | 13014 | 10810 | |
|
| 1.6159 | 10.0 | 1456 | 12.5678 | 0.8182 | 8087 | 2928 | 1395 | 681 | 0.0181 | 0.3564 | 13.5268 | 1.6892 | 45.6944 | 18.8976 | 10.4966 | 6.1429 | 21250 | 0.3612 | 0.1795 | 0.3485 | 0.3482 | 17698 | 17698 | 15494 | 13290 | 11086 | |
|
| 1.5681 | 10.99 | 1601 | 12.497 | 0.813 | 8154 | 2933 | 1383 | 664 | 0.0168 | 0.3605 | 13.6044 | 1.6923 | 46.3164 | 19.0442 | 10.4797 | 6.0402 | 21250 | 0.3654 | 0.1789 | 0.3506 | 0.3505 | 17605 | 17605 | 15401 | 13197 | 10993 | |
|
| 1.4987 | 12.0 | 1747 | 12.8959 | 0.8169 | 8295 | 3011 | 1432 | 697 | 0.0181 | 0.3675 | 13.6134 | 1.6825 | 46.928 | 19.461 | 10.7929 | 6.2997 | 21250 | 0.3734 | 0.1846 | 0.3576 | 0.3577 | 17676 | 17676 | 15472 | 13268 | 11064 | |
|
| 1.4461 | 13.0 | 1893 | 12.8688 | 0.8139 | 8246 | 3005 | 1424 | 700 | 0.0191 | 0.3658 | 13.5812 | 1.6784 | 46.7964 | 19.4915 | 10.7773 | 6.3584 | 21250 | 0.3725 | 0.1857 | 0.358 | 0.3576 | 17621 | 17621 | 15417 | 13213 | 11009 | |
|
| 1.4002 | 13.99 | 2038 | 13.4526 | 0.8329 | 8457 | 3130 | 1504 | 745 | 0.02 | 0.3727 | 13.9179 | 1.6725 | 47.0749 | 19.8591 | 11.0939 | 6.5621 | 21250 | 0.3797 | 0.1915 | 0.3637 | 0.3634 | 17965 | 17965 | 15761 | 13557 | 11353 | |
|
| 1.3391 | 15.0 | 2184 | 13.211 | 0.8283 | 8443 | 3091 | 1468 | 719 | 0.0204 | 0.3737 | 13.9133 | 1.6783 | 47.2177 | 19.7168 | 10.8959 | 6.3803 | 21250 | 0.3804 | 0.1901 | 0.3634 | 0.363 | 17881 | 17881 | 15677 | 13473 | 11269 | |
|
| 1.2921 | 16.0 | 2330 | 13.4907 | 0.8373 | 8457 | 3147 | 1511 | 747 | 0.0195 | 0.3716 | 13.9882 | 1.6738 | 46.8662 | 19.8662 | 11.0801 | 6.5337 | 21250 | 0.3782 | 0.1902 | 0.3624 | 0.3624 | 18045 | 18045 | 15841 | 13637 | 11433 | |
|
| 1.2572 | 17.0 | 2475 | 13.8581 | 0.8267 | 8473 | 3219 | 1561 | 783 | 0.02 | 0.3753 | 13.7618 | 1.6770 | 47.4598 | 20.57 | 11.6103 | 6.9656 | 21250 | 0.3821 | 0.1948 | 0.3669 | 0.3665 | 17853 | 17853 | 15649 | 13445 | 11241 | |
|
| 1.199 | 18.0 | 2621 | 13.7496 | 0.8326 | 8484 | 3190 | 1551 | 771 | 0.0186 | 0.3745 | 13.8798 | 1.6934 | 47.2409 | 20.2475 | 11.4456 | 6.7947 | 21250 | 0.3812 | 0.1922 | 0.3657 | 0.3658 | 17959 | 17959 | 15755 | 13551 | 11347 | |
|
| 1.1668 | 18.99 | 2766 | 13.7379 | 0.8395 | 8504 | 3179 | 1541 | 776 | 0.0204 | 0.376 | 13.9256 | 1.6926 | 47.0198 | 20.0164 | 11.2663 | 6.7631 | 21250 | 0.3828 | 0.1939 | 0.3665 | 0.3665 | 18086 | 18086 | 15882 | 13678 | 11474 | |
|
| 1.1164 | 19.91 | 2900 | 14.1906 | 0.8529 | 8625 | 3250 | 1609 | 820 | 0.0204 | 0.3803 | 14.069 | 1.7026 | 47.0463 | 20.15 | 11.5548 | 6.996 | 21250 | 0.3874 | 0.1964 | 0.3716 | 0.3715 | 18333 | 18333 | 16129 | 13925 | 11721 | |
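
The BLEU column can be reproduced from the brevity penalty and the four n-gram precisions in each row, using the standard geometric mean. A small sketch (illustrative helper, not part of the training code) checked against the epoch-1 row:

```python
import math

def bleu_from_parts(bp: float, precisions: list[float]) -> float:
    # BLEU = brevity penalty × geometric mean of the n-gram precisions
    # (given here as percentages, n = 1..4), rescaled to a percentage.
    log_mean = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return 100.0 * bp * math.exp(log_mean)

# Epoch-1 row: BP = 0.7156, Precisions 1-4 as listed above.
score = bleu_from_parts(0.7156, [38.7954, 12.1665, 5.2458, 1.9227])  # ≈ 5.9441
```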
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.34.1 |
|
- Pytorch 2.1.0 |
|
- Datasets 2.12.0 |
|
- Tokenizers 0.14.1 |
|
|