rstodden committed on
Commit
f116882
·
verified ·
1 Parent(s): 0889e10

Update README.md

Files changed (1)
  1. README.md +201 -0
README.md CHANGED
@@ -0,0 +1,201 @@
1
+ ---
2
+ datasets:
3
+ - DEplain/DEplain-APA-sent
4
+ language:
5
+ - de
6
+ metrics:
7
+ - sari
8
+ - bleu
9
+ - bertscore
10
+ library_name: transformers
11
+ pipeline_tag: text2text-generation
12
+ tags:
13
+ - text simplification
14
+ - plain language
15
+ - easy-to-read language
16
+ - sentence simplification
17
+ ---
18
+
19
+ # Model Card for mT5-DEplain-APA
20
+
21
+ <!-- Provide a quick summary of what the model is/does. -->
22
+
23
+ This model aims to simplify German text into plain German. It was developed as part of the experiments in Stodden (2024, to appear), "Reproduction & Benchmarking of German Text Simplification Systems", in Proceedings of the 1st Workshop on Evaluating Text Difficulty in a Multilingual Context (DeTermIt!), Turin, Italy.
24
+
25
+ ## Model Details
26
+
27
+ ### Model Description
28
+
29
+ <!-- Provide a longer summary of what this model is. -->
30
+
31
+
32
+
33
+ - **Developed by:** Regina Stodden
34
+ - **Model type:** Text2Text Generation
35
+ - **Language(s) (NLP):** German, Plain German
36
+ - **License:** [More Information Needed]
37
+ - **Finetuned from model [optional]:** [https://huggingface.co/google/mt5-base](https://huggingface.co/google/mt5-base)
38
+
39
+ ### Model Sources [optional]
40
+
41
+ <!-- Provide the basic links for the model. -->
42
+
43
+ - **Repository:** [https://huggingface.co/DEplain/mt5-DEplain-APA](https://huggingface.co/DEplain/mt5-DEplain-APA)
44
+ - **Paper [optional]:** Stodden (2024, to appear). "Reproduction & Benchmarking of German Text Simplification Systems" In Proceedings of the 1st Workshop on Evaluating Text Difficulty in a Multilingual Context (DeTermIt!), Turin, Italy.
45
+ - **Demo [optional]:** [More Information Needed]
46
+
47
+ ## Uses
48
+
49
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
50
+
51
+ ### Direct Use & Downstream Use
52
+
53
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
54
+
55
+ mT5-DEplain-APA is intended for simplifying German sentences for non-native German speakers.
56
+ mT5-DEplain-APA is a version of mT5-base fine-tuned on [DEplain-APA-sent](https://huggingface.co/datasets/DEplain/DEplain-APA-sent), a German text simplification corpus from the news domain. The intended use is sentence simplification of German, where the source language is standard German and the target language is plain German.
57
+
58
+
59
+ ### Out-of-Scope Use
60
+
61
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
62
+ mT5-DEplain-APA is fine-tuned only on complex-simple sentence pairs from the news domain that target German learners (CEFR level A2). We therefore assume that the model will not work well for tasks other than text simplification, languages other than German, domains other than news, or target groups other than non-native speakers.
63
+
64
+
65
+ ## Bias, Risks, and Limitations
66
+
67
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
68
+
69
+ The simplifications generated by the model may contain errors. They should therefore not be shown to a potentially vulnerable target group until their quality has been manually verified and any errors have been fixed.
70
+ The text simplification system could be provided to human translators, who might improve its output and thereby reduce the time and effort needed to simplify a text manually.
71
+
72
+
73
+ ### Recommendations
74
+
75
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
76
+
77
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
78
+
79
+ ## How to Get Started with the Model
80
+
81
+ Use the code below to get started with the model. To reproduce our results, set the maximum target length to 128.
82
+
83
+ ```
84
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
85
+
86
+ tokenizer = AutoTokenizer.from_pretrained("DEplain/mt5-DEplain-APA")
87
+ model = AutoModelForSeq2SeqLM.from_pretrained("DEplain/mt5-DEplain-APA")
88
+
89
+ prefix = "Simplify to plain German: "
90
+ sent = "Ganz vorne im Gespann zieht er die anderen 13 Hunde mit, führt sie über vereiste Seen oder steile Berge und findet den Weg, wenn ihn selbst der Musher nicht mehr kennt."
91
+ # EN: "At the front of the team, he pulls the other 13 dogs along, leads them over icy lakes or steep mountains and finds the way when even the musher no longer knows it."
92
+
93
+ inputs = tokenizer([prefix+sent], return_tensors="pt")
94
+ outputs = model.generate(**inputs, max_length=128)
95
+ print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
96
+
97
+ # expected output: "Ganz vorne im Gespann zieht er die anderen Hunde mit. Er findet den Weg, wenn ihn selbst der Musher nicht mehr kennt."
98
+
99
+ ```
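To simplify several sentences at once, the snippet above can be wrapped in a small helper. The following is a minimal sketch and not part of the original experiments: the function name `simplify_batch` is hypothetical, and padding and truncating the inputs to 128 tokens is an assumption of this sketch. `tokenizer` and `model` are the objects loaded in the snippet above.

```python
from typing import List

PREFIX = "Simplify to plain German: "  # same prefix as in the snippet above


def simplify_batch(sentences: List[str], tokenizer, model,
                   prefix: str = PREFIX, max_length: int = 128) -> List[str]:
    """Simplify a list of sentences with an already loaded tokenizer and model."""
    # Prepend the task prefix to every source sentence.
    texts = [prefix + s for s in sentences]
    # Padding lets sentences of different lengths share one tensor batch.
    # Truncating the inputs to max_length is an assumption of this sketch.
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_length)
    # Limit the generated target to max_length tokens (128 by default).
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `simplify_batch([sent], tokenizer, model)` returns a list with one simplified string per input sentence.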
100
+
101
+ ## Training Details
102
+
103
+ ### Training Data
104
+
105
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
106
+
107
+ The model is fine-tuned on DEplain-APA. DEplain-APA [(Stodden et al., 2023)](https://aclanthology.org/2023.acl-long.908/) is a dataset for the training and evaluation of sentence simplification in German. All texts of this dataset are provided by the Austrian Press Agency. The simple-complex sentence pairs are manually aligned.
108
+
109
+ ### Training Procedure
110
+
111
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
112
+
113
+
114
+
115
+
116
+ #### Training Hyperparameters
117
+
118
+ - **Training regime:** fp32
119
+ - **epochs**: 10
120
+ - **model**: mt5-base
121
+ - **prefix**: "simplify to plain German: "
122
+ - **max length**: 128:128
123
+ - **learning rate**: 0.001
124
+ - **batch size**: 4
125
+ - **metric**: SARI
126
+ - **optimizer**: adafactor
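The hyperparameters listed above can be collected in one configuration object. The sketch below is illustrative only and is not the original training script: `FineTuneConfig` and `to_trainer_kwargs` are hypothetical names, reading "128:128" as source:target length is an assumption, treating the batch size as per-device is an assumption, and so is interpreting "metric: SARI" as the model-selection metric. The keys returned by `to_trainer_kwargs` are argument names of `Seq2SeqTrainingArguments` in transformers, in case the values are passed on to that class.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters from the list above (hypothetical container)."""
    base_model: str = "google/mt5-base"
    prefix: str = "simplify to plain German: "
    max_source_length: int = 128   # assumption: "128:128" means source:target
    max_target_length: int = 128
    epochs: int = 10
    learning_rate: float = 1e-3
    batch_size: int = 4            # assumption: per-device batch size
    optimizer: str = "adafactor"
    selection_metric: str = "sari"  # assumption: used for model selection
    fp16: bool = False             # fp32 training regime


def to_trainer_kwargs(cfg: FineTuneConfig) -> Dict[str, Any]:
    """Map the config to keyword arguments named as in Seq2SeqTrainingArguments."""
    return {
        "num_train_epochs": cfg.epochs,
        "learning_rate": cfg.learning_rate,
        "per_device_train_batch_size": cfg.batch_size,
        "optim": cfg.optimizer,
        "fp16": cfg.fp16,
        "predict_with_generate": True,
        "generation_max_length": cfg.max_target_length,
        "metric_for_best_model": cfg.selection_metric,
    }
```

With transformers installed, the dictionary could be unpacked as `Seq2SeqTrainingArguments(output_dir=..., **to_trainer_kwargs(FineTuneConfig()))`. Selecting the best model by SARI would additionally require a `compute_metrics` function that reports a `sari` value.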
127
+
128
+ ## Evaluation
129
+
130
+ <!-- This section describes the evaluation protocols and provides the results. -->
131
+
132
+ ### Testing Data, Factors & Metrics
133
+
134
+ #### Testing Data
135
+
136
+ <!-- This should link to a Dataset Card if possible. -->
137
+
138
+ We mainly recommend evaluating mT5-DEplain-APA on [DEplain-APA-sent](https://huggingface.co/datasets/DEplain/DEplain-APA-sent). However, in our paper, we also evaluate on further test sets, which can be found here: [https://github.com/rstodden/easse-de](https://github.com/rstodden/easse-de/tree/master/easse/resources/data/test_sets/sentence_level).
139
+
140
+ #### Factors
141
+
142
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
143
+
144
+ [More Information Needed]
145
+
146
+ #### Metrics
147
+
148
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
149
+ All models are automatically evaluated against one reference and with the same evaluation metrics, i.e., SARI (Xu et al., 2016), BLEU (Papineni et al., 2002), BERTScore precision (BS_P; Zhang et al., 2020), and the German adaptation of the Flesch Reading Ease (FRE; Amstad, 1978).
150
+ Following the recommendation of Alva-Manchego et al. (2021), we use BS_P as the main evaluation metric; if the score is high, we verify it with the other metrics, i.e., SARI, BLEU, and FRE.
151
+ In addition, as recommended by Tanprasert and Kauchak (2021) and Alva-Manchego et al. (2019), we also report linguistic features to get more insights into the system-generated simplifications, i.e., compression ratio and sentence splits.
152
+ To measure the metrics and features, we use the EASSE-DE evaluation framework (Stodden, 2024), a multilingual adaptation of the EASSE evaluation framework.
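To make the linguistic features and FRE more concrete, the following is a rough sketch of how they could be approximated. It is not the EASSE-DE implementation: the regex-based sentence splitter and the vowel-group syllable counter are simplifying assumptions, and all function names are hypothetical. The compression ratio is taken here as the character length of the output divided by that of the source, the sentence-split ratio as the number of output sentences divided by the number of source sentences, and FRE follows Amstad's formula, 180 - ASL - 58.5 * ASW, where ASL is the average sentence length in words and ASW the average number of syllables per word.

```python
import re
from typing import List

# Naive sentence boundary: whitespace that follows ., ! or ? (assumption).
_SENT_END = re.compile(r"(?<=[.!?])\s+")
# Naive syllable proxy: one syllable per group of adjacent vowels (assumption).
_VOWEL_GROUP = re.compile(r"[aeiouyäöü]+", re.IGNORECASE)


def split_sentences(text: str) -> List[str]:
    """Split text into sentences with a simple punctuation-based rule."""
    return [s for s in _SENT_END.split(text.strip()) if s]


def compression_ratio(source: str, output: str) -> float:
    """Character length of the output relative to the source."""
    return len(output) / len(source)


def sentence_split_ratio(source: str, output: str) -> float:
    """Number of output sentences relative to the number of source sentences."""
    return len(split_sentences(output)) / len(split_sentences(source))


def flesch_reading_ease_de(text: str) -> float:
    """Amstad's German Flesch Reading Ease: 180 - ASL - 58.5 * ASW."""
    sentences = split_sentences(text)
    # Words are runs of letters; digits and punctuation are ignored.
    words = re.findall(r"[^\W\d_]+", text)
    syllables = sum(max(1, len(_VOWEL_GROUP.findall(w))) for w in words)
    asl = len(words) / len(sentences)   # average sentence length in words
    asw = syllables / len(words)        # average syllables per word
    return 180.0 - asl - 58.5 * asw
```

Applied to the example from "How to Get Started with the Model", the expected output has two sentences for one source sentence, so `sentence_split_ratio` is 2.0, and its compression ratio is below 1 because the output is shorter than the source.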
153
+
154
+
155
+
156
+ ### Results
157
+ Results of mT5-DEplain-APA and related models evaluated on DEplain-APA. For more results on other test data, please have a look at our paper.
158
+
159
+ |                       | BLEU   | SARI   | BS_P | FRE   | Compr. ratio | Sent. splits |
160
+ |-----------------------|--------|--------|------|-------|--------------|--------------|
161
+ | hda_LS | 22.3 | 26.06 | 0.55 | 64.60 | 1.00 | 1.00 |
162
+ | sockeye-APA-LHA | 11.84 | 40.16 | 0.37 | 63.70 | 0.94 | 0.97 |
163
+ | sockeye-DEplain-APA | 19.58 | 44.14 | 0.53 | 71.45 | 0.94 | 1.09 |
164
+ | mBART-DEplain-APA     | 28.49  | 38.72  | 0.6  | 65.30 | 0.99         | 1.07         |
165
+ | mBART-DEplain-APA+web | 28.03 | 33.81 | 0.64 | 65.20 | 0.98 | 1.05 |
166
+ | **mT5-DEplain-APA** | 22.32 | 39.41 | 0.61 | 63.20 | 0.87 | 1.04 |
167
+ | mt5-SGC | 8.12 | 37.92 | 0.48 | 71.65 | 0.74 | 1.00 |
168
+ | BLOOM-zero | 16.14 | 35.43 | 0.53 | 65.10 | 0.87 | 1.14 |
169
+ | BLOOM-10-random | 17.97 | 35.93 | 0.57 | 65.50 | 0.91 | 1.00 |
170
+ | BLOOM-10-similarity | 20.97 | 41.27 | 0.57 | 65.70 | 0.93 | 1.07 |
171
+ | custom-decoder-ats | 1.24 | 36.42 | 0.16 | 53.00 | 7.41 | 5.07 |
172
+ | Identity baseline | 26.89 | 15.25 | 0.63 | 58.75 | 1.00 | 1.00 |
173
+ | Reference baseline | 100.00 | 100.00 | 1.00 | 65.80 | 1.03 | 1.20 |
174
+ | Truncate baseline | 16.11 | 27.20 | 0.55 | 66.10 | 0.80 | 1.01 |
175
+
176
+
177
+
178
+ ## Citation [optional]
179
+
180
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
181
+
182
+ **BibTeX:**
183
+
184
+ ```
185
+ @inproceedings{stodden-2024-reproduction,
186
+ author = {Regina Stodden},
187
+ title = {{Reproduction \& Benchmarking of German Text Simplification Systems}},
188
+ booktitle = "Proceedings of the 1st Workshop on Evaluating Text Difficulty in a Multilingual Context (DeTermIt!)",
189
+ year = {2024 (to appear)},
190
+ address = "Turin, Italy"
191
+ }
192
+ ```
193
+
194
+
195
+ **APA:**
196
+
197
+ Regina Stodden. 2024 (to appear). "Reproduction & Benchmarking of German Text Simplification Systems". In Proceedings of the 1st Workshop on Evaluating Text Difficulty in a Multilingual Context (DeTermIt!), Turin, Italy.
198
+
199
+ ## Model Card Contact
200
+
201
+ If you have any questions, please contact Regina Stodden ([email protected]).