PM-AI committed on
Commit 52e8c7c · 1 Parent(s): b10d489

Update README.md

Files changed (1):
  1. README.md +8 -15

README.md CHANGED
@@ -2,8 +2,6 @@
 language:
 - de
 - en
-datasets:
-- todo
 pipeline_tag: sentence-similarity
 tags:
 - semantic textual similarity
@@ -14,11 +12,6 @@ tags:
 - sentence-transformer
 - feature-extraction
 - transformers
-task_categories:
-- sentence-similarity
-- feature-extraction
-- text-retrieval
-- other
 ---
 
 # Model card for PM-AI/sts_paraphrase_xlm-roberta-base_de-en
@@ -44,18 +37,18 @@ In terms of content, the samples are based on rather simple sentences.
 
 When the TSystems model was published, only the STSb dataset was used for STS training.
 Therefore it is included in our model, but expanded to include SICK and Priya22 semantic textual relatedness:
-- SICK was partly used in STSb, but our independent translation (XYZ) using [DeepL](https://www.deepl.com/) leads to slightly different phrases. This approach allows more examples to be included in the training.
 - The Priya22 semantic textual relatedness dataset, published in 2022, was also translated into German via DeepL and added to the training data. Since it does not have a train-test split, one was created independently at a ratio of 80:20.
 The rating scale of all datasets has been adjusted to STSb with a value range from 0 to 5.
 All training and test data (STSb, SICK, Priya22) were checked for duplicates within and across the datasets, and duplicates were removed.
 Because the test data is prioritized, entries duplicated between test and train are removed from the train split only.
-The final used datasets can be viewed here: XYZ.
 
 ### Training
 Before fine-tuning for STS, we made the English paraphrasing model [paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1) usable for German by applying **[Knowledge Distillation](https://arxiv.org/abs/2004.09813)** (_Teacher-Student_ approach).
 The TSystems model used version 1, which is based on 7 different datasets and contains around 24.6 million samples.
 We are using version 2 with 12 datasets and about 83.3 million examples.
-Details for this process here: XYZ
 
 For fine-tuning we are using SBERT's [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py) training script.
 One change was made to this training script: when a sentence pair consists of identical utterances, the score is set to 5.0 (maximum).
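The dataset preparation described above (rescaling every gold score onto STSb's 0-to-5 range, and removing test-train duplicates from the train split only) can be sketched as follows. This is a minimal illustration, not the actual pipeline; `rescale_to_stsb` and `dedup_train` are hypothetical helper names.

```python
def rescale_to_stsb(score: float, src_min: float, src_max: float) -> float:
    """Linearly map a relatedness score from its native range onto STSb's 0-5 scale."""
    return 5.0 * (score - src_min) / (src_max - src_min)

def dedup_train(train_pairs, test_pairs):
    """Drop train entries that also occur in test; the test split is kept intact,
    because the test data is prioritized."""
    test_set = set(test_pairs)
    return [pair for pair in train_pairs if pair not in test_set]
```

For example, a Priya22-style score of 4 on a 0-4 scale maps to 5.0 on the STSb scale, and any sentence pair shared between splits survives only in the test split.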
@@ -63,10 +56,10 @@ It makes no sense to say identical sentences have a score of 4.8 or 4.9.
 
 #### Parameterization of training
 - **Script:** [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py)
-- **Datasets:** todo
 - **GPU:** NVIDIA A40 (Driver Version: 515.48.07; CUDA Version: 11.7)
 - **Batch Size:** 32
-- **Base Model:** todo
 - **Loss Function:** Cosine Similarity
 - **Learning Rate:** 2e-5
 - **Epochs:** 3
@@ -88,7 +81,7 @@ The first table shows the evaluation results for **cross-lingual (German-English
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8672 <br /> 🏆 | 0.8639 <br /> 🏆 | 0.8354 <br /> 🏆 | 0.8711 <br /> 🏆
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8525 | 0.7642 | 0.7998 | 0.8216
-[todo (ours, no fine-tuning)]() | 0.8225 | 0.7579 | 0.8255 | 0.8109
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8310 | 0.7529 | 0.8184 | 0.8102
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8194 | 0.7703 | 0.7566 | 0.7998
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.7985 | 0.7217 | 0.7975 | 0.7838
@@ -114,7 +107,7 @@ The second table shows the evaluation results for **German only** based on _Spea
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8547 | 0.8047 | 0.8068 | 0.8327
 [Sahajtomar/German-semantic](https://huggingface.co/Sahajtomar/German-semantic) | 0.8485 | 0.7915 | 0.8139 | 0.8280
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8360 | 0.7941 | 0.8237 | 0.8178
-[todo (ours, no fine-tuning)]() | 0.8297 | 0.7930 | 0.8341 | 0.8170
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8190 | 0.8027 | 0.7674 | 0.8072
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 [sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1](https://huggingface.co/sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
@@ -136,7 +129,7 @@ And last but not least our third table which shows the evaluation results for **
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8768 <br /> 🏆 | 0.8705 <br /> 🏆 | 0.8402 | 0.8748 <br /> 🏆
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8682 | 0.8065 | 0.8430 | 0.8378
-[todo (ours, no fine-tuning)]() | 0.8597 | 0.8105 | 0.8399 | 0.8363
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8660 | 0.7897 | 0.8097 | 0.8308
 [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 0.8441 | 0.8059 | 0.8175 | 0.8300
 [sentence-transformers/sentence-t5-base](https://huggingface.co/sentence-transformers/sentence-t5-base) | 0.8551 | 0.8063 | 0.8434 | 0.8235
 
 language:
 - de
 - en
 pipeline_tag: sentence-similarity
 tags:
 - semantic textual similarity
 
 - sentence-transformer
 - feature-extraction
 - transformers
 ---
 
 # Model card for PM-AI/sts_paraphrase_xlm-roberta-base_de-en
 
 
 When the TSystems model was published, only the STSb dataset was used for STS training.
 Therefore it is included in our model, but expanded to include SICK and Priya22 semantic textual relatedness:
+- SICK was partly used in STSb, but our custom translation using [DeepL](https://www.deepl.com/) leads to slightly different phrases. This approach allows more examples to be included in the training.
 - The Priya22 semantic textual relatedness dataset, published in 2022, was also translated into German via DeepL and added to the training data. Since it does not have a train-test split, one was created independently at a ratio of 80:20.
 The rating scale of all datasets has been adjusted to STSb with a value range from 0 to 5.
 All training and test data (STSb, SICK, Priya22) were checked for duplicates within and across the datasets, and duplicates were removed.
 Because the test data is prioritized, entries duplicated between test and train are removed from the train split only.
+The final used datasets can be viewed here: [datasets_sts_paraphrase_xlm-roberta-base_de-en](https://gitlab.com/sense.ai.tion-public/datasets_sts_paraphrase_xlm-roberta-base_de-en)
 
 ### Training
 Before fine-tuning for STS, we made the English paraphrasing model [paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1) usable for German by applying **[Knowledge Distillation](https://arxiv.org/abs/2004.09813)** (_Teacher-Student_ approach).
 The TSystems model used version 1, which is based on 7 different datasets and contains around 24.6 million samples.
 We are using version 2 with 12 datasets and about 83.3 million examples.
+Details for this process here: [PM-AI/paraphrase-distilroberta-base-v2_de-en](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en)
 
 For fine-tuning we are using SBERT's [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py) training script.
 One change was made to this training script: when a sentence pair consists of identical utterances, the score is set to 5.0 (maximum).
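The scoring change just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: `adjust_sts_score` and `to_label` are hypothetical helper names, and the [0, 1] normalization reflects the label range sentence-transformers' CosineSimilarityLoss regresses against.

```python
def adjust_sts_score(sent1: str, sent2: str, score: float, max_score: float = 5.0) -> float:
    """Gold similarity on the 0-5 STSb scale.

    Identical utterances are forced to the maximum score, mirroring the
    modification to the training script: it makes no sense to rate an
    identical pair 4.8 or 4.9.
    """
    if sent1.strip() == sent2.strip():
        return max_score
    return score

def to_label(score: float, max_score: float = 5.0) -> float:
    """Normalize a 0-5 gold score to a [0, 1] training label."""
    return score / max_score
```

So a pair like ("Das ist ein Test.", "Das ist ein Test.") annotated 4.8 is trained with label 1.0 rather than 0.96.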
 
 
 #### Parameterization of training
 - **Script:** [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py)
+- **Datasets:** [datasets_sts_paraphrase_xlm-roberta-base_de-en](https://gitlab.com/sense.ai.tion-public/datasets_sts_paraphrase_xlm-roberta-base_de-en)
 - **GPU:** NVIDIA A40 (Driver Version: 515.48.07; CUDA Version: 11.7)
 - **Batch Size:** 32
+- **Base Model:** [PM-AI/paraphrase-distilroberta-base-v2_de-en](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en)
 - **Loss Function:** Cosine Similarity
 - **Learning Rate:** 2e-5
 - **Epochs:** 3
 
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8672 <br /> 🏆 | 0.8639 <br /> 🏆 | 0.8354 <br /> 🏆 | 0.8711 <br /> 🏆
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8525 | 0.7642 | 0.7998 | 0.8216
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8225 | 0.7579 | 0.8255 | 0.8109
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8310 | 0.7529 | 0.8184 | 0.8102
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8194 | 0.7703 | 0.7566 | 0.7998
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.7985 | 0.7217 | 0.7975 | 0.7838
 
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8547 | 0.8047 | 0.8068 | 0.8327
 [Sahajtomar/German-semantic](https://huggingface.co/Sahajtomar/German-semantic) | 0.8485 | 0.7915 | 0.8139 | 0.8280
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8360 | 0.7941 | 0.8237 | 0.8178
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8297 | 0.7930 | 0.8341 | 0.8170
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8190 | 0.8027 | 0.7674 | 0.8072
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 [sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1](https://huggingface.co/sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8768 <br /> 🏆 | 0.8705 <br /> 🏆 | 0.8402 | 0.8748 <br /> 🏆
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8682 | 0.8065 | 0.8430 | 0.8378
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8597 | 0.8105 | 0.8399 | 0.8363
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8660 | 0.7897 | 0.8097 | 0.8308
 [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 0.8441 | 0.8059 | 0.8175 | 0.8300
 [sentence-transformers/sentence-t5-base](https://huggingface.co/sentence-transformers/sentence-t5-base) | 0.8551 | 0.8063 | 0.8434 | 0.8235
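The scores in these tables are Spearman rank correlations between the models' predicted cosine similarities and the gold similarity scores. Ignoring tied ranks (which a production evaluation would average), the statistic can be sketched in pure Python:

```python
def spearman(xs, ys):
    """Spearman rank correlation between two equal-length score lists
    (no tie handling): Pearson correlation computed on the ranks."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0.0] * len(values)
        for rank, idx in enumerate(order):
            result[idx] = float(rank)
        return result

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (std_x * std_y)
```

A model whose predicted similarities order sentence pairs exactly like the gold scores reaches 1.0; the leaderboard values around 0.87 indicate a strong but imperfect rank agreement.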