PM-AI committed on
Commit 52e8c7c · 1 Parent(s): b10d489

Update README.md

Files changed (1):
  1. README.md +8 -15

README.md CHANGED
@@ -2,8 +2,6 @@
 language:
 - de
 - en
-datasets:
-- todo
 pipeline_tag: sentence-similarity
 tags:
 - semantic textual similarity
@@ -14,11 +12,6 @@ tags:
 - sentence-transformer
 - feature-extraction
 - transformers
-task_categories:
-- sentence-similarity
-- feature-extraction
-- text-retrieval
-- other
 ---
 
 # Model card for PM-AI/sts_paraphrase_xlm-roberta-base_de-en
@@ -44,18 +37,18 @@ In terms of content, the samples are based on rather simple sentences.
 
 When the TSystems model was published, only the STSb dataset was used for STS training.
 Therefore it is included in our model, but expanded to include SICK and Priya22 semantic textual relatedness:
-- SICK was partly used in STSb, but our independent translation (XYZ) using [DeepL](https://www.deepl.com/) leads to slightly different phrases. This approach allows more examples to be included in the training.
 - The Priya22 semantic textual relatedness dataset, published in 2022, was also translated into German via DeepL and added to the training data. Since it does not have a train-test split, one was created independently at a ratio of 80:20.
 The rating scale of all datasets has been adjusted to STSb with a value range from 0 to 5.
 All training and test data (STSb, SICK, Priya22) were checked for duplicates within and across the datasets, and duplicates were removed.
 Because the test data is prioritized, entries duplicated between test and train are removed from the train split only.
-The final used datasets can be viewed here: XYZ.
 
 ### Training
 Before fine-tuning for STS, we made the English paraphrasing model [paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1) usable for German by applying **[Knowledge Distillation](https://arxiv.org/abs/2004.09813)** (_Teacher-Student_ approach).
 The TSystems model used version 1, which is based on 7 different datasets and contains around 24.6 million samples.
 We are using version 2 with 12 datasets and about 83.3 million examples.
-Details for this process here: XYZ
 
 For fine-tuning we are using SBERT's [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py) training script.
 One change was made to this training script: when a sentence pair consists of identical utterances, the score is set to 5.0 (maximum).
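The dataset preparation described above (rescaling every gold score onto STSb's 0-to-5 range, and removing test-train duplicates from the train split only) can be sketched as follows. This is a minimal illustration, not the actual pipeline; `rescale_to_stsb` and `dedup_train` are hypothetical helper names.

```python
def rescale_to_stsb(score: float, src_min: float, src_max: float) -> float:
    """Linearly map a relatedness score from its native range onto STSb's 0-5 scale."""
    return 5.0 * (score - src_min) / (src_max - src_min)

def dedup_train(train_pairs, test_pairs):
    """Drop train entries that also occur in test; the test split is kept intact,
    because the test data is prioritized."""
    test_set = set(test_pairs)
    return [pair for pair in train_pairs if pair not in test_set]
```

For example, a Priya22-style score of 4 on a 0-4 scale maps to 5.0 on the STSb scale, and any sentence pair shared between splits survives only in the test split.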
@@ -63,10 +56,10 @@ It makes no sense to say identical sentences have a score of 4.8 or 4.9.
 
 #### Parameterization of training
 - **Script:** [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py)
-- **Datasets:** todo
 - **GPU:** NVIDIA A40 (Driver Version: 515.48.07; CUDA Version: 11.7)
 - **Batch Size:** 32
-- **Base Model:** todo
 - **Loss Function:** Cosine Similarity
 - **Learning Rate:** 2e-5
 - **Epochs:** 3
@@ -88,7 +81,7 @@ The first table shows the evaluation results for **cross-lingual (German-English
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8672 <br /> 🏆 | 0.8639 <br /> 🏆 | 0.8354 <br /> 🏆 | 0.8711 <br /> 🏆
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8525 | 0.7642 | 0.7998 | 0.8216
-[todo (ours, no fine-tuning)]() | 0.8225 | 0.7579 | 0.8255 | 0.8109
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8310 | 0.7529 | 0.8184 | 0.8102
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8194 | 0.7703 | 0.7566 | 0.7998
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.7985 | 0.7217 | 0.7975 | 0.7838
@@ -114,7 +107,7 @@ The second table shows the evaluation results for **German only** based on _Spea
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8547 | 0.8047 | 0.8068 | 0.8327
 [Sahajtomar/German-semantic](https://huggingface.co/Sahajtomar/German-semantic) | 0.8485 | 0.7915 | 0.8139 | 0.8280
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8360 | 0.7941 | 0.8237 | 0.8178
-[todo (ours, no fine-tuning)]() | 0.8297 | 0.7930 | 0.8341 | 0.8170
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8190 | 0.8027 | 0.7674 | 0.8072
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 [sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1](https://huggingface.co/sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
@@ -136,7 +129,7 @@ And last but not least our third table which shows the evaluation results for **
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8768 <br /> 🏆 | 0.8705 <br /> 🏆 | 0.8402 | 0.8748 <br /> 🏆
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8682 | 0.8065 | 0.8430 | 0.8378
-[todo (ours, no fine-tuning)]() | 0.8597 | 0.8105 | 0.8399 | 0.8363
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8660 | 0.7897 | 0.8097 | 0.8308
 [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 0.8441 | 0.8059 | 0.8175 | 0.8300
 [sentence-transformers/sentence-t5-base](https://huggingface.co/sentence-transformers/sentence-t5-base) | 0.8551 | 0.8063 | 0.8434 | 0.8235
 
 language:
 - de
 - en
 pipeline_tag: sentence-similarity
 tags:
 - semantic textual similarity
 
 - sentence-transformer
 - feature-extraction
 - transformers
 ---
 
 # Model card for PM-AI/sts_paraphrase_xlm-roberta-base_de-en
 
 
 When the TSystems model was published, only the STSb dataset was used for STS training.
 Therefore it is included in our model, but expanded to include SICK and Priya22 semantic textual relatedness:
+- SICK was partly used in STSb, but our custom translation using [DeepL](https://www.deepl.com/) leads to slightly different phrases. This approach allows more examples to be included in the training.
 - The Priya22 semantic textual relatedness dataset, published in 2022, was also translated into German via DeepL and added to the training data. Since it does not have a train-test split, one was created independently at a ratio of 80:20.
 The rating scale of all datasets has been adjusted to STSb with a value range from 0 to 5.
 All training and test data (STSb, SICK, Priya22) were checked for duplicates within and across the datasets, and duplicates were removed.
 Because the test data is prioritized, entries duplicated between test and train are removed from the train split only.
+The final used datasets can be viewed here: [datasets_sts_paraphrase_xlm-roberta-base_de-en](https://gitlab.com/sense.ai.tion-public/datasets_sts_paraphrase_xlm-roberta-base_de-en)
 
 ### Training
 Before fine-tuning for STS, we made the English paraphrasing model [paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1) usable for German by applying **[Knowledge Distillation](https://arxiv.org/abs/2004.09813)** (_Teacher-Student_ approach).
 The TSystems model used version 1, which is based on 7 different datasets and contains around 24.6 million samples.
 We are using version 2 with 12 datasets and about 83.3 million examples.
+Details for this process here: [PM-AI/paraphrase-distilroberta-base-v2_de-en](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en)
 
 For fine-tuning we are using SBERT's [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py) training script.
 One change was made to this training script: when a sentence pair consists of identical utterances, the score is set to 5.0 (maximum).
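The scoring change just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: `adjust_sts_score` and `to_label` are hypothetical helper names, and the [0, 1] normalization reflects the label range sentence-transformers' CosineSimilarityLoss regresses against.

```python
def adjust_sts_score(sent1: str, sent2: str, score: float, max_score: float = 5.0) -> float:
    """Gold similarity on the 0-5 STSb scale.

    Identical utterances are forced to the maximum score, mirroring the
    modification to the training script: it makes no sense to rate an
    identical pair 4.8 or 4.9.
    """
    if sent1.strip() == sent2.strip():
        return max_score
    return score

def to_label(score: float, max_score: float = 5.0) -> float:
    """Normalize a 0-5 gold score to a [0, 1] training label."""
    return score / max_score
```

So a pair like ("Das ist ein Test.", "Das ist ein Test.") annotated 4.8 is trained with label 1.0 rather than 0.96.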
 
 
 #### Parameterization of training
 - **Script:** [training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/b86eec31cf0a102ad786ba1ff31bfeb4998d3ca5/examples/training/sts/training_stsbenchmark_continue_training.py)
+- **Datasets:** [datasets_sts_paraphrase_xlm-roberta-base_de-en](https://gitlab.com/sense.ai.tion-public/datasets_sts_paraphrase_xlm-roberta-base_de-en)
 - **GPU:** NVIDIA A40 (Driver Version: 515.48.07; CUDA Version: 11.7)
 - **Batch Size:** 32
+- **Base Model:** [PM-AI/paraphrase-distilroberta-base-v2_de-en](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en)
 - **Loss Function:** Cosine Similarity
 - **Learning Rate:** 2e-5
 - **Epochs:** 3
 
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8672 <br /> 🏆 | 0.8639 <br /> 🏆 | 0.8354 <br /> 🏆 | 0.8711 <br /> 🏆
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8525 | 0.7642 | 0.7998 | 0.8216
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8225 | 0.7579 | 0.8255 | 0.8109
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8310 | 0.7529 | 0.8184 | 0.8102
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8194 | 0.7703 | 0.7566 | 0.7998
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.7985 | 0.7217 | 0.7975 | 0.7838
 
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8547 | 0.8047 | 0.8068 | 0.8327
 [Sahajtomar/German-semantic](https://huggingface.co/Sahajtomar/German-semantic) | 0.8485 | 0.7915 | 0.8139 | 0.8280
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8360 | 0.7941 | 0.8237 | 0.8178
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8297 | 0.7930 | 0.8341 | 0.8170
 [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) | 0.8190 | 0.8027 | 0.7674 | 0.8072
 [sentence-transformers/paraphrase-xlm-r-multilingual-v1](https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 [sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1](https://huggingface.co/sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1) | 0.8079 | 0.7844 | 0.8126 | 0.8034
 
 :-----:|:-----:|:-----:|:-----:|:-----:
 [PM-AI/sts_paraphrase_xlm-roberta-base_de-en (ours)](https://huggingface.co/PM-AI/sts_paraphrase_xlm-roberta-base_de-en) | 0.8768 <br /> 🏆 | 0.8705 <br /> 🏆 | 0.8402 | 0.8748 <br /> 🏆
 [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.8682 | 0.8065 | 0.8430 | 0.8378
+[PM-AI/paraphrase-distilroberta-base-v2_de-en (ours, no fine-tuning)](https://huggingface.co/PM-AI/paraphrase-distilroberta-base-v2_de-en) | 0.8597 | 0.8105 | 0.8399 | 0.8363
 [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) | 0.8660 | 0.7897 | 0.8097 | 0.8308
 [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 0.8441 | 0.8059 | 0.8175 | 0.8300
 [sentence-transformers/sentence-t5-base](https://huggingface.co/sentence-transformers/sentence-t5-base) | 0.8551 | 0.8063 | 0.8434 | 0.8235
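The scores in these tables are Spearman rank correlations between the models' predicted cosine similarities and the gold similarity scores. Ignoring tied ranks (which a production evaluation would average), the statistic can be sketched in pure Python:

```python
def spearman(xs, ys):
    """Spearman rank correlation between two equal-length score lists
    (no tie handling): Pearson correlation computed on the ranks."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0.0] * len(values)
        for rank, idx in enumerate(order):
            result[idx] = float(rank)
        return result

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (std_x * std_y)
```

A model whose predicted similarities order sentence pairs exactly like the gold scores reaches 1.0; the leaderboard values around 0.87 indicate a strong but imperfect rank agreement.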