RomainDarous committed
Commit 5078db4 · verified · 1 Parent(s): 75a475b

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "word_embedding_dimension": 768,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": true,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false,
+     "pooling_mode_weightedmean_tokens": false,
+     "pooling_mode_lasttoken": false,
+     "include_prompt": true
+ }
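
This pooling config selects mean pooling over token embeddings. As a rough illustration of what the mean mode computes (not the library's exact implementation):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the 768-dim token embeddings over non-padding positions."""
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # sum embeddings of real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)       # token counts, guarded against div-by-zero
    return summed / counts                         # (batch, 768) sentence embeddings
```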
2_Dense/config.json ADDED
@@ -0,0 +1 @@
+ {"in_features": 768, "out_features": 512, "bias": true, "activation_function": "torch.nn.modules.activation.Tanh"}
2_Dense/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e369ffa6cabab73d45ccfe57b15306f60df8672a95facbcaf940343382ad8719
+ size 1575072
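
The file is stored as a Git LFS pointer rather than the raw weights. A small sketch (hypothetical local path) of checking a downloaded blob against the pointer's `oid` and `size`:

```python
import hashlib
import os

def verify_lfs_blob(path: str, oid: str, size: int) -> bool:
    """Check a downloaded file against its Git LFS pointer metadata."""
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == oid

# verify_lfs_blob("2_Dense/model.safetensors",
#                 "e369ffa6cabab73d45ccfe57b15306f60df8672a95facbcaf940343382ad8719",
#                 1575072)
```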
README.md ADDED
@@ -0,0 +1,1019 @@
1
+ ---
2
+ language:
3
+ - de
4
+ - en
5
+ - es
6
+ - fr
7
+ - it
8
+ - nl
9
+ - pl
10
+ - pt
11
+ - ru
12
+ - zh
13
+ tags:
14
+ - sentence-transformers
15
+ - sentence-similarity
16
+ - feature-extraction
17
+ - generated_from_trainer
18
+ - dataset_size:51741
19
+ - loss:CoSENTLoss
20
+ base_model: RomainDarous/pre_training_original_model
21
+ widget:
22
+ - source_sentence: Starsza para azjatycka pozuje z noworodkiem przy stole obiadowym.
23
+ sentences:
24
+ - Koszykarz ma zamiar zdobyć punkty dla swojej drużyny.
25
+ - Grupa starszych osób pozuje wokół stołu w jadalni.
26
+ - Możliwe, że układ słoneczny taki jak nasz może istnieć poza galaktyką.
27
+ - source_sentence: Englisch arbeitet überall mit Menschen, die Dinge kaufen und verkaufen,
28
+ und in der Gastfreundschaft und im Tourismusgeschäft.
29
+ sentences:
30
+ - Ich bin in Maharashtra (einschließlich Mumbai) und Andhra Pradesh herumgereist,
31
+ und ich hatte kein Problem damit, nur mit Englisch auszukommen.
32
+ - 'Ein griechischsprachiger Sklave (δούλος, doulos) würde seinen Herrn, glaube ich,
33
+ κύριος nennen (translit: kurios; Herr, Herr, Herr, Herr; Vokativform: κύριε).'
34
+ - Das Paar lag auf dem Bett.
35
+ - source_sentence: Si vous vous comprenez et comprenez votre ennemi, vous aurez beaucoup
36
+ plus de chances de gagner n'importe quelle bataille.
37
+ sentences:
38
+ - 'Outre les probabilités de gagner une bataille théorique, cette citation a une
39
+ autre signification : l''importance de connaître/comprendre les autres.'
40
+ - Une femme et un chien se promènent ensemble.
41
+ - Un homme joue de la guitare.
42
+ - source_sentence: Un homme joue de la harpe.
43
+ sentences:
44
+ - Une femme joue de la guitare.
45
+ - une femme a un enfant.
46
+ - Un groupe de personnes est debout et assis sur le sol la nuit.
47
+ - source_sentence: Dois cães a lutar na neve.
48
+ sentences:
49
+ - Dois cães brincam na neve.
50
+ - Pode sempre perguntar, então é a escolha do autor a aceitar ou não.
51
+ - Um gato está a caminhar sobre chão de madeira dura.
52
+ datasets:
53
+ - PhilipMay/stsb_multi_mt
54
+ pipeline_tag: sentence-similarity
55
+ library_name: sentence-transformers
56
+ metrics:
57
+ - pearson_cosine
58
+ - spearman_cosine
59
+ model-index:
60
+ - name: SentenceTransformer based on RomainDarous/pre_training_original_model
61
+ results:
62
+ - task:
63
+ type: semantic-similarity
64
+ name: Semantic Similarity
65
+ dataset:
66
+ name: sts eval
67
+ type: sts-eval
68
+ metrics:
69
+ - type: pearson_cosine
70
+ value: 0.649351613026743
71
+ name: Pearson Cosine
72
+ - type: spearman_cosine
73
+ value: 0.6712113629733555
74
+ name: Spearman Cosine
75
+ - type: pearson_cosine
76
+ value: 0.6648874938903813
77
+ name: Pearson Cosine
78
+ - type: spearman_cosine
79
+ value: 0.6859979455545288
80
+ name: Spearman Cosine
81
+ - type: pearson_cosine
82
+ value: 0.6574990404767099
83
+ name: Pearson Cosine
84
+ - type: spearman_cosine
85
+ value: 0.6819347305734045
86
+ name: Spearman Cosine
87
+ - type: pearson_cosine
88
+ value: 0.6482851200513846
89
+ name: Pearson Cosine
90
+ - type: spearman_cosine
91
+ value: 0.6739057551228634
92
+ name: Spearman Cosine
93
+ - type: pearson_cosine
94
+ value: 0.657747388798702
95
+ name: Pearson Cosine
96
+ - type: spearman_cosine
97
+ value: 0.6797522820481435
98
+ name: Spearman Cosine
99
+ - type: pearson_cosine
100
+ value: 0.580138787555855
101
+ name: Pearson Cosine
102
+ - type: spearman_cosine
103
+ value: 0.6025843591291092
104
+ name: Spearman Cosine
105
+ - type: pearson_cosine
106
+ value: 0.6445711160678915
107
+ name: Pearson Cosine
108
+ - type: spearman_cosine
109
+ value: 0.6738244742184887
110
+ name: Spearman Cosine
111
+ - type: pearson_cosine
112
+ value: 0.6060638359389463
113
+ name: Pearson Cosine
114
+ - type: spearman_cosine
115
+ value: 0.6210827296807453
116
+ name: Spearman Cosine
117
+ - type: pearson_cosine
118
+ value: 0.6672294139281439
119
+ name: Pearson Cosine
120
+ - type: spearman_cosine
121
+ value: 0.6864882079409924
122
+ name: Spearman Cosine
123
+ - task:
124
+ type: semantic-similarity
125
+ name: Semantic Similarity
126
+ dataset:
127
+ name: sts test
128
+ type: sts-test
129
+ metrics:
130
+ - type: pearson_cosine
131
+ value: 0.6279093972489541
132
+ name: Pearson Cosine
133
+ - type: spearman_cosine
134
+ value: 0.6320355986028895
135
+ name: Spearman Cosine
136
+ - type: pearson_cosine
137
+ value: 0.6433522116833627
138
+ name: Pearson Cosine
139
+ - type: spearman_cosine
140
+ value: 0.658000076471118
141
+ name: Spearman Cosine
142
+ - type: pearson_cosine
143
+ value: 0.6271929274305698
144
+ name: Pearson Cosine
145
+ - type: spearman_cosine
146
+ value: 0.6229896619978917
147
+ name: Spearman Cosine
148
+ - type: pearson_cosine
149
+ value: 0.6391062028706688
150
+ name: Pearson Cosine
151
+ - type: spearman_cosine
152
+ value: 0.6417698712729121
153
+ name: Spearman Cosine
154
+ - type: pearson_cosine
155
+ value: 0.622947898324511
156
+ name: Pearson Cosine
157
+ - type: spearman_cosine
158
+ value: 0.6179788172853071
159
+ name: Spearman Cosine
160
+ - type: pearson_cosine
161
+ value: 0.5903164175964553
162
+ name: Pearson Cosine
163
+ - type: spearman_cosine
164
+ value: 0.5887507390354803
165
+ name: Spearman Cosine
166
+ - type: pearson_cosine
167
+ value: 0.640080846863563
168
+ name: Pearson Cosine
169
+ - type: spearman_cosine
170
+ value: 0.6391082728350455
171
+ name: Spearman Cosine
172
+ - type: pearson_cosine
173
+ value: 0.6172821161239198
174
+ name: Pearson Cosine
175
+ - type: spearman_cosine
176
+ value: 0.6180296923884917
177
+ name: Spearman Cosine
178
+ - type: pearson_cosine
179
+ value: 0.6607896399210559
180
+ name: Pearson Cosine
181
+ - type: spearman_cosine
182
+ value: 0.6616750284666137
183
+ name: Spearman Cosine
184
+ ---
185
+
186
+ # SentenceTransformer based on RomainDarous/pre_training_original_model
187
+
188
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [RomainDarous/pre_training_original_model](https://huggingface.co/RomainDarous/pre_training_original_model) on the [multi_stsb_de](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_es](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_fr](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_it](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_nl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_pl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_pt](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt), [multi_stsb_ru](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) and [multi_stsb_zh](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) datasets. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
189
+
190
+ ## Model Details
191
+
192
+ ### Model Description
193
+ - **Model Type:** Sentence Transformer
194
+ - **Base model:** [RomainDarous/pre_training_original_model](https://huggingface.co/RomainDarous/pre_training_original_model) <!-- at revision 880d5ef9d016fb1257687b6b61da19f4978b0f0c -->
195
+ - **Maximum Sequence Length:** 128 tokens
196
+ - **Output Dimensionality:** 512 dimensions
197
+ - **Similarity Function:** Cosine Similarity
198
+ - **Training Datasets:**
199
+ - [multi_stsb_de](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
200
+ - [multi_stsb_es](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
201
+ - [multi_stsb_fr](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
202
+ - [multi_stsb_it](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
203
+ - [multi_stsb_nl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
204
+ - [multi_stsb_pl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
205
+ - [multi_stsb_pt](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
206
+ - [multi_stsb_ru](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
207
+ - [multi_stsb_zh](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt)
208
+ - **Languages:** de, en, es, fr, it, nl, pl, pt, ru, zh
209
+ <!-- - **License:** Unknown -->
210
+
211
+ ### Model Sources
212
+
213
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
214
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
215
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
216
+
217
+ ### Full Model Architecture
218
+
219
+ ```
220
+ SentenceTransformer(
221
+ (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
222
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
223
+ (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
224
+ )
225
+ ```
226
+
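The same stack can be assembled by hand from `sentence_transformers.models` building blocks; a minimal sketch, assuming the module signatures from Sentence Transformers 3.x:

```python
import torch.nn as nn
from sentence_transformers import SentenceTransformer, models

# Rebuild the three-module stack: DistilBERT encoder -> mean pooling -> Tanh projection.
word = models.Transformer("RomainDarous/pre_training_original_model", max_seq_length=128)
pooling = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="mean")
dense = models.Dense(in_features=768, out_features=512, bias=True, activation_function=nn.Tanh())
model = SentenceTransformer(modules=[word, pooling, dense])
```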
227
+ ## Usage
228
+
229
+ ### Direct Usage (Sentence Transformers)
230
+
231
+ First install the Sentence Transformers library:
232
+
233
+ ```bash
234
+ pip install -U sentence-transformers
235
+ ```
236
+
237
+ Then you can load this model and run inference.
238
+ ```python
239
+ from sentence_transformers import SentenceTransformer
240
+
241
+ # Download from the 🤗 Hub
242
+ model = SentenceTransformer("RomainDarous/multists_finetuned_original_model")
243
+ # Run inference
244
+ sentences = [
245
+ 'Dois cães a lutar na neve.',
246
+ 'Dois cães brincam na neve.',
247
+ 'Pode sempre perguntar, então é a escolha do autor a aceitar ou não.',
248
+ ]
249
+ embeddings = model.encode(sentences)
250
+ print(embeddings.shape)
251
+ # [3, 512]
252
+
253
+ # Get the similarity scores for the embeddings
254
+ similarities = model.similarity(embeddings, embeddings)
255
+ print(similarities.shape)
256
+ # [3, 3]
257
+ ```
258
+
259
+ <!--
260
+ ### Direct Usage (Transformers)
261
+
262
+ <details><summary>Click to see the direct usage in Transformers</summary>
263
+
264
+ </details>
265
+ -->
266
+
267
+ <!--
268
+ ### Downstream Usage (Sentence Transformers)
269
+
270
+ You can finetune this model on your own dataset.
271
+
272
+ <details><summary>Click to expand</summary>
273
+
274
+ </details>
275
+ -->
276
+
277
+ <!--
278
+ ### Out-of-Scope Use
279
+
280
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
281
+ -->
282
+
283
+ ## Evaluation
284
+
285
+ ### Metrics
286
+
287
+ #### Semantic Similarity
288
+
289
+ * Datasets: `sts-eval` and `sts-test` (the test evaluation is reported once per training dataset)
290
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
291
+
292
+ | Metric | sts-eval | sts-test |
293
+ |:--------------------|:-----------|:-----------|
294
+ | pearson_cosine | 0.6494 | 0.6608 |
295
+ | **spearman_cosine** | **0.6712** | **0.6617** |
296
+
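The numbers above come from `EmbeddingSimilarityEvaluator`. A sketch of reproducing one test score on a single language subset (German shown; split and column names follow the `PhilipMay/stsb_multi_mt` dataset):

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Score the published model on one test subset.
model = SentenceTransformer("RomainDarous/multists_finetuned_original_model")
test = load_dataset("PhilipMay/stsb_multi_mt", "de", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=test["sentence1"],
    sentences2=test["sentence2"],
    scores=[s / 5.0 for s in test["similarity_score"]],  # source scores are 0-5
    name="sts-test-de",
)
print(evaluator(model))  # Pearson/Spearman cosine metrics
```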
297
+ #### Semantic Similarity
298
+
299
+ * Dataset: `sts-eval`
300
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
301
+
302
+ | Metric | Value |
303
+ |:--------------------|:----------|
304
+ | pearson_cosine | 0.6649 |
305
+ | **spearman_cosine** | **0.686** |
306
+
307
+ #### Semantic Similarity
308
+
309
+ * Dataset: `sts-eval`
310
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
311
+
312
+ | Metric | Value |
313
+ |:--------------------|:-----------|
314
+ | pearson_cosine | 0.6575 |
315
+ | **spearman_cosine** | **0.6819** |
316
+
317
+ #### Semantic Similarity
318
+
319
+ * Dataset: `sts-eval`
320
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
321
+
322
+ | Metric | Value |
323
+ |:--------------------|:-----------|
324
+ | pearson_cosine | 0.6483 |
325
+ | **spearman_cosine** | **0.6739** |
326
+
327
+ #### Semantic Similarity
328
+
329
+ * Dataset: `sts-eval`
330
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
331
+
332
+ | Metric | Value |
333
+ |:--------------------|:-----------|
334
+ | pearson_cosine | 0.6577 |
335
+ | **spearman_cosine** | **0.6798** |
336
+
337
+ #### Semantic Similarity
338
+
339
+ * Dataset: `sts-eval`
340
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
341
+
342
+ | Metric | Value |
343
+ |:--------------------|:-----------|
344
+ | pearson_cosine | 0.5801 |
345
+ | **spearman_cosine** | **0.6026** |
346
+
347
+ #### Semantic Similarity
348
+
349
+ * Dataset: `sts-eval`
350
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
351
+
352
+ | Metric | Value |
353
+ |:--------------------|:-----------|
354
+ | pearson_cosine | 0.6446 |
355
+ | **spearman_cosine** | **0.6738** |
356
+
357
+ #### Semantic Similarity
358
+
359
+ * Dataset: `sts-eval`
360
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
361
+
362
+ | Metric | Value |
363
+ |:--------------------|:-----------|
364
+ | pearson_cosine | 0.6061 |
365
+ | **spearman_cosine** | **0.6211** |
366
+
367
+ #### Semantic Similarity
368
+
369
+ * Dataset: `sts-eval`
370
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
371
+
372
+ | Metric | Value |
373
+ |:--------------------|:-----------|
374
+ | pearson_cosine | 0.6672 |
375
+ | **spearman_cosine** | **0.6865** |
376
+
377
+ <!--
378
+ ## Bias, Risks and Limitations
379
+
380
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
381
+ -->
382
+
383
+ <!--
384
+ ### Recommendations
385
+
386
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
387
+ -->
388
+
389
+ ## Training Details
390
+
391
+ ### Training Datasets
392
+
393
+ #### multi_stsb_de
394
+
395
+ * Dataset: [multi_stsb_de](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
396
+ * Size: 5,749 training samples
397
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
398
+ * Approximate statistics based on the first 1000 samples:
399
+ | | sentence1 | sentence2 | score |
400
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
401
+ | type | string | string | float |
402
+ | details | <ul><li>min: 5 tokens</li><li>mean: 12.05 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.01 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
403
+ * Samples:
404
+ | sentence1 | sentence2 | score |
405
+ |:---------------------------------------------------------------|:--------------------------------------------------------------------------|:--------------------------------|
406
+ | <code>Ein Flugzeug hebt gerade ab.</code> | <code>Ein Flugzeug hebt gerade ab.</code> | <code>1.0</code> |
407
+ | <code>Ein Mann spielt eine große Flöte.</code> | <code>Ein Mann spielt eine Flöte.</code> | <code>0.7599999904632568</code> |
408
+ | <code>Ein Mann streicht geriebenen Käse auf eine Pizza.</code> | <code>Ein Mann streicht geriebenen Käse auf eine ungekochte Pizza.</code> | <code>0.7599999904632568</code> |
409
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
410
+ ```json
411
+ {
412
+ "scale": 20.0,
413
+ "similarity_fct": "pairwise_cos_sim"
414
+ }
415
+ ```
416
+
417
+ #### multi_stsb_es
418
+
419
+ * Dataset: [multi_stsb_es](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
420
+ * Size: 5,749 training samples
421
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
422
+ * Approximate statistics based on the first 1000 samples:
423
+ | | sentence1 | sentence2 | score |
424
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
425
+ | type | string | string | float |
426
+ | details | <ul><li>min: 7 tokens</li><li>mean: 12.28 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 12.14 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
427
+ * Samples:
428
+ | sentence1 | sentence2 | score |
429
+ |:----------------------------------------------------------------|:----------------------------------------------------------------------|:--------------------------------|
430
+ | <code>Un avión está despegando.</code> | <code>Un avión está despegando.</code> | <code>1.0</code> |
431
+ | <code>Un hombre está tocando una gran flauta.</code> | <code>Un hombre está tocando una flauta.</code> | <code>0.7599999904632568</code> |
432
+ | <code>Un hombre está untando queso rallado en una pizza.</code> | <code>Un hombre está untando queso rallado en una pizza cruda.</code> | <code>0.7599999904632568</code> |
433
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
434
+ ```json
435
+ {
436
+ "scale": 20.0,
437
+ "similarity_fct": "pairwise_cos_sim"
438
+ }
439
+ ```
440
+
441
+ #### multi_stsb_fr
442
+
443
+ * Dataset: [multi_stsb_fr](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
444
+ * Size: 5,749 training samples
445
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
446
+ * Approximate statistics based on the first 1000 samples:
447
+ | | sentence1 | sentence2 | score |
448
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
449
+ | type | string | string | float |
450
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.47 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.37 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
451
+ * Samples:
452
+ | sentence1 | sentence2 | score |
453
+ |:-----------------------------------------------------------|:---------------------------------------------------------------------|:--------------------------------|
454
+ | <code>Un avion est en train de décoller.</code> | <code>Un avion est en train de décoller.</code> | <code>1.0</code> |
455
+ | <code>Un homme joue d'une grande flûte.</code> | <code>Un homme joue de la flûte.</code> | <code>0.7599999904632568</code> |
456
+ | <code>Un homme étale du fromage râpé sur une pizza.</code> | <code>Un homme étale du fromage râpé sur une pizza non cuite.</code> | <code>0.7599999904632568</code> |
457
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
458
+ ```json
459
+ {
460
+ "scale": 20.0,
461
+ "similarity_fct": "pairwise_cos_sim"
462
+ }
463
+ ```
464
+
465
+ #### multi_stsb_it
466
+
467
+ * Dataset: [multi_stsb_it](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
468
+ * Size: 5,749 training samples
469
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
470
+ * Approximate statistics based on the first 1000 samples:
471
+ | | sentence1 | sentence2 | score |
472
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
473
+ | type | string | string | float |
474
+ | details | <ul><li>min: 7 tokens</li><li>mean: 12.92 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 12.81 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
475
+ * Samples:
476
+ | sentence1 | sentence2 | score |
477
+ |:--------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:--------------------------------|
478
+ | <code>Un aereo sta decollando.</code> | <code>Un aereo sta decollando.</code> | <code>1.0</code> |
479
+ | <code>Un uomo sta suonando un grande flauto.</code> | <code>Un uomo sta suonando un flauto.</code> | <code>0.7599999904632568</code> |
480
+ | <code>Un uomo sta spalmando del formaggio a pezzetti su una pizza.</code> | <code>Un uomo sta spalmando del formaggio a pezzetti su una pizza non cotta.</code> | <code>0.7599999904632568</code> |
481
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
482
+ ```json
483
+ {
484
+ "scale": 20.0,
485
+ "similarity_fct": "pairwise_cos_sim"
486
+ }
487
+ ```
488
+
489
+ #### multi_stsb_nl
490
+
491
+ * Dataset: [multi_stsb_nl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
492
+ * Size: 5,749 training samples
493
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
494
+ * Approximate statistics based on the first 1000 samples:
495
+ | | sentence1 | sentence2 | score |
496
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
497
+ | type | string | string | float |
498
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.12 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.04 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
499
+ * Samples:
500
+ | sentence1 | sentence2 | score |
501
+ |:--------------------------------------------------------|:--------------------------------------------------------------------|:--------------------------------|
502
+ | <code>Er gaat een vliegtuig opstijgen.</code> | <code>Er gaat een vliegtuig opstijgen.</code> | <code>1.0</code> |
503
+ | <code>Een man speelt een grote fluit.</code> | <code>Een man speelt fluit.</code> | <code>0.7599999904632568</code> |
504
+ | <code>Een man smeert geraspte kaas op een pizza.</code> | <code>Een man strooit geraspte kaas op een ongekookte pizza.</code> | <code>0.7599999904632568</code> |
505
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
506
+ ```json
507
+ {
508
+ "scale": 20.0,
509
+ "similarity_fct": "pairwise_cos_sim"
510
+ }
511
+ ```
512
+
513
+ #### multi_stsb_pl
514
+
515
+ * Dataset: [multi_stsb_pl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
516
+ * Size: 5,749 training samples
517
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
518
+ * Approximate statistics based on the first 1000 samples:
519
+ | | sentence1 | sentence2 | score |
520
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
521
+ | type | string | string | float |
522
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.24 tokens</li><li>max: 46 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 13.08 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
523
+ * Samples:
524
+ | sentence1 | sentence2 | score |
525
+ |:-----------------------------------------------------------|:------------------------------------------------------------------------|:--------------------------------|
526
+ | <code>Samolot wystartował.</code> | <code>Samolot wystartował.</code> | <code>1.0</code> |
527
+ | <code>Człowiek gra na dużym flecie.</code> | <code>Człowiek gra na flecie.</code> | <code>0.7599999904632568</code> |
528
+ | <code>Mężczyzna rozsiewa na pizzy rozdrobniony ser.</code> | <code>Mężczyzna rozsiewa rozdrobniony ser na niegotowanej pizzy.</code> | <code>0.7599999904632568</code> |
529
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
530
+ ```json
531
+ {
532
+ "scale": 20.0,
533
+ "similarity_fct": "pairwise_cos_sim"
534
+ }
535
+ ```
536
+
537
+ #### multi_stsb_pt
538
+
539
+ * Dataset: [multi_stsb_pt](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
540
+ * Size: 5,749 training samples
541
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
542
+ * Approximate statistics based on the first 1000 samples:
543
+ | | sentence1 | sentence2 | score |
544
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
545
+ | type | string | string | float |
546
+ | details | <ul><li>min: 7 tokens</li><li>mean: 13.0 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 12.99 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
547
+ * Samples:
548
+ | sentence1 | sentence2 | score |
549
+ |:------------------------------------------------------------------|:----------------------------------------------------------------------------------|:--------------------------------|
550
+ | <code>Um avião está a descolar.</code> | <code>Um avião aéreo está a descolar.</code> | <code>1.0</code> |
551
+ | <code>Um homem está a tocar uma grande flauta.</code> | <code>Um homem está a tocar uma flauta.</code> | <code>0.7599999904632568</code> |
552
+ | <code>Um homem está a espalhar queijo desfiado numa pizza.</code> | <code>Um homem está a espalhar queijo desfiado sobre uma pizza não cozida.</code> | <code>0.7599999904632568</code> |
553
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
554
+ ```json
555
+ {
556
+ "scale": 20.0,
557
+ "similarity_fct": "pairwise_cos_sim"
558
+ }
559
+ ```
560
+
561
+ #### multi_stsb_ru
562
+
563
+ * Dataset: [multi_stsb_ru](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
564
+ * Size: 5,749 training samples
565
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
566
+ * Approximate statistics based on the first 1000 samples:
567
+ | | sentence1 | sentence2 | score |
568
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
569
+ | type | string | string | float |
570
+ | details | <ul><li>min: 5 tokens</li><li>mean: 12.66 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.67 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
571
+ * Samples:
572
+ | sentence1 | sentence2 | score |
573
+ |:------------------------------------------------|:---------------------------------------------------------------------|:--------------------------------|
574
+ | <code>Самолет взлетает.</code> | <code>Взлетает самолет.</code> | <code>1.0</code> |
575
+ | <code>Человек играет на большой флейте.</code> | <code>Человек играет на флейте.</code> | <code>0.7599999904632568</code> |
576
+ | <code>Мужчина разбрасывает сыр на пиццу.</code> | <code>Мужчина разбрасывает измельченный сыр на вареную пиццу.</code> | <code>0.7599999904632568</code> |
577
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
578
+ ```json
579
+ {
580
+ "scale": 20.0,
581
+ "similarity_fct": "pairwise_cos_sim"
582
+ }
583
+ ```
584
+
585
+ #### multi_stsb_zh
586
+
587
+ * Dataset: [multi_stsb_zh](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
588
+ * Size: 5,749 training samples
589
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
590
+ * Approximate statistics based on the first 1000 samples:
591
+ | | sentence1 | sentence2 | score |
592
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
593
+ | type | string | string | float |
594
+ | details | <ul><li>min: 7 tokens</li><li>mean: 12.55 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.73 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
595
+ * Samples:
596
+ | sentence1 | sentence2 | score |
597
+ |:------------------------------|:----------------------------------|:--------------------------------|
598
+ | <code>一架飞机正在起飞。</code> | <code>一架飞机正在起飞。</code> | <code>1.0</code> |
599
+ | <code>一个男人正在吹一支大笛子。</code> | <code>一个人在吹笛子。</code> | <code>0.7599999904632568</code> |
600
+ | <code>一名男子正在比萨饼上涂抹奶酪丝。</code> | <code>一名男子正在将奶酪丝涂抹在未熟的披萨上。</code> | <code>0.7599999904632568</code> |
601
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
602
+ ```json
603
+ {
604
+ "scale": 20.0,
605
+ "similarity_fct": "pairwise_cos_sim"
606
+ }
607
+ ```
608
+
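Putting the pieces above together, one per-language training run might look like the following sketch (German subset; `SentenceTransformerTrainer` from Sentence Transformers 3.x with default arguments elsewhere). The model itself was trained on all nine subsets with a proportional multi-dataset sampler.

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("RomainDarous/pre_training_original_model")
train = load_dataset("PhilipMay/stsb_multi_mt", "de", split="train")
train = train.rename_column("similarity_score", "score").map(
    lambda ex: {"score": ex["score"] / 5.0}  # CoSENTLoss expects scores in [0, 1]
)
loss = CoSENTLoss(model, scale=20.0)  # similarity_fct defaults to pairwise cosine
trainer = SentenceTransformerTrainer(model=model, train_dataset=train, loss=loss)
trainer.train()
```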
609
+ ### Evaluation Datasets
610
+
611
+ #### multi_stsb_de
612
+
613
+ * Dataset: [multi_stsb_de](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
614
+ * Size: 1,500 evaluation samples
615
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
616
+ * Approximate statistics based on the first 1000 samples:
617
+ | | sentence1 | sentence2 | score |
618
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
619
+ | type | string | string | float |
620
+ | details | <ul><li>min: 6 tokens</li><li>mean: 18.96 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 19.01 tokens</li><li>max: 55 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
621
+ * Samples:
622
+ | sentence1 | sentence2 | score |
623
+ |:-------------------------------------------------------------|:-----------------------------------------------------------|:-------------------------------|
624
+ | <code>Ein Mann mit einem Schutzhelm tanzt.</code> | <code>Ein Mann mit einem Schutzhelm tanzt.</code> | <code>1.0</code> |
625
+ | <code>Ein kleines Kind reitet auf einem Pferd.</code> | <code>Ein Kind reitet auf einem Pferd.</code> | <code>0.949999988079071</code> |
626
+ | <code>Ein Mann verfüttert eine Maus an eine Schlange.</code> | <code>Der Mann füttert die Schlange mit einer Maus.</code> | <code>1.0</code> |
627
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
628
+ ```json
629
+ {
630
+ "scale": 20.0,
631
+ "similarity_fct": "pairwise_cos_sim"
632
+ }
633
+ ```
634
+
635
+ #### multi_stsb_es
636
+
637
+ * Dataset: [multi_stsb_es](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
638
+ * Size: 1,500 evaluation samples
639
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
640
+ * Approximate statistics based on the first 1000 samples:
641
+ | | sentence1 | sentence2 | score |
642
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
643
+ | type | string | string | float |
644
+ | details | <ul><li>min: 7 tokens</li><li>mean: 18.41 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 18.24 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
645
+ * Samples:
646
+ | sentence1 | sentence2 | score |
647
+ |:----------------------------------------------------------------------|:---------------------------------------------------------------------|:-------------------------------|
648
+ | <code>Un hombre con un casco está bailando.</code> | <code>Un hombre con un casco está bailando.</code> | <code>1.0</code> |
649
+ | <code>Un niño pequeño está montando a caballo.</code> | <code>Un niño está montando a caballo.</code> | <code>0.949999988079071</code> |
650
+ | <code>Un hombre está alimentando a una serpiente con un ratón.</code> | <code>El hombre está alimentando a la serpiente con un ratón.</code> | <code>1.0</code> |
651
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
652
+ ```json
653
+ {
654
+ "scale": 20.0,
655
+ "similarity_fct": "pairwise_cos_sim"
656
+ }
657
+ ```
658
+
659
+ #### multi_stsb_fr
660
+
661
+ * Dataset: [multi_stsb_fr](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
662
+ * Size: 1,500 evaluation samples
663
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
664
+ * Approximate statistics based on the first 1000 samples:
665
+ | | sentence1 | sentence2 | score |
666
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
667
+ | type | string | string | float |
668
+ | details | <ul><li>min: 6 tokens</li><li>mean: 19.77 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 19.62 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
669
+ * Samples:
670
+ | sentence1 | sentence2 | score |
671
+ |:-------------------------------------------------------------------------|:----------------------------------------------------------------------------|:-------------------------------|
672
+ | <code>Un homme avec un casque de sécurité est en train de danser.</code> | <code>Un homme portant un casque de sécurité est en train de danser.</code> | <code>1.0</code> |
673
+ | <code>Un jeune enfant monte à cheval.</code> | <code>Un enfant monte à cheval.</code> | <code>0.949999988079071</code> |
674
+ | <code>Un homme donne une souris à un serpent.</code> | <code>L'homme donne une souris au serpent.</code> | <code>1.0</code> |
675
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
676
+ ```json
677
+ {
678
+ "scale": 20.0,
679
+ "similarity_fct": "pairwise_cos_sim"
680
+ }
681
+ ```
682
+
683
+ #### multi_stsb_it
684
+
685
+ * Dataset: [multi_stsb_it](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
686
+ * Size: 1,500 evaluation samples
687
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
688
+ * Approximate statistics based on the first 1000 samples:
689
+ | | sentence1 | sentence2 | score |
690
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
691
+ | type | string | string | float |
692
+ | details | <ul><li>min: 6 tokens</li><li>mean: 19.05 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 19.03 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
693
+ * Samples:
694
+ | sentence1 | sentence2 | score |
695
+ |:------------------------------------------------------------------|:---------------------------------------------------------------|:-------------------------------|
696
+ | <code>Un uomo con l'elmetto sta ballando.</code> | <code>Un uomo che indossa un elmetto sta ballando.</code> | <code>1.0</code> |
697
+ | <code>Un bambino piccolo sta cavalcando un cavallo.</code> | <code>Un bambino sta cavalcando un cavallo.</code> | <code>0.949999988079071</code> |
698
+ | <code>Un uomo sta dando da mangiare un topo a un serpente.</code> | <code>L'uomo sta dando da mangiare un topo al serpente.</code> | <code>1.0</code> |
699
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
700
+ ```json
701
+ {
702
+ "scale": 20.0,
703
+ "similarity_fct": "pairwise_cos_sim"
704
+ }
705
+ ```
706
+
707
+ #### multi_stsb_nl
708
+
709
+ * Dataset: [multi_stsb_nl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
710
+ * Size: 1,500 evaluation samples
711
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
712
+ * Approximate statistics based on the first 1000 samples:
713
+ | | sentence1 | sentence2 | score |
714
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
715
+ | type | string | string | float |
716
+ | details | <ul><li>min: 6 tokens</li><li>mean: 19.12 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 18.95 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
717
+ * Samples:
718
+ | sentence1 | sentence2 | score |
719
+ |:-----------------------------------------------------|:-----------------------------------------------------|:-------------------------------|
720
+ | <code>Een man met een helm is aan het dansen.</code> | <code>Een man met een helm is aan het dansen.</code> | <code>1.0</code> |
721
+ | <code>Een jong kind rijdt op een paard.</code> | <code>Een kind rijdt op een paard.</code> | <code>0.949999988079071</code> |
722
+ | <code>Een man voedt een muis aan een slang.</code> | <code>De man voert een muis aan de slang.</code> | <code>1.0</code> |
723
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
724
+ ```json
725
+ {
726
+ "scale": 20.0,
727
+ "similarity_fct": "pairwise_cos_sim"
728
+ }
729
+ ```
730
+
731
+ #### multi_stsb_pl
732
+
733
+ * Dataset: [multi_stsb_pl](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
734
+ * Size: 1,500 evaluation samples
735
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
736
+ * Approximate statistics based on the first 1000 samples:
737
+ | | sentence1 | sentence2 | score |
738
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
739
+ | type | string | string | float |
740
+ | details | <ul><li>min: 7 tokens</li><li>mean: 21.6 tokens</li><li>max: 58 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 21.47 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
741
+ * Samples:
742
+ | sentence1 | sentence2 | score |
743
+ |:---------------------------------------------------|:---------------------------------------------------|:-------------------------------|
744
+ | <code>Tańczy mężczyzna w twardym kapeluszu.</code> | <code>Tańczy mężczyzna w twardym kapeluszu.</code> | <code>1.0</code> |
745
+ | <code>Małe dziecko jedzie na koniu.</code> | <code>Dziecko jedzie na koniu.</code> | <code>0.949999988079071</code> |
746
+ | <code>Człowiek karmi węża myszką.</code> | <code>Ten człowiek karmi węża myszką.</code> | <code>1.0</code> |
747
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
748
+ ```json
749
+ {
750
+ "scale": 20.0,
751
+ "similarity_fct": "pairwise_cos_sim"
752
+ }
753
+ ```
754
+
755
+ #### multi_stsb_pt
756
+
757
+ * Dataset: [multi_stsb_pt](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
758
+ * Size: 1,500 evaluation samples
759
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
760
+ * Approximate statistics based on the first 1000 samples:
761
+ | | sentence1 | sentence2 | score |
762
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
763
+ | type | string | string | float |
764
+ | details | <ul><li>min: 7 tokens</li><li>mean: 19.26 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 19.08 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
765
+ * Samples:
766
+ | sentence1 | sentence2 | score |
767
+ |:------------------------------------------------------------|:-----------------------------------------------------------|:-------------------------------|
768
+ | <code>Um homem de chapéu duro está a dançar.</code> | <code>Um homem com um capacete está a dançar.</code> | <code>1.0</code> |
769
+ | <code>Uma criança pequena está a montar a cavalo.</code> | <code>Uma criança está a montar a cavalo.</code> | <code>0.949999988079071</code> |
770
+ | <code>Um homem está a alimentar um rato a uma cobra.</code> | <code>O homem está a alimentar a cobra com um rato.</code> | <code>1.0</code> |
771
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
772
+ ```json
773
+ {
774
+ "scale": 20.0,
775
+ "similarity_fct": "pairwise_cos_sim"
776
+ }
777
+ ```
778
+
779
+ #### multi_stsb_ru
780
+
781
+ * Dataset: [multi_stsb_ru](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
782
+ * Size: 1,500 evaluation samples
783
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
784
+ * Approximate statistics based on the first 1000 samples:
785
+ | | sentence1 | sentence2 | score |
786
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
787
+ | type | string | string | float |
788
+ | details | <ul><li>min: 6 tokens</li><li>mean: 20.91 tokens</li><li>max: 55 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 20.95 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
789
+ * Samples:
790
+ | sentence1 | sentence2 | score |
791
+ |:------------------------------------------------------|:----------------------------------------------|:-------------------------------|
792
+ | <code>Человек в твердой шляпе танцует.</code> | <code>Мужчина в твердой шляпе танцует.</code> | <code>1.0</code> |
793
+ | <code>Маленький ребенок едет верхом на лошади.</code> | <code>Ребенок едет на лошади.</code> | <code>0.949999988079071</code> |
794
+ | <code>Мужчина кормит мышь змее.</code> | <code>Человек кормит змею мышью.</code> | <code>1.0</code> |
795
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
796
+ ```json
797
+ {
798
+ "scale": 20.0,
799
+ "similarity_fct": "pairwise_cos_sim"
800
+ }
801
+ ```
802
+
803
+ #### multi_stsb_zh
804
+
805
+ * Dataset: [multi_stsb_zh](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) at [3acaa3d](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt/tree/3acaa3dd8c91649e0b8e627ffad891f059e47c8c)
806
+ * Size: 1,500 evaluation samples
807
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
808
+ * Approximate statistics based on the first 1000 samples:
809
+ | | sentence1 | sentence2 | score |
810
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
811
+ | type | string | string | float |
812
+ | details | <ul><li>min: 5 tokens</li><li>mean: 19.81 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 19.67 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
813
+ * Samples:
814
+ | sentence1 | sentence2 | score |
815
+ |:---------------------------|:--------------------------|:-------------------------------|
816
+ | <code>一个戴着硬帽子的人在跳舞。</code> | <code>一个戴着硬帽的人在跳舞。</code> | <code>1.0</code> |
817
+ | <code>一个小孩子在骑马。</code> | <code>一个孩子在骑马。</code> | <code>0.949999988079071</code> |
818
+ | <code>一个人正在用老鼠喂蛇。</code> | <code>那人正在给蛇喂老鼠。</code> | <code>1.0</code> |
819
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
820
+ ```json
821
+ {
822
+ "scale": 20.0,
823
+ "similarity_fct": "pairwise_cos_sim"
824
+ }
825
+ ```
826
+
827
+ ### Training Hyperparameters
828
+ #### Non-Default Hyperparameters
829
+
830
+ - `eval_strategy`: steps
831
+ - `per_device_train_batch_size`: 16
832
+ - `per_device_eval_batch_size`: 16
833
+ - `num_train_epochs`: 4
834
+ - `warmup_ratio`: 0.1
835
+
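A sketch of expressing the non-default values above with `SentenceTransformerTrainingArguments` (the `output_dir` is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="multists_finetuned_original_model",  # hypothetical output directory
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=4,
    warmup_ratio=0.1,
)
```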
836
+ #### All Hyperparameters
837
+ <details><summary>Click to expand</summary>
838
+
839
+ - `overwrite_output_dir`: False
840
+ - `do_predict`: False
841
+ - `eval_strategy`: steps
842
+ - `prediction_loss_only`: True
843
+ - `per_device_train_batch_size`: 16
844
+ - `per_device_eval_batch_size`: 16
845
+ - `per_gpu_train_batch_size`: None
846
+ - `per_gpu_eval_batch_size`: None
847
+ - `gradient_accumulation_steps`: 1
848
+ - `eval_accumulation_steps`: None
849
+ - `torch_empty_cache_steps`: None
850
+ - `learning_rate`: 5e-05
851
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ 
+ </details>
+ 
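+ As an illustration (not the original training script), the sketch below shows how the non-default values above map onto `SentenceTransformerTrainingArguments`. The output directory is a hypothetical placeholder; anything omitted keeps the library default listed above.
+ 
+ ```python
+ from sentence_transformers import SentenceTransformerTrainingArguments
+ 
+ # Minimal sketch of the non-default hyperparameters from the list above.
+ # "models/sts-multilingual" is a hypothetical output path.
+ args = SentenceTransformerTrainingArguments(
+     output_dir="models/sts-multilingual",
+     num_train_epochs=4,
+     warmup_ratio=0.1,
+     max_grad_norm=1.0,
+     lr_scheduler_type="linear",
+     optim="adamw_torch",
+     seed=42,
+     multi_dataset_batch_sampler="proportional",  # proportional sampling across the per-language STS datasets
+ )
+ ```
+ 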
+ ### Training Logs
+ | Epoch | Step  | Training Loss | multi stsb de loss | multi stsb es loss | multi stsb fr loss | multi stsb it loss | multi stsb nl loss | multi stsb pl loss | multi stsb pt loss | multi stsb ru loss | multi stsb zh loss | sts-eval_spearman_cosine | sts-test_spearman_cosine |
+ |:-----:|:-----:|:-------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------------:|:------------------------:|
+ | 1.0   | 3240  | 4.6594        | 4.6488             | 4.6520             | 4.6401             | 4.6637             | 4.6435             | 4.6943             | 4.6786             | 4.6902             | 4.6578             | 0.5620                   | -                        |
+ | 2.0   | 6480  | 4.4285        | 4.6860             | 4.6755             | 4.6796             | 4.6655             | 4.6472             | 4.7655             | 4.6910             | 4.7783             | 4.6939             | 0.6592                   | -                        |
+ | 3.0   | 9720  | 4.1541        | 4.9416             | 5.0391             | 4.9025             | 4.9229             | 4.9449             | 5.0618             | 5.0057             | 5.0001             | 4.9986             | 0.6764                   | -                        |
+ | 4.0   | 12960 | 3.8671        | 5.3776             | 5.5136             | 5.3842             | 5.3216             | 5.3303             | 5.4847             | 5.4591             | 5.3623             | 5.4139             | 0.6865                   | 0.6617                   |
+ 
+ 
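+ The `spearman_cosine` columns report the Spearman rank correlation between the cosine similarity of each embedded sentence pair and the gold similarity score. A minimal sketch of computing the metric by hand, with a hypothetical model id and toy data:
+ 
+ ```python
+ from scipy.stats import spearmanr
+ from sentence_transformers import SentenceTransformer
+ 
+ # Hypothetical placeholders: substitute the actual repository id and real STS pairs.
+ model = SentenceTransformer("RomainDarous/this-model")
+ sentences1 = ["A man is playing a guitar.", "A plane is taking off.", "A woman is slicing onions."]
+ sentences2 = ["Someone plays the guitar.", "An airplane departs.", "A cat sits on a mat."]
+ gold = [4.8, 4.5, 0.5]  # human similarity judgments
+ 
+ # Pairwise cosine similarity of the two embedding batches, then rank correlation.
+ cosine_scores = model.similarity_pairwise(model.encode(sentences1), model.encode(sentences2))
+ print(spearmanr(cosine_scores, gold).correlation)
+ ```
+ 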
+ ### Framework Versions
+ - Python: 3.11.10
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.47.1
+ - PyTorch: 2.3.1+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+ 
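+ To reproduce the numbers above, it may help to confirm the local environment matches these versions; a quick check:
+ 
+ ```python
+ # Verify the installed versions against the ones this card was produced with.
+ import sentence_transformers
+ import torch
+ import transformers
+ 
+ print(sentence_transformers.__version__)  # expected: 3.3.1
+ print(transformers.__version__)           # expected: 4.47.1
+ print(torch.__version__)                  # expected: 2.3.1+cu121
+ ```
+ 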
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### CoSENTLoss
+ ```bibtex
+ @online{kexuefm-8847,
+     title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+     author={Su Jianlin},
+     year={2022},
+     month={Jan},
+     url={https://kexue.fm/archives/8847},
+ }
+ ```
+ 
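+ For context, CoSENTLoss is the training objective referenced above and ships with Sentence Transformers; a minimal sketch of instantiating it (the base model id is taken from this card, the rest is illustrative):
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.losses import CoSENTLoss
+ 
+ model = SentenceTransformer("RomainDarous/pre_training_original_model")
+ 
+ # CoSENTLoss ranks pairs by cosine similarity against float gold scores;
+ # scale=20.0 is the library default for the logit scaling factor.
+ loss = CoSENTLoss(model, scale=20.0)
+ ```
+ 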
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who created the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates, suggestions, or questions about the Model Card to contact its authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "RomainDarous/pre_training_original_model",
+   "activation": "gelu",
+   "architectures": [
+     "DistilBertModel"
+   ],
+   "attention_dropout": 0.1,
+   "dim": 768,
+   "dropout": 0.1,
+   "hidden_dim": 3072,
+   "initializer_range": 0.02,
+   "max_position_embeddings": 512,
+   "model_type": "distilbert",
+   "n_heads": 12,
+   "n_layers": 6,
+   "output_hidden_states": true,
+   "output_past": true,
+   "pad_token_id": 0,
+   "qa_dropout": 0.1,
+   "seq_classif_dropout": 0.2,
+   "sinusoidal_pos_embds": false,
+   "tie_weights_": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.47.1",
+   "vocab_size": 119547
+ }
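
The backbone declared here is a multilingual DistilBERT: 6 layers, 12 attention heads, 768-dimensional hidden states, and a 119,547-token vocabulary. A quick sketch of inspecting the configuration with Transformers, using the base model id from the file above:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RomainDarous/pre_training_original_model")
print(config.model_type, config.n_layers, config.n_heads, config.dim)
# expected: distilbert 6 12 768
```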
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.47.1",
+     "pytorch": "2.3.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bccd9e0fdf7c5ee3abdcc5f853b428f19e7c297d0030089292d638f4dc55fd93
+ size 538947416
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Dense",
+     "type": "sentence_transformers.models.Dense"
+   }
+ ]
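
modules.json declares the inference pipeline: a Transformer encoder, then mean pooling (per 1_Pooling/config.json), then a dense 768-to-512 projection with Tanh activation (per 2_Dense/config.json). An equivalent stack could be assembled by hand, roughly as follows (a sketch, assuming the base model as the encoder):

```python
import torch.nn as nn
from sentence_transformers import SentenceTransformer, models

# Rebuild the three-module pipeline declared in modules.json.
encoder = models.Transformer("RomainDarous/pre_training_original_model", max_seq_length=128)
pooling = models.Pooling(encoder.get_word_embedding_dimension(), pooling_mode="mean")
projection = models.Dense(in_features=768, out_features=512, activation_function=nn.Tanh())

model = SentenceTransformer(modules=[encoder, pooling, projection])
print(model.encode(["Hello world"]).shape)  # (1, 512)
```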
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,67 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "extra_special_tokens": {},
+   "full_tokenizer_file": null,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "max_length": 128,
+   "model_max_length": 128,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "DistilBertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
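
The tokenizer is the cased multilingual DistilBERT WordPiece tokenizer, with inputs truncated to 128 tokens. A short usage sketch (loading from the base model id; substitute this repository's id as appropriate):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("RomainDarous/pre_training_original_model")
encoded = tokenizer("Multilingual sentence embeddings are useful.", truncation=True, max_length=128)
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# The sequence is wrapped in [CLS] ... [SEP], matching special_tokens_map.json.
```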
vocab.txt ADDED
The diff for this file is too large to render. See raw diff