Commit e2a1855 by tomaarsen (HF staff) · verified · 1 parent: 477b3ec

Update README.md

Files changed (1)
  1. README.md +173 -49
README.md CHANGED
@@ -9,48 +9,133 @@ tags:
9
  - generated
10
  base_model: microsoft/mpnet-base
11
  metrics:
12
- - accuracy
13
  widget:
14
- - source_sentence: Many youth are lazy.
15
  sentences:
16
- - Lincoln took his hat off.
17
  - At the end of the fourth century was when baked goods flourished.
18
- - DOD's common practice for managing this environment has been to create aggressive
19
- risk reduction efforts in its programs.
20
- - source_sentence: a guy on a bike
21
  sentences:
22
- - A man is on a bike.
23
- - two men sit in a train car
24
- - She is the boy's aunt.
25
- - source_sentence: The dog is wet.
26
  sentences:
27
- - A child and small dog running.
28
- - The man is riding a sheep.
29
- - The man is doing a bike trick.
30
  - source_sentence: yeah really no kidding
31
  sentences:
32
- - 'Really? No kidding! '
33
  - yeah i mean just when uh the they military paid for her education
34
- - Changes were made to the Grant Renewal Application to provide extra information
35
- to the LSC.
36
- - source_sentence: 'Harlem did a great job '
37
  sentences:
38
- - 'Missouri was happy to continue it''s planning efforts. '
39
  - yeah i mean just when uh the they military paid for her education
40
- - I know exactly.
 
41
  pipeline_tag: sentence-similarity
42
  co2_eq_emissions:
43
- emissions: 18.165192544667764
44
  source: codecarbon
45
  training_type: fine-tuning
46
  on_cloud: false
47
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
48
  ram_total_size: 31.777088165283203
49
- hours_used: 0.141
50
  hardware_used: 1 x NVIDIA GeForce RTX 3090
51
  ---
52
 
53
- # SentenceTransformer
54
 
55
  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on the [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli), [snli](https://huggingface.co/datasets/stanfordnlp/snli) and [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
56
 
@@ -98,11 +183,11 @@ Then you can load this model and run inference.
98
  from sentence_transformers import SentenceTransformer
99
 
100
  # Download from the 🤗 Hub
101
- model = SentenceTransformer("tomaarsen/st-v3-test-mpnet-base-allnli-stsb")
102
  # Run inference
103
  sentences = [
104
- "Harlem did a great job ",
105
- "Missouri was happy to continue it's planning efforts. ",
106
  "yeah i mean just when uh the they military paid for her education",
107
  ]
108
  embeddings = model.encode(sentences)
@@ -134,6 +219,44 @@ You can finetune this model on your own dataset.
134
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
135
  -->
136
 
137
  <!--
138
  ## Bias, Risks and Limitations
139
 
@@ -391,34 +514,34 @@ You can finetune this model on your own dataset.
391
  </details>
392
 
393
  ### Training Logs
394
- | Epoch | Step | Training Loss | multi_nli | snli | stsb |
395
- |:------:|:----:|:-------------:|:---------:|:------:|:------:|
396
- | 0.0493 | 10 | 0.9204 | 1.0998 | 1.1022 | 0.2997 |
397
- | 0.0985 | 20 | 1.0074 | 1.0983 | 1.0971 | 0.2499 |
398
- | 0.1478 | 30 | 1.0037 | 1.0994 | 1.0939 | 0.1667 |
399
- | 0.1970 | 40 | 0.7961 | 1.0945 | 1.0877 | 0.0814 |
400
- | 0.2463 | 50 | 0.9882 | 1.0950 | 1.0806 | 0.0840 |
401
- | 0.2956 | 60 | 0.7814 | 1.0873 | 1.0711 | 0.0681 |
402
- | 0.3448 | 70 | 0.6678 | 1.0829 | 1.0673 | 0.0504 |
403
- | 0.3941 | 80 | 0.7669 | 1.0771 | 1.0638 | 0.0501 |
404
- | 0.4433 | 90 | 0.9718 | 1.0704 | 1.0517 | 0.0482 |
405
- | 0.4926 | 100 | 0.8494 | 1.0609 | 1.0388 | 0.0526 |
406
- | 0.5419 | 110 | 0.745 | 1.0631 | 1.0285 | 0.0527 |
407
- | 0.5911 | 120 | 0.6416 | 1.0564 | 1.0148 | 0.0588 |
408
- | 0.6404 | 130 | 1.0331 | 1.0504 | 1.0026 | 0.0627 |
409
- | 0.6897 | 140 | 0.8305 | 1.0417 | 1.0023 | 0.0664 |
410
- | 0.7389 | 150 | 0.7362 | 1.0282 | 0.9937 | 0.0672 |
411
- | 0.7882 | 160 | 0.7164 | 1.0288 | 0.9930 | 0.0688 |
412
- | 0.8374 | 170 | 0.8217 | 1.0264 | 0.9819 | 0.0677 |
413
- | 0.8867 | 180 | 0.9046 | 1.0200 | 0.9734 | 0.0742 |
414
- | 0.9360 | 190 | 0.5327 | 1.0221 | 0.9764 | 0.0698 |
415
- | 0.9852 | 200 | 0.8974 | 1.0233 | 0.9776 | 0.0691 |
416
 
417
 
418
  ### Environmental Impact
419
  Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
420
  - **Carbon Emitted**: 0.018 kg of CO2
421
- - **Hours Used**: 0.141 hours
422
 
423
  ### Training Hardware
424
  - **On Cloud**: No
@@ -438,7 +561,8 @@ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codec
438
  ## Citation
439
 
440
  ### BibTeX
441
- #### Sentence Transformers
 
442
  ```bibtex
443
  @inproceedings{reimers-2019-sentence-bert,
444
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
 
9
  - generated
10
  base_model: microsoft/mpnet-base
11
  metrics:
12
+ - pearson_cosine
13
+ - spearman_cosine
14
+ - pearson_manhattan
15
+ - spearman_manhattan
16
+ - pearson_euclidean
17
+ - spearman_euclidean
18
+ - pearson_dot
19
+ - spearman_dot
20
+ - pearson_max
21
+ - spearman_max
22
  widget:
23
+ - source_sentence: 'Really? No kidding! '
24
  sentences:
25
+ - yeah really no kidding
26
  - At the end of the fourth century was when baked goods flourished.
27
+ - The campaigns seem to reach a new pool of contributors.
28
+ - source_sentence: A sleeping man.
 
29
  sentences:
30
+ - Two men are sleeping.
31
+ - Someone is selling oranges
32
+ - the family is young
33
+ - source_sentence: a guy on a bike
34
  sentences:
35
+ - A tall person on a bike
36
+ - A man is on a frozen lake.
37
+ - The women throw food at the kids
38
  - source_sentence: yeah really no kidding
39
  sentences:
40
+ - oh uh-huh well no they wouldn't would they no
41
  - yeah i mean just when uh the they military paid for her education
42
+ - The campaigns seem to reach a new pool of contributors.
43
+ - source_sentence: He ran like an athlete.
 
44
  sentences:
45
+ - ' Then he ran.'
46
  - yeah i mean just when uh the they military paid for her education
47
+ - Similarly, OIM revised the electronic Grant Renewal Application to accommodate
48
+ new information sought by LSC and to ensure greater ease for users.
49
  pipeline_tag: sentence-similarity
50
  co2_eq_emissions:
51
+ emissions: 17.515467907816664
52
  source: codecarbon
53
  training_type: fine-tuning
54
  on_cloud: false
55
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
56
  ram_total_size: 31.777088165283203
57
+ hours_used: 0.13
58
  hardware_used: 1 x NVIDIA GeForce RTX 3090
59
+ model-index:
60
+ - name: SentenceTransformer based on microsoft/mpnet-base
61
+ results:
62
+ - task:
63
+ type: semantic-similarity
64
+ name: Semantic Similarity
65
+ dataset:
66
+ name: sts dev
67
+ type: sts-dev
68
+ metrics:
69
+ - type: pearson_cosine
70
+ value: 0.7331234146933103
71
+ name: Pearson Cosine
72
+ - type: spearman_cosine
73
+ value: 0.7435439430716654
74
+ name: Spearman Cosine
75
+ - type: pearson_manhattan
76
+ value: 0.7389474504545281
77
+ name: Pearson Manhattan
78
+ - type: spearman_manhattan
79
+ value: 0.7473580293303098
80
+ name: Spearman Manhattan
81
+ - type: pearson_euclidean
82
+ value: 0.7356264396007131
83
+ name: Pearson Euclidean
84
+ - type: spearman_euclidean
85
+ value: 0.7436137284782617
86
+ name: Spearman Euclidean
87
+ - type: pearson_dot
88
+ value: 0.7093073700072118
89
+ name: Pearson Dot
90
+ - type: spearman_dot
91
+ value: 0.7150453113301433
92
+ name: Spearman Dot
93
+ - type: pearson_max
94
+ value: 0.7389474504545281
95
+ name: Pearson Max
96
+ - type: spearman_max
97
+ value: 0.7473580293303098
98
+ name: Spearman Max
99
+ - task:
100
+ type: semantic-similarity
101
+ name: Semantic Similarity
102
+ dataset:
103
+ name: sts test
104
+ type: sts-test
105
+ metrics:
106
+ - type: pearson_cosine
107
+ value: 0.6750510843835755
108
+ name: Pearson Cosine
109
+ - type: spearman_cosine
110
+ value: 0.6615639695746663
111
+ name: Spearman Cosine
112
+ - type: pearson_manhattan
113
+ value: 0.6718085205234632
114
+ name: Pearson Manhattan
115
+ - type: spearman_manhattan
116
+ value: 0.6589482932175834
117
+ name: Spearman Manhattan
118
+ - type: pearson_euclidean
119
+ value: 0.6693170762111229
120
+ name: Pearson Euclidean
121
+ - type: spearman_euclidean
122
+ value: 0.6578210069410166
123
+ name: Spearman Euclidean
124
+ - type: pearson_dot
125
+ value: 0.6490291380804283
126
+ name: Pearson Dot
127
+ - type: spearman_dot
128
+ value: 0.6335192601696299
129
+ name: Spearman Dot
130
+ - type: pearson_max
131
+ value: 0.6750510843835755
132
+ name: Pearson Max
133
+ - type: spearman_max
134
+ value: 0.6615639695746663
135
+ name: Spearman Max
136
  ---
137
 
138
+ # SentenceTransformer based on microsoft/mpnet-base
139
 
140
  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on the [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli), [snli](https://huggingface.co/datasets/stanfordnlp/snli) and [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
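
As a rough illustration of how such embeddings are typically compared for semantic similarity or search, the sketch below ranks candidates by cosine similarity. The 4-dimensional vectors and the names `query` and `candidates` are hypothetical stand-ins for the 768-dimensional outputs of `model.encode`; nothing here calls the model itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for real sentence embeddings.
query = [0.2, 0.1, 0.9, 0.4]
candidates = {
    "close": [0.25, 0.05, 0.8, 0.5],
    "far": [-0.7, 0.6, 0.1, -0.2],
}

# Rank candidate names from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)
```

With real embeddings the same ranking step applies unchanged; only the vectors come from the model.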
141
 
 
183
  from sentence_transformers import SentenceTransformer
184
 
185
  # Download from the 🤗 Hub
186
+ model = SentenceTransformer("sentence_transformers_model_id")
187
  # Run inference
188
  sentences = [
189
+ "He ran like an athlete.",
190
+ " Then he ran.",
191
  "yeah i mean just when uh the they military paid for her education",
192
  ]
193
  embeddings = model.encode(sentences)
 
219
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
220
  -->
221
 
222
+ ## Evaluation
223
+
224
+ ### Metrics
225
+
226
+ #### Semantic Similarity
227
+ * Dataset: `sts-dev`
228
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
229
+
230
+ | Metric | Value |
231
+ |:--------------------|:-----------|
232
+ | pearson_cosine | 0.7331 |
233
+ | **spearman_cosine** | **0.7435** |
234
+ | pearson_manhattan | 0.7389 |
235
+ | spearman_manhattan | 0.7474 |
236
+ | pearson_euclidean | 0.7356 |
237
+ | spearman_euclidean | 0.7436 |
238
+ | pearson_dot | 0.7093 |
239
+ | spearman_dot | 0.715 |
240
+ | pearson_max | 0.7389 |
241
+ | spearman_max | 0.7474 |
242
+
243
+ #### Semantic Similarity
244
+ * Dataset: `sts-test`
245
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
246
+
247
+ | Metric | Value |
248
+ |:--------------------|:-----------|
249
+ | pearson_cosine | 0.6751 |
250
+ | **spearman_cosine** | **0.6616** |
251
+ | pearson_manhattan | 0.6718 |
252
+ | spearman_manhattan | 0.6589 |
253
+ | pearson_euclidean | 0.6693 |
254
+ | spearman_euclidean | 0.6578 |
255
+ | pearson_dot | 0.649 |
256
+ | spearman_dot | 0.6335 |
257
+ | pearson_max | 0.6751 |
258
+ | spearman_max | 0.6616 |
259
+
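
To make the `spearman_*` rows easier to read, here is a minimal pure-Python sketch of a Spearman rank correlation between predicted similarities and gold scores. It is not the evaluator's own implementation, and the `predicted` and `gold` values are hypothetical. The last lines check an observation from the `sts-dev` table above: the reported `spearman_max` equals the largest of the four per-metric Spearman values.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical cosine similarities and gold scores for four sentence pairs.
predicted = [0.91, 0.15, 0.62, 0.40]
gold = [4.8, 1.0, 3.9, 0.5]
rho = spearman(predicted, gold)

# Per-metric Spearman values copied from the sts-dev table.
sts_dev_spearman = {
    "cosine": 0.7435,
    "manhattan": 0.7474,
    "euclidean": 0.7436,
    "dot": 0.715,
}
best_metric = max(sts_dev_spearman, key=sts_dev_spearman.get)
print(rho, best_metric, sts_dev_spearman[best_metric])
```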
260
  <!--
261
  ## Bias, Risks and Limitations
262
 
 
514
  </details>
515
 
516
  ### Training Logs
517
+ | Epoch | Step | Training Loss | multi nli loss | snli loss | stsb loss | sts-dev spearman cosine |
518
+ |:------:|:----:|:-------------:|:--------------:|:---------:|:---------:|:-----------------------:|
519
+ | 0.0493 | 10 | 0.9199 | 1.1019 | 1.1017 | 0.3016 | 0.6324 |
520
+ | 0.0985 | 20 | 1.0063 | 1.1000 | 1.0966 | 0.2635 | 0.6093 |
521
+ | 0.1478 | 30 | 1.002 | 1.0995 | 1.0908 | 0.1766 | 0.5328 |
522
+ | 0.1970 | 40 | 0.7946 | 1.0980 | 1.0913 | 0.0923 | 0.5991 |
523
+ | 0.2463 | 50 | 0.9891 | 1.0967 | 1.0781 | 0.0912 | 0.6457 |
524
+ | 0.2956 | 60 | 0.784 | 1.0938 | 1.0699 | 0.0934 | 0.6629 |
525
+ | 0.3448 | 70 | 0.6735 | 1.0940 | 1.0728 | 0.0640 | 0.7538 |
526
+ | 0.3941 | 80 | 0.7713 | 1.0893 | 1.0676 | 0.0612 | 0.7653 |
527
+ | 0.4433 | 90 | 0.9772 | 1.0870 | 1.0573 | 0.0636 | 0.7621 |
528
+ | 0.4926 | 100 | 0.8613 | 1.0862 | 1.0515 | 0.0632 | 0.7583 |
529
+ | 0.5419 | 110 | 0.7528 | 1.0814 | 1.0397 | 0.0617 | 0.7536 |
530
+ | 0.5911 | 120 | 0.6541 | 1.0854 | 1.0329 | 0.0657 | 0.7512 |
531
+ | 0.6404 | 130 | 1.051 | 1.0658 | 1.0211 | 0.0607 | 0.7340 |
532
+ | 0.6897 | 140 | 0.8516 | 1.0631 | 1.0171 | 0.0587 | 0.7467 |
533
+ | 0.7389 | 150 | 0.7484 | 1.0563 | 1.0122 | 0.0556 | 0.7537 |
534
+ | 0.7882 | 160 | 0.7368 | 1.0534 | 1.0100 | 0.0588 | 0.7526 |
535
+ | 0.8374 | 170 | 0.8373 | 1.0498 | 1.0030 | 0.0565 | 0.7491 |
536
+ | 0.8867 | 180 | 0.9311 | 1.0387 | 0.9981 | 0.0588 | 0.7302 |
537
+ | 0.9360 | 190 | 0.5445 | 1.0357 | 0.9967 | 0.0565 | 0.7382 |
538
+ | 0.9852 | 200 | 0.9154 | 1.0359 | 0.9964 | 0.0556 | 0.7435 |
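
One way to read the new `sts-dev spearman cosine` column is to compare the best logged step with the final one. The sketch below copies the step and score pairs from the table above; the variable names are illustrative only.

```python
# (step -> sts-dev spearman cosine), copied from the training log table.
sts_dev_spearman_by_step = {
    10: 0.6324, 20: 0.6093, 30: 0.5328, 40: 0.5991, 50: 0.6457,
    60: 0.6629, 70: 0.7538, 80: 0.7653, 90: 0.7621, 100: 0.7583,
    110: 0.7536, 120: 0.7512, 130: 0.7340, 140: 0.7467, 150: 0.7537,
    160: 0.7526, 170: 0.7491, 180: 0.7302, 190: 0.7382, 200: 0.7435,
}

# Step with the highest dev score, and the last logged step.
best_step = max(sts_dev_spearman_by_step, key=sts_dev_spearman_by_step.get)
final_step = max(sts_dev_spearman_by_step)

print("best:", best_step, sts_dev_spearman_by_step[best_step])
print("final:", final_step, sts_dev_spearman_by_step[final_step])
```

The final-step score matches the `spearman_cosine` value reported for `sts-dev` in the Evaluation section, while the highest score in the log occurs earlier in training.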
539
 
540
 
541
  ### Environmental Impact
542
  Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
543
  - **Carbon Emitted**: 0.018 kg of CO2
544
+ - **Hours Used**: 0.13 hours
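
The figures above can be related to the `co2_eq_emissions` metadata with a little arithmetic. This sketch assumes the `emissions` field is expressed in grams of CO2-equivalent, which is consistent with the 0.018 kg shown here; the helper names are hypothetical.

```python
def grams_to_kg(grams):
    """Convert grams to kilograms."""
    return grams / 1000.0


def hours_to_minutes(hours):
    """Convert hours to minutes."""
    return hours * 60.0


# Values from the updated metadata block (emissions assumed to be in grams).
emissions_g = 17.515467907816664
hours_used = 0.13

carbon_kg = round(grams_to_kg(emissions_g), 3)
training_minutes = round(hours_to_minutes(hours_used), 1)

print(f"Carbon emitted: {carbon_kg} kg of CO2")
print(f"Training time: {training_minutes} minutes")
```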
545
 
546
  ### Training Hardware
547
  - **On Cloud**: No
 
561
  ## Citation
562
 
563
  ### BibTeX
564
+
565
+ #### Sentence Transformers and SoftmaxLoss
566
  ```bibtex
567
  @inproceedings{reimers-2019-sentence-bert,
568
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",