yano0 committed on
Commit
64f5c2d
·
verified ·
1 Parent(s): 8791d08

Update README.md

Files changed (1)
  1. README.md +29 -73
README.md CHANGED
@@ -1,5 +1,6 @@
1
  ---
2
- language: []
 
3
  library_name: sentence-transformers
4
  tags:
5
  - sentence-transformers
@@ -18,46 +19,11 @@ metrics:
18
  - spearman_max
19
  widget: []
20
  pipeline_tag: sentence-similarity
21
- model-index:
22
- - name: SentenceTransformer
23
- results:
24
- - task:
25
- type: semantic-similarity
26
- name: Semantic Similarity
27
- dataset:
28
- name: Unknown
29
- type: unknown
30
- metrics:
31
- - type: pearson_cosine
32
- value: 0.841929698952355
33
- name: Pearson Cosine
34
- - type: spearman_cosine
35
- value: 0.7942182059969294
36
- name: Spearman Cosine
37
- - type: pearson_manhattan
38
- value: 0.8295844701949633
39
- name: Pearson Manhattan
40
- - type: spearman_manhattan
41
- value: 0.7967029159438351
42
- name: Spearman Manhattan
43
- - type: pearson_euclidean
44
- value: 0.8302175995746677
45
- name: Pearson Euclidean
46
- - type: spearman_euclidean
47
- value: 0.7974109108557925
48
- name: Spearman Euclidean
49
- - type: pearson_dot
50
- value: 0.8266168802012493
51
- name: Pearson Dot
52
- - type: spearman_dot
53
- value: 0.7757964222446627
54
- name: Spearman Dot
55
- - type: pearson_max
56
- value: 0.841929698952355
57
- name: Pearson Max
58
- - type: spearman_max
59
- value: 0.7974109108557925
60
- name: Spearman Max
61
  ---
62
 
63
  # SentenceTransformer
@@ -76,12 +42,6 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps
76
  <!-- - **Language:** Unknown -->
77
  <!-- - **License:** Unknown -->
78
 
79
- ### Model Sources
80
-
81
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
82
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
83
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
84
-
85
  ### Full Model Architecture
86
 
87
  ```
@@ -147,26 +107,6 @@ You can finetune this model on your own dataset.
147
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
148
  -->
149
 
150
- ## Evaluation
151
-
152
- ### Metrics
153
-
154
- #### Semantic Similarity
155
-
156
- * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
157
-
158
- | Metric | Value |
159
- |:--------------------|:-----------|
160
- | pearson_cosine | 0.8419 |
161
- | **spearman_cosine** | **0.7942** |
162
- | pearson_manhattan | 0.8296 |
163
- | spearman_manhattan | 0.7967 |
164
- | pearson_euclidean | 0.8302 |
165
- | spearman_euclidean | 0.7974 |
166
- | pearson_dot | 0.8266 |
167
- | spearman_dot | 0.7758 |
168
- | pearson_max | 0.8419 |
169
- | spearman_max | 0.7974 |
170
 
171
  <!--
172
  ## Bias, Risks and Limitations
@@ -182,12 +122,6 @@ You can finetune this model on your own dataset.
182
 
183
  ## Training Details
184
 
185
- ### Training Logs
186
- | Epoch | Step | spearman_cosine |
187
- |:-----:|:----:|:---------------:|
188
- | 0 | 0 | 0.7942 |
189
-
190
-
191
  ### Framework Versions
192
  - Python: 3.10.13
193
  - Sentence Transformers: 3.0.0
@@ -196,6 +130,28 @@ You can finetune this model on your own dataset.
196
  - Accelerate: 0.30.1
197
  - Datasets: 2.19.2
198
  - Tokenizers: 0.19.1
199
 
200
  ## Citation
201
 
 
1
  ---
2
+ language:
3
+ - ja
4
  library_name: sentence-transformers
5
  tags:
6
  - sentence-transformers
 
19
  - spearman_max
20
  widget: []
21
  pipeline_tag: sentence-similarity
22
+ datasets:
23
+ - hpprc/emb
24
+ - hpprc/mqa-ja
25
+ - google-research-datasets/paws-x
26
+ base_model: pkshatech/GLuCoSE-base-ja
27
  ---
28
 
29
  # SentenceTransformer
 
42
  <!-- - **Language:** Unknown -->
43
  <!-- - **License:** Unknown -->
44
45
  ### Full Model Architecture
46
 
47
  ```
 
107
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
108
  -->
109
110
 
111
  <!--
112
  ## Bias, Risks and Limitations
 
122
 
123
  ## Training Details
124
125
  ### Framework Versions
126
  - Python: 3.10.13
127
  - Sentence Transformers: 3.0.0
 
130
  - Accelerate: 0.30.1
131
  - Datasets: 2.19.2
132
  - Tokenizers: 0.19.1
133
+ ## Benchmarks
134
+
135
+ ### Zero-shot Search
136
+ Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA), and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).
137
+
138
+ | model | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
139
+ |--------|--------|---------------------|-------------------|-------------------|
140
+ | me5-base | 0.3B | 84.2 | 47.2 | 25.4 |
141
+ | GLuCoSE | 0.1B | 53.3 | 30.8 | 25.2 |
142
+ | GLuCoSE v2 | 0.1B | 85.5 | 60.6 | 33.8 |
143
+
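The table above reports Recall@5 and nDCG@10. As a rough illustration of what those metrics measure, here is a minimal sketch using the textbook definitions with linear gain. The function names and the toy ranking are hypothetical, and the official MIRACL, JQaRA, and MLDR evaluation scripts may differ in details such as gain function or tie handling.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, relevance, k):
    """nDCG@k with linear gain; `relevance` maps doc id -> graded relevance."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    # Ideal DCG: the same gains sorted from most to least relevant.
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy ranking for one query (not taken from any benchmark).
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1": 1, "d2": 1, "d4": 1}

recall = recall_at_k(ranked, set(relevant), k=5)
ndcg = ndcg_at_k(ranked, relevant, k=10)
print(f"Recall@5 = {recall:.3f}, nDCG@10 = {ndcg:.3f}")
```

Benchmark scores such as those in the table are typically these per-query values averaged over all queries and multiplied by 100.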
144
+ ### JMTEB
145
+ Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).
146
+ * The time-consuming datasets `amazon_review_classification`, `mrtydi`, `jaqket`, and `esci` were excluded from the evaluation.
147
+ * The average is the macro-average over the five task categories.
148
+
149
+ | model | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
150
+ |--------|--------|--------|------|------|-------|-------|------|
151
+ | me5-base | 0.3B | 75.1 | 80.6 | 80.5 | 52.6 | 62.4 | 70.2 |
152
+ | GLuCoSE | 0.1B | 82.6 | 69.8 | 78.2 | 51.5 | 66.2 | 69.7 |
153
+ | GLuCoSE v2 | 0.1B | 80.5 | 82.8 | 83.0 | 49.8 | 62.4 | 71.7 |
154
+
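The "Avg." column can be reproduced from the table above as an unweighted mean of the five category scores. The sketch below copies those scores from the table; the variable names are illustrative only.

```python
# Per-category JMTEB scores copied from the table above, in the order
# Class., Ret., STS., Clus., Pair.
scores = {
    "me5-base": [75.1, 80.6, 80.5, 52.6, 62.4],
    "GLuCoSE": [82.6, 69.8, 78.2, 51.5, 66.2],
    "GLuCoSE v2": [80.5, 82.8, 83.0, 49.8, 62.4],
}

# Macro-average: every category counts equally, regardless of dataset size.
macro_avg = {
    model: round(sum(values) / len(values), 1)
    for model, values in scores.items()
}

for model, avg in macro_avg.items():
    print(f"{model}: {avg}")
```

The rounded results match the "Avg." column: 70.2, 69.7, and 71.7.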
155
 
156
  ## Citation
157