mrm8488 committed on
Commit f0eaefd · verified · 1 Parent(s): 73687de

Update README.md

Files changed (1)
  1. README.md +4 -10
README.md CHANGED
@@ -46,8 +46,6 @@ widget:
  - Un gato está mirando hacia la cámara también.
  - '"Sí, no deseo estar presente durante este testimonio", declaró tranquilamente
  Peterson, de 31 años, al juez cuando fue devuelto a su celda.'
- datasets:
- - clibrain/stsb_multi_es_aug_gpt3.5-turbo_2
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
@@ -190,7 +188,7 @@ model-index:
 
  # SentenceTransformer based on nomic-ai/modernbert-embed-base
 
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the stsb_multi_es_augmented (private) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
@@ -201,9 +199,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [n
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
  - **Training Dataset:**
- - [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2)
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
+ - Private stsb dataset
 
  ### Model Sources
 
@@ -307,9 +303,8 @@ You can finetune this model on your own dataset.
 
  ### Training Dataset
 
- #### stsb_multi_es_aug_gpt3.5-turbo_2
+ #### stsb_multi_es_augmented (private)
 
- * Dataset: [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2) at [3567b77](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2/tree/3567b77024bc5cc6372e058c9f05107deb361664)
  * Size: 2,697 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
  * Approximate statistics based on the first 1000 samples:
@@ -347,9 +342,8 @@ You can finetune this model on your own dataset.
 
  ### Evaluation Dataset
 
- #### stsb_multi_es_aug_gpt3.5-turbo_2
+ #### stsb_multi_es_augmented (private)
 
- * Dataset: [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2) at [3567b77](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2/tree/3567b77024bc5cc6372e058c9f05107deb361664)
  * Size: 697 evaluation samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
  * Approximate statistics based on the first 697 samples:
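
The model card touched by this commit lists cosine similarity as the model's similarity function over 768-dimensional embeddings. As a minimal sketch of what that score computes, the pure-Python snippet below compares toy 4-dimensional vectors. The vectors and the `cosine_similarity` helper are illustrative assumptions, not part of the README or of the sentence-transformers API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional stand-ins (assumed values) for 768-dimensional embeddings.
emb_sentence1 = [0.2, 0.1, 0.7, 0.0]
emb_sentence2 = [0.4, 0.2, 1.4, 0.0]  # same direction as emb_sentence1
emb_unrelated = [0.0, 0.0, 0.0, 1.0]  # orthogonal to emb_sentence1

print(cosine_similarity(emb_sentence1, emb_sentence2))
print(cosine_similarity(emb_sentence1, emb_unrelated))
```

Vectors pointing in the same direction score close to 1 regardless of their length, and orthogonal vectors score 0. For a dataset with `sentence1`, `sentence2`, and `score` columns, a score of this kind would be computed per sentence pair and compared against the gold `score`.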