State-of-the-Art Results Comparison (MTEB STS Multilingual Leaderboard)

| Dataset | State-of-the-art (Multi) | STSb-XLM-RoBERTa-base | STS Multilingual MPNet base v2 |
|---|---|---|---|
| Average | 73.17 | 71.68 | 73.89 |
| STS17 (ar-ar) | 81.87 | 80.43 | 81.24 |
| STS17 (en-ar) | 81.22 | 76.3 | 77.03 |
| STS17 (en-de) | 87.3 | 91.06 | 91.09 |
| STS17 (en-tr) | 77.18 | 80.74 | 79.87 |
| STS17 (es-en) | 88.24 | 83.09 | 85.53 |
| STS17 (es-es) | 88.25 | 84.16 | 87.27 |
| STS17 (fr-en) | 88.06 | 91.33 | 90.68 |
| STS17 (it-en) | 89.68 | 92.87 | 92.47 |
| STS17 (ko-ko) | 83.69 | 97.67 | 97.66 |
| STS17 (nl-en) | 88.25 | 92.13 | 91.15 |
| STS22 (ar) | 58.67 | 58.67 | 62.66 |
| STS22 (de) | 60.12 | 52.17 | 57.74 |
| STS22 (de-en) | 60.92 | 58.5 | 57.5 |
| STS22 (de-fr) | 67.79 | 51.28 | 57.99 |
| STS22 (de-pl) | 58.69 | 44.56 | 44.22 |
| STS22 (es) | 68.57 | 63.68 | 66.21 |
| STS22 (es-en) | 78.8 | 70.65 | 75.18 |
| STS22 (es-it) | 75.04 | 60.88 | 66.25 |
| STS22 (fr) | 83.75 | 76.46 | 78.76 |
| STS22 (fr-pl) | 84.52 | 84.52 | 84.52 |
| STS22 (it) | 79.28 | 66.73 | 68.47 |
| STS22 (pl) | 42.08 | 41.18 | 43.36 |
| STS22 (pl-en) | 77.5 | 64.35 | 75.11 |
| STS22 (ru) | 61.71 | 58.59 | 58.67 |
| STS22 (tr) | 68.72 | 57.52 | 63.84 |
| STS22 (zh-en) | 71.88 | 60.69 | 65.37 |
| STSb | 89.86 | 95.05 | 95.15 |

Bold indicates the best result in each row.

SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
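
The Pooling module above has only pooling_mode_mean_tokens enabled, so each sentence embedding is the average of its token vectors. The following is a minimal sketch of that idea in plain Python, assuming padding positions are excluded via the attention mask. The function name mean_pool and the toy 4-dimensional vectors are illustrative only; this is not the library's implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padding positions.

    token_embeddings: list of per-token vectors (lists of floats)
    attention_mask:   list of 1 (real token) or 0 (padding), same length
    """
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy sentence: 3 token slots, 4 dimensions (the model itself uses 768).
tokens = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]]
mask = [1, 1, 0]  # the last slot is padding and is ignored
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0, 4.0, 5.0]
```

The output has one value per embedding dimension regardless of how many tokens the sentence contains, which is why inputs of different lengths all map to 768-dimensional vectors.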

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Gameselo/STS-multilingual-mpnet-base-v2")
# Run inference
sentences = [
    '一个女人正在洗澡。',
    'A woman is taking a bath.',
    'En jente børster håret sitt',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
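
To make the similarity matrix less opaque, here is a small sketch of pairwise cosine similarity in plain Python. It assumes the similarity function in use is cosine similarity, which is believed to be the library default when a model does not configure another one. The helper names and the 2-dimensional toy vectors are illustrative, not real model output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """All-pairs cosine similarity, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for three sentence embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(x, 4) for x in row])
```

The matrix is symmetric with ones on the diagonal, so for N sentences only the N·(N−1)/2 entries above the diagonal carry distinct information.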

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.9551
spearman_cosine 0.9593
pearson_manhattan 0.927
spearman_manhattan 0.9383
pearson_euclidean 0.9278
spearman_euclidean 0.9394
pearson_dot 0.876
spearman_dot 0.8865
pearson_max 0.9551
spearman_max 0.9593

Evaluation results vs SOTA results

Metric Value
pearson_cosine 0.948
spearman_cosine 0.9515
pearson_manhattan 0.9252
spearman_manhattan 0.9352
pearson_euclidean 0.9258
spearman_euclidean 0.9364
pearson_dot 0.8443
spearman_dot 0.8435
pearson_max 0.948
spearman_max 0.9515
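
Each metric name in the tables above pairs a correlation (Pearson or Spearman) with a similarity function (cosine, Manhattan, Euclidean, or dot product); in both tables the *_max rows coincide with the largest value in their family, here the cosine rows. The sketch below shows how the two correlations could be computed between gold labels and predicted similarities. The gold and predicted values are made-up illustrative numbers, not data from this model, and the helpers are a plain-Python approximation rather than the evaluator the library uses.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear agreement between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical gold labels and predicted cosine similarities for five pairs.
gold = [0.0, 0.2, 0.5, 0.8, 1.0]
predicted = [0.10, 0.15, 0.40, 0.95, 0.90]
print(round(spearman(gold, predicted), 4))  # 0.9
print(round(pearson(gold, predicted), 4))
```

Spearman depends only on the ordering of the predictions, so it is unaffected by any monotonic rescaling of the similarity scores, whereas Pearson is sensitive to how linearly the scores track the labels.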

Training Details

Training Dataset

Unnamed Dataset

  • Size: 226,547 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | label |
    |---|---|---|---|
    | type    | string | string | float |
    | details | min: 3 tokens, mean: 20.05 tokens, max: 128 tokens | min: 4 tokens, mean: 19.94 tokens, max: 128 tokens | min: 0.0, mean: 1.92, max: 398.6 |
  • Samples:
    | sentence_0 | sentence_1 | label |
    |---|---|---|
    | Bir kadın makineye dikiş dikiyor. | Bir kadın biraz et ekiyor. | 0.12 |
    | Snowden 'gegeven vluchtelingendocument door Ecuador'. | Snowden staat op het punt om uit Moskou te vliegen | 0.24000000953674316 |
    | Czarny pies idzie mostem przez wodę | Czarny pies nie idzie mostem przez wodę | 0.74000000954 |
  • Loss: AnglELoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_angle_sim"
    }
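
As a rough illustration of what the scale parameter does, the sketch below implements a CoSENT-style ranking objective, which AnglELoss is understood to build on: whenever one pair has a lower gold label than another, its similarity score is pushed below the other's. This is a simplified plain-Python sketch under that assumption. The scores are plain numbers standing in for the output of pairwise_angle_sim, which is not reproduced here, and the function name ranking_loss is illustrative.

```python
import math


def ranking_loss(scores, labels, scale=20.0):
    """log(1 + sum of exp(scale * (s_i - s_j))) over pairs with label_i < label_j."""
    terms = [0.0]  # the leading zero contributes the "1 +" inside the log
    for s_i, y_i in zip(scores, labels):
        for s_j, y_j in zip(scores, labels):
            if y_i < y_j:
                # Penalised when the lower-labelled pair scores higher.
                terms.append(scale * (s_i - s_j))
    peak = max(terms)  # subtract the maximum for a numerically stable log-sum-exp
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


labels = [0.0, 0.5, 1.0]
well_ordered = ranking_loss([0.1, 0.5, 0.9], labels)
badly_ordered = ranking_loss([0.9, 0.5, 0.1], labels)
print(well_ordered < badly_ordered)  # True
```

Only the relative order of the labels enters the objective, so their absolute scale does not matter; a larger scale value sharpens the penalty for each mis-ordered pair.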
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
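
With lr_scheduler_type set to linear, warmup_steps at 0, and learning_rate at 5e-05, the learning rate would be expected to fall in a straight line from 5e-05 to zero over the run. The sketch below illustrates that schedule in plain Python; the total of 8850 steps is taken from the last row of the training logs, and the function name linear_lr is illustrative rather than a library call.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 8850  # final step in the training logs
for step in (0, 4425, 8850):
    print(step, linear_lr(step, total_steps))
```

Halfway through training (step 4425) the rate is half its initial value, and it reaches zero exactly at the final step.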

Training Logs

Epoch Step Training Loss sts-dev_spearman_cosine sts-test_spearman_cosine
0.5650 500 10.9426 - -
1.0 885 - 0.9202 -
1.1299 1000 9.7184 - -
1.6949 1500 9.5348 - -
2.0 1770 - 0.9400 -
2.2599 2000 9.4412 - -
2.8249 2500 9.3097 - -
3.0 2655 - 0.9489 -
3.3898 3000 9.2357 - -
3.9548 3500 9.1594 - -
4.0 3540 - 0.9528 -
4.5198 4000 9.0963 - -
5.0 4425 - 0.9553 -
5.0847 4500 9.0382 - -
5.6497 5000 8.9837 - -
6.0 5310 - 0.9567 -
6.2147 5500 8.9403 - -
6.7797 6000 8.8841 - -
7.0 6195 - 0.9581 -
7.3446 6500 8.8513 - -
7.9096 7000 8.81 - -
8.0 7080 - 0.9582 -
8.4746 7500 8.8069 - -
9.0 7965 - 0.9589 -
9.0395 8000 8.7616 - -
9.6045 8500 8.7521 - -
10.0 8850 - 0.9593 0.6266
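
The step counts in this table follow from the dataset size and batch size reported earlier. A short worked calculation, assuming a single device (so the per-device batch size of 256 is the effective batch size) and that the final partial batch is kept, as dataloader_drop_last: False suggests:

```python
import math

train_samples = 226_547  # training set size
batch_size = 256         # per_device_train_batch_size, assuming one device
epochs = 10              # num_train_epochs

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 885 8850


def epoch_at(step):
    """Fractional epoch reached after a given optimizer step."""
    return round(step / steps_per_epoch, 4)


print(epoch_at(500))   # 0.565
print(epoch_at(1000))  # 1.1299
```

These values line up with the log rows: evaluation happens every 885 steps, training ends at step 8850, and step 500 falls at epoch 0.5650.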

Framework Versions

  • Python: 3.9.7
  • Sentence Transformers: 3.0.0
  • Transformers: 4.40.1
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.29.3
  • Datasets: 2.19.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

AnglELoss

@misc{li2023angleoptimized,
    title={AnglE-optimized Text Embeddings}, 
    author={Xianming Li and Jing Li},
    year={2023},
    eprint={2309.12871},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}