SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2 on the allstats-semantic-search-synthetic-dataset-v1 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: allstats-semantic-search-synthetic-dataset-v1
  • Number of Parameters: 278M (F32 safetensors)

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
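
The Pooling module above performs mean pooling: token embeddings from the XLM-RoBERTa encoder are averaged across the sequence, with padding positions masked out via the attention mask. A minimal sketch of that operation (illustrative only, not the library's internal code):

import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).float()     # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)   # sum over non-padding tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)        # number of real tokens per sentence
    return summed / counts                          # (batch, 768) sentence embeddings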

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-2")
# Run inference
sentences = [
    # "What percentage of deflation occurred in March 2010?" (the typos "ysng"/"paa" mirror the noisy queries in the training data)
    'Berapa persen deflasi ysng terjadi paa Maret 2010?',
    # "In March 2010 there was deflation of 0.14 percent."
    'Pada Bulan Maret 2010 Terjadi Deflasi Sebesar 0,14 Persen.',
    # "Inflation in September 2008 was 0.97 percent."
    'Inflasi September 2008 sebesar 0,97 persen.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
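
For semantic search, queries and candidate documents are encoded separately and ranked by cosine similarity. A small sketch using document titles drawn from this card's training samples (the ranking shown is illustrative, not a guaranteed output):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-2")

query = "informasi tentang potensi desa jambi tahun 2005"
docs = [
    "Statistik Potensi Desa Provinsi Jambi 2005",
    "Upah Buruh Juli 2020",
    "Direktori Perusahaan Konstruksi 2013 Buku 6 Maluku dan Papua",
]
query_emb = model.encode([query])
doc_embs = model.encode(docs)

# Rank documents by cosine similarity to the query
scores = model.similarity(query_emb, doc_embs)[0]   # shape: [len(docs)]
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx]:.4f}  {docs[idx]}")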

Evaluation

Metrics

Semantic Similarity

| Metric | allstats-semantic-search-v1-2-dev | allstat-semantic-search-v1-test |
|:---|:---|:---|
| pearson_cosine | 0.9923 | 0.9929 |
| spearman_cosine | 0.9293 | 0.9283 |
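
These scores come from Sentence Transformers' EmbeddingSimilarityEvaluator, which correlates the model's cosine similarities with the gold float labels. A hedged sketch of running such an evaluation (the three pairs below are samples from this card; the real evaluators use the full dev/test splits):

from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-2")

dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=[
        "Informasi potensi deas di Maluku 011",
        "informasi tentang potensi desa jambi tahun 2005",
        "studi tentang kemiskinan urban",
    ],
    sentences2=[
        "Statistik Potensi Desa Provinsi Maluku 2011",
        "Statistik Potensi Desa Provinsi Jambi 2005",
        "Perkembangan Mingguan Harga Eceran Beberapa Bahan Pokok di Ibukota Provinsi Seluruh Indonesia (Juli-Desember 2018)",
    ],
    scores=[0.87, 0.85, 0.1],
    main_similarity=SimilarityFunction.COSINE,
    name="allstats-semantic-search-v1-2-dev",
)
print(dev_evaluator(model))  # reports pearson_cosine and spearman_cosine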

Training Details

Training Dataset

allstats-semantic-search-synthetic-dataset-v1

  • Dataset: allstats-semantic-search-synthetic-dataset-v1 at c477abf
  • Size: 212,930 training samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:

    |  | query | doc | label |
    |:---|:---|:---|:---|
    | type | string | string | float |
    | details | min: 5 tokens, mean: 11.52 tokens, max: 32 tokens | min: 4 tokens, mean: 14.85 tokens, max: 70 tokens | min: 0.0, mean: 0.51, max: 1.0 |

  • Samples:

    | query | doc | label |
    |:---|:---|:---|
    | studi tentang kemiskinan urban | Perkembangan Mingguan Harga Eceran Beberapa Bahan Pokok di Ibukota Provinsi Seluruh Indonesia (Juli-Desember 2018) | 0.1 |
    | Harga gabah di tingkat produsen bulan September | Upah Buruh Juli 2020 | 0.1 |
    | Data perusahaan konstruksi di wilayah timur Indonesia thn 2013 | Direktori Perusahaan Konstruksi 2013 Buku 6 Maluku dan Papua | 0.92 |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    
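CosineSimilarityLoss computes the cosine similarity between the two embeddings of a pair and regresses it onto the float label with the MSELoss shown above. A minimal sketch of the objective:

import torch
import torch.nn.functional as F

def cosine_similarity_loss(query_emb: torch.Tensor, doc_emb: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Predicted similarity in [-1, 1] for each (query, doc) pair in the batch
    cos_sim = F.cosine_similarity(query_emb, doc_emb, dim=1)
    # Mean squared error against the gold labels (here in [0, 1])
    return F.mse_loss(cos_sim, labels)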

Evaluation Dataset

allstats-semantic-search-synthetic-dataset-v1

  • Dataset: allstats-semantic-search-synthetic-dataset-v1 at c477abf
  • Size: 26,616 evaluation samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:

    |  | query | doc | label |
    |:---|:---|:---|:---|
    | type | string | string | float |
    | details | min: 5 tokens, mean: 11.33 tokens, max: 53 tokens | min: 4 tokens, mean: 14.6 tokens, max: 69 tokens | min: 0.0, mean: 0.5, max: 1.0 |

  • Samples:

    | query | doc | label |
    |:---|:---|:---|
    | Informasi potensi deas di Maluku 011 | Statistik Potensi Desa Provinsi Maluku 2011 | 0.87 |
    | Berapa persen kenaikan jumlah penumpang angkutan udara internasional pada Januari 2024 dibandingkan Desember 2023? | Kenaikan jumlah penumpang bulan lainnya | 0.0 |
    | informasi tentang potensi desa jambi tahun 2005 | Statistik Potensi Desa Provinsi Jambi 2005 | 0.85 |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 6
  • warmup_ratio: 0.1
  • fp16: True
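
A hedged sketch of a training script that wires these non-default values into the SentenceTransformerTrainer API (the dataset Hub ID, output path, and split names below are assumptions based on this card, not verified):

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
# Assumed Hub ID for the allstats-semantic-search-synthetic-dataset-v1 dataset
dataset = load_dataset("yahyaabd/allstats-semantic-search-synthetic-dataset-v1")

args = SentenceTransformerTrainingArguments(
    output_dir="allstats-semantic-search-model-v1-2",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=6,
    warmup_ratio=0.1,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # assumed split name for the 26,616 evaluation samples
    loss=CosineSimilarityLoss(model),
)
trainer.train()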

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss allstats-semantic-search-v1-2-dev_spearman_cosine allstat-semantic-search-v1-test_spearman_cosine
0.0376 250 0.0734 0.0464 0.6927 -
0.0751 500 0.043 0.0362 0.7146 -
0.1127 750 0.0353 0.0288 0.7364 -
0.1503 1000 0.0271 0.0274 0.7571 -
0.1878 1250 0.0241 0.0225 0.7738 -
0.2254 1500 0.0228 0.0203 0.7699 -
0.2630 1750 0.0207 0.0197 0.7881 -
0.3005 2000 0.0187 0.0191 0.7900 -
0.3381 2250 0.0194 0.0183 0.7794 -
0.3757 2500 0.0182 0.0178 0.7870 -
0.4132 2750 0.0198 0.0183 0.8009 -
0.4508 3000 0.0189 0.0182 0.7912 -
0.4884 3250 0.0177 0.0168 0.7963 -
0.5259 3500 0.0178 0.0173 0.7920 -
0.5635 3750 0.017 0.0183 0.8014 -
0.6011 4000 0.0186 0.0180 0.7777 -
0.6386 4250 0.0187 0.0167 0.7976 -
0.6762 4500 0.015 0.0154 0.8194 -
0.7137 4750 0.0158 0.0157 0.8062 -
0.7513 5000 0.0152 0.0148 0.8117 -
0.7889 5250 0.0148 0.0149 0.8115 -
0.8264 5500 0.0146 0.0141 0.8175 -
0.8640 5750 0.0154 0.0144 0.7951 -
0.9016 6000 0.0155 0.0152 0.8163 -
0.9391 6250 0.0145 0.0136 0.8216 -
0.9767 6500 0.0149 0.0149 0.8140 -
1.0143 6750 0.0132 0.0132 0.8179 -
1.0518 7000 0.0108 0.0124 0.8232 -
1.0894 7250 0.0109 0.0120 0.8330 -
1.1270 7500 0.0112 0.0132 0.8219 -
1.1645 7750 0.0116 0.0124 0.8226 -
1.2021 8000 0.0121 0.0120 0.8151 -
1.2397 8250 0.0109 0.0119 0.8384 -
1.2772 8500 0.0103 0.0114 0.8415 -
1.3148 8750 0.0105 0.0116 0.8191 -
1.3524 9000 0.0104 0.0122 0.8292 -
1.3899 9250 0.0108 0.0117 0.8292 -
1.4275 9500 0.011 0.0118 0.8339 -
1.4651 9750 0.0105 0.0106 0.8367 -
1.5026 10000 0.0093 0.0098 0.8467 -
1.5402 10250 0.0105 0.0101 0.8334 -
1.5778 10500 0.0102 0.0106 0.8324 -
1.6153 10750 0.01 0.0097 0.8472 -
1.6529 11000 0.0106 0.0098 0.8378 -
1.6905 11250 0.0088 0.0095 0.8531 -
1.7280 11500 0.0085 0.0095 0.8409 -
1.7656 11750 0.0089 0.0091 0.8431 -
1.8032 12000 0.0083 0.0088 0.8524 -
1.8407 12250 0.0082 0.0088 0.8591 -
1.8783 12500 0.0078 0.0092 0.8478 -
1.9159 12750 0.009 0.0085 0.8480 -
1.9534 13000 0.0082 0.0089 0.8465 -
1.9910 13250 0.0076 0.0085 0.8564 -
2.0285 13500 0.0059 0.0082 0.8602 -
2.0661 13750 0.0073 0.0081 0.8558 -
2.1037 14000 0.0075 0.0081 0.8492 -
2.1412 14250 0.0066 0.0077 0.8520 -
2.1788 14500 0.0066 0.0076 0.8599 -
2.2164 14750 0.007 0.0080 0.8589 -
2.2539 15000 0.0065 0.0076 0.8552 -
2.2915 15250 0.0071 0.0075 0.8604 -
2.3291 15500 0.0062 0.0073 0.8714 -
2.3666 15750 0.0058 0.0069 0.8714 -
2.4042 16000 0.0066 0.0072 0.8570 -
2.4418 16250 0.0058 0.0069 0.8757 -
2.4793 16500 0.0059 0.0067 0.8726 -
2.5169 16750 0.0057 0.0067 0.8663 -
2.5545 17000 0.0058 0.0068 0.8703 -
2.5920 17250 0.0058 0.0068 0.8765 -
2.6296 17500 0.006 0.0067 0.8729 -
2.6672 17750 0.0057 0.0067 0.8689 -
2.7047 18000 0.0055 0.0065 0.8750 -
2.7423 18250 0.0056 0.0066 0.8734 -
2.7799 18500 0.0053 0.0062 0.8745 -
2.8174 18750 0.0053 0.0062 0.8814 -
2.8550 19000 0.0048 0.0063 0.8839 -
2.8926 19250 0.005 0.0063 0.8741 -
2.9301 19500 0.0063 0.0061 0.8752 -
2.9677 19750 0.0052 0.0059 0.8790 -
3.0053 20000 0.0049 0.0058 0.8825 -
3.0428 20250 0.0042 0.0059 0.8787 -
3.0804 20500 0.0043 0.0056 0.8839 -
3.1180 20750 0.0036 0.0058 0.8870 -
3.1555 21000 0.004 0.0056 0.8825 -
3.1931 21250 0.0041 0.0056 0.8884 -
3.2307 21500 0.004 0.0054 0.8872 -
3.2682 21750 0.0044 0.0052 0.8838 -
3.3058 22000 0.0036 0.0053 0.8904 -
3.3434 22250 0.0036 0.0054 0.8898 -
3.3809 22500 0.0037 0.0051 0.8938 -
3.4185 22750 0.0036 0.0051 0.8953 -
3.4560 23000 0.0036 0.0051 0.8935 -
3.4936 23250 0.004 0.0049 0.8955 -
3.5312 23500 0.0033 0.0051 0.8912 -
3.5687 23750 0.0037 0.0048 0.8995 -
3.6063 24000 0.0037 0.0048 0.8887 -
3.6439 24250 0.0037 0.0048 0.8921 -
3.6814 24500 0.0034 0.0046 0.9001 -
3.7190 24750 0.0041 0.0048 0.9008 -
3.7566 25000 0.0037 0.0048 0.8928 -
3.7941 25250 0.0038 0.0049 0.8949 -
3.8317 25500 0.0037 0.0045 0.9029 -
3.8693 25750 0.0034 0.0057 0.8962 -
3.9068 26000 0.0035 0.0047 0.8963 -
3.9444 26250 0.0039 0.0044 0.9026 -
3.9820 26500 0.0034 0.0044 0.8994 -
4.0195 26750 0.0029 0.0042 0.9039 -
4.0571 27000 0.0025 0.0040 0.9047 -
4.0947 27250 0.0027 0.0041 0.9033 -
4.1322 27500 0.0027 0.0041 0.9034 -
4.1698 27750 0.0025 0.0040 0.9040 -
4.2074 28000 0.0033 0.0041 0.9079 -
4.2449 28250 0.0027 0.0040 0.9078 -
4.2825 28500 0.0024 0.0040 0.9059 -
4.3201 28750 0.0026 0.0040 0.9084 -
4.3576 29000 0.0021 0.0039 0.9101 -
4.3952 29250 0.0024 0.0040 0.9081 -
4.4328 29500 0.0024 0.0039 0.9128 -
4.4703 29750 0.0027 0.0039 0.9067 -
4.5079 30000 0.003 0.0038 0.9120 -
4.5455 30250 0.0024 0.0037 0.9140 -
4.5830 30500 0.0025 0.0037 0.9116 -
4.6206 30750 0.0023 0.0037 0.9124 -
4.6582 31000 0.0026 0.0036 0.9161 -
4.6957 31250 0.0021 0.0036 0.9155 -
4.7333 31500 0.0025 0.0035 0.9147 -
4.7708 31750 0.0023 0.0035 0.9171 -
4.8084 32000 0.0024 0.0035 0.9153 -
4.8460 32250 0.002 0.0035 0.9153 -
4.8835 32500 0.0025 0.0034 0.9173 -
4.9211 32750 0.0018 0.0035 0.9180 -
4.9587 33000 0.0021 0.0035 0.9201 -
4.9962 33250 0.0019 0.0035 0.9205 -
5.0338 33500 0.0016 0.0034 0.9223 -
5.0714 33750 0.0016 0.0034 0.9217 -
5.1089 34000 0.0015 0.0033 0.9208 -
5.1465 34250 0.002 0.0034 0.9234 -
5.1841 34500 0.0017 0.0033 0.9212 -
5.2216 34750 0.002 0.0033 0.9212 -
5.2592 35000 0.0015 0.0032 0.9241 -
5.2968 35250 0.002 0.0031 0.9232 -
5.3343 35500 0.0017 0.0031 0.9251 -
5.3719 35750 0.0015 0.0031 0.9256 -
5.4095 36000 0.0018 0.0031 0.9246 -
5.4470 36250 0.0015 0.0030 0.9257 -
5.4846 36500 0.0017 0.0030 0.9261 -
5.5222 36750 0.0018 0.0030 0.9251 -
5.5597 37000 0.0016 0.0030 0.9270 -
5.5973 37250 0.0016 0.0029 0.9275 -
5.6349 37500 0.0017 0.0029 0.9283 -
5.6724 37750 0.0015 0.0029 0.9277 -
5.7100 38000 0.0017 0.0029 0.9286 -
5.7476 38250 0.0015 0.0029 0.9284 -
5.7851 38500 0.0015 0.0029 0.9286 -
5.8227 38750 0.0014 0.0029 0.9287 -
5.8603 39000 0.0015 0.0028 0.9290 -
5.8978 39250 0.0014 0.0028 0.9291 -
5.9354 39500 0.0014 0.0028 0.9293 -
5.9730 39750 0.0015 0.0028 0.9293 -
6.0 39930 - - - 0.9283

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.2.2+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
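
To approximate this environment, the listed versions can be pinned (a sketch; nearby compatible versions usually work too, and PyTorch 2.2.2+cu121 should be installed per your CUDA setup):

pip install sentence-transformers==3.3.1 transformers==4.47.1 accelerate==1.2.1 datasets==3.2.0 tokenizers==0.21.0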

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
