SentenceTransformer based on FacebookAI/xlm-roberta-base

This is a sentence-transformers model finetuned from FacebookAI/xlm-roberta-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/xlm-roberta-base
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
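
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name mean_pool, the toy 2-dimensional vectors, and the attention-mask handling (padding tokens are skipped) are assumptions made for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in summed]


# Toy input: three real tokens and one padding token,
# with 2 dimensions instead of the model's 768.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

The padding vector [9.0, 9.0] does not affect the result because its mask value is 0.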

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("luanafelbarros/xlm-roberta-base-multilingual-mkqa")
# Run inference
sentences = [
    'where does food wars anime end in the manga',
    '《食戟之靈》漫畫幾時完',
    'zh_hk',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
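
The model description lists cosine similarity as the similarity function. As a rough illustration of what the pairwise similarity matrix contains, here is a pure-Python sketch on toy vectors. The helper names cosine_similarity and similarity_matrix and the toy data are hypothetical and are not part of the sentence-transformers API.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional "embeddings" standing in for the 768-dimensional ones.
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print([round(value, 4) for value in row])
```

Cosine similarity ignores vector length, so each vector scores 1 against itself, and the matrix is symmetric.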

Evaluation

Metrics

Knowledge Distillation

  • Datasets: MSE-val-en-to-ar, MSE-val-en-to-da, MSE-val-en-to-de, MSE-val-en-to-en, MSE-val-en-to-es, MSE-val-en-to-fi, MSE-val-en-to-fr, MSE-val-en-to-he, MSE-val-en-to-hu, MSE-val-en-to-it, MSE-val-en-to-ja, MSE-val-en-to-ko, MSE-val-en-to-km, MSE-val-en-to-ms, MSE-val-en-to-nl, MSE-val-en-to-no, MSE-val-en-to-pl, MSE-val-en-to-pt, MSE-val-en-to-ru, MSE-val-en-to-sv, MSE-val-en-to-th, MSE-val-en-to-tr, MSE-val-en-to-vi, MSE-val-en-to-zh_cn, MSE-val-en-to-zh_hk and MSE-val-en-to-zh_tw
  • Evaluated with MSEEvaluator
    Dataset               negative_mse
    MSE-val-en-to-ar      -19.9351
    MSE-val-en-to-da      -16.2271
    MSE-val-en-to-de      -17.0315
    MSE-val-en-to-en      -14.7466
    MSE-val-en-to-es      -16.739
    MSE-val-en-to-fi      -17.6995
    MSE-val-en-to-fr      -16.8551
    MSE-val-en-to-he      -19.1143
    MSE-val-en-to-hu      -17.8625
    MSE-val-en-to-it      -16.9311
    MSE-val-en-to-ja      -18.7746
    MSE-val-en-to-ko      -19.6834
    MSE-val-en-to-km      -19.3393
    MSE-val-en-to-ms      -16.4985
    MSE-val-en-to-nl      -15.9824
    MSE-val-en-to-no      -16.2615
    MSE-val-en-to-pl      -17.5108
    MSE-val-en-to-pt      -16.5283
    MSE-val-en-to-ru      -17.3583
    MSE-val-en-to-sv      -16.3128
    MSE-val-en-to-th      -17.5869
    MSE-val-en-to-tr      -17.3905
    MSE-val-en-to-vi      -17.175
    MSE-val-en-to-zh_cn   -18.1255
    MSE-val-en-to-zh_hk   -18.1899
    MSE-val-en-to-zh_tw   -18.6787
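
To make the negative_mse metric concrete: it is the mean squared error between teacher and student embeddings, negated so that values closer to zero are better. The sketch below is a pure-Python illustration and not the MSEEvaluator code. The factor of 100 reflects my understanding of how the library scales the value and should be treated as an assumption, as should the function name and the toy vectors.

```python
def negative_mse(teacher_embeddings, student_embeddings, scale=100.0):
    """Negated, scaled mean squared error over all embedding elements."""
    squared_errors = [
        (t - s) ** 2
        for teacher, student in zip(teacher_embeddings, student_embeddings)
        for t, s in zip(teacher, student)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    # scale=100.0 is an assumption about the evaluator's scaling.
    return -scale * mse


# Toy 2-dimensional embeddings for two sentences.
teacher = [[0.5, -0.5], [1.0, 0.0]]
student = [[0.0, -0.5], [1.0, 1.0]]
score = negative_mse(teacher, student)
print(score)
```

A student that reproduces the teacher exactly scores 0, the best possible value.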

Training Details

Training Dataset

Unnamed Dataset

  • Size: 234,000 training samples
  • Columns: english, non-english, target, and label
  • Approximate statistics based on the first 1000 samples:
    Column       Type    Details
    english      string  min: 10 tokens, mean: 13.21 tokens, max: 19 tokens
    non-english  string  min: 7 tokens, mean: 13.87 tokens, max: 31 tokens
    target       string  min: 3 tokens, mean: 3.38 tokens, max: 6 tokens
    label        list    size: 768 elements
  • Samples:
    english | non-english | target | label
    what are all the wizard of oz movies | the wizard of oz ما هي كل افلام | ar | [0.5303382277488708, -0.31762194633483887, -0.2945275902748108, -0.6602655649185181, -1.4617066383361816, ...]
    what are all the wizard of oz movies | hvad er alle troldmanden fra oz filmene | da | [0.5303382277488708, -0.31762194633483887, -0.2945275902748108, -0.6602655649185181, -1.4617066383361816, ...]
    what are all the wizard of oz movies | Wie heißen alle Der Zauberer von Oz Filme | de | [0.5303382277488708, -0.31762194633483887, -0.2945275902748108, -0.6602655649185181, -1.4617066383361816, ...]
  • Loss: MSELoss
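
In the samples above, every translation of the same English sentence shares one label vector. That matches the knowledge-distillation setup cited for MSELoss, where the student's embeddings of both the English sentence and its translation are pulled toward the teacher's embedding of the English sentence. The sketch below shows that objective for a single sample in pure Python. The function names, the averaging of the two terms, and the toy vectors are assumptions for illustration, not the library's code.

```python
def mean_squared_distance(a, b):
    """Mean of element-wise squared differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(label, student_english, student_translation):
    """Average the two regression terms that share one teacher label."""
    english_term = mean_squared_distance(label, student_english)
    translation_term = mean_squared_distance(label, student_translation)
    return (english_term + translation_term) / 2


# Toy 2-dimensional vectors; real labels have 768 elements.
label = [1.0, 0.0]                # teacher embedding of the English sentence
student_english = [1.0, 0.0]      # student already matches on English
student_translation = [0.0, 0.0]  # student is still off on the translation
loss = distillation_loss(label, student_english, student_translation)
print(loss)
```

Only the translation term contributes here, which is the signal that moves translated sentences toward their English counterpart in the shared vector space.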

Evaluation Dataset

Unnamed Dataset

  • Size: 13,000 evaluation samples
  • Columns: english, non-english, target, and label
  • Approximate statistics based on the first 1000 samples:
    Column       Type    Details
    english      string  min: 10 tokens, mean: 13.05 tokens, max: 22 tokens
    non-english  string  min: 5 tokens, mean: 13.79 tokens, max: 34 tokens
    target       string  min: 3 tokens, mean: 3.38 tokens, max: 6 tokens
    label        list    size: 768 elements
  • Samples:
    english | non-english | target | label
    a change to the constitution must be approved by | يجب الموافقة على تغيير الدستور | ar | [1.0918692350387573, 0.8024187684059143, 0.23035858571529388, 0.16300565004348755, -0.6033854484558105, ...]
    a change to the constitution must be approved by | en ændring af forfatningen skal godkendes af | da | [1.0918692350387573, 0.8024187684059143, 0.23035858571529388, 0.16300565004348755, -0.6033854484558105, ...]
    a change to the constitution must be approved by | Eine Änderung der Verfassung muss gebilligt werden durch | de | [1.0918692350387573, 0.8024187684059143, 0.23035858571529388, 0.16300565004348755, -0.6033854484558105, ...]
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • fp16: True
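
As a worked example of how these settings interact, the sketch below derives the step counts and a linear warmup-then-decay learning rate from the values reported in this card (234,000 samples, batch size 64, 3 epochs, learning rate 2e-05, warmup ratio 0.1). It assumes a single device and no gradient accumulation. The device count is not stated in the card, although the logged epoch fractions (for example step 100 at epoch 0.0273) are consistent with it. The function name and the exact rounding are assumptions, not the trainer's code.

```python
import math

TRAIN_SAMPLES = 234_000
BATCH_SIZE = 64
EPOCHS = 3
BASE_LR = 2e-5
WARMUP_RATIO = 0.1

# The last partial batch is kept (dataloader_drop_last: False), so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def learning_rate(step):
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, 500, warmup_steps, 6000, total_steps):
    print(step, learning_rate(step))
```

Under these assumptions there are 3657 steps per epoch and 10971 steps in total, with the peak learning rate reached after 1098 warmup steps.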

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss MSE-val-en-to-ar_negative_mse MSE-val-en-to-da_negative_mse MSE-val-en-to-de_negative_mse MSE-val-en-to-en_negative_mse MSE-val-en-to-es_negative_mse MSE-val-en-to-fi_negative_mse MSE-val-en-to-fr_negative_mse MSE-val-en-to-he_negative_mse MSE-val-en-to-hu_negative_mse MSE-val-en-to-it_negative_mse MSE-val-en-to-ja_negative_mse MSE-val-en-to-ko_negative_mse MSE-val-en-to-km_negative_mse MSE-val-en-to-ms_negative_mse MSE-val-en-to-nl_negative_mse MSE-val-en-to-no_negative_mse MSE-val-en-to-pl_negative_mse MSE-val-en-to-pt_negative_mse MSE-val-en-to-ru_negative_mse MSE-val-en-to-sv_negative_mse MSE-val-en-to-th_negative_mse MSE-val-en-to-tr_negative_mse MSE-val-en-to-vi_negative_mse MSE-val-en-to-zh_cn_negative_mse MSE-val-en-to-zh_hk_negative_mse MSE-val-en-to-zh_tw_negative_mse
0.0273 100 0.7471 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.0547 200 0.5344 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.0820 300 0.4011 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.1094 400 0.3686 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.1367 500 0.3558 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.1641 600 0.3527 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.1914 700 0.3479 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.2188 800 0.3373 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.2461 900 0.3315 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.2734 1000 0.3243 0.3143 -31.0036 -30.4995 -30.5974 -30.3236 -30.5190 -30.6680 -30.5902 -30.8805 -30.7873 -30.6191 -30.7149 -30.7932 -30.8955 -30.5254 -30.5554 -30.5243 -30.6522 -30.5353 -30.5800 -30.5240 -30.7348 -30.7127 -30.6429 -30.5608 -30.5626 -30.5837
0.3008 1100 0.3175 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.3281 1200 0.3126 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.3555 1300 0.3082 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.3828 1400 0.3049 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.4102 1500 0.3019 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.4375 1600 0.2988 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.4649 1700 0.2979 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.4922 1800 0.2926 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.5196 1900 0.2885 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.5469 2000 0.2879 0.2787 -26.4435 -25.3475 -25.5656 -24.8280 -25.4096 -25.8103 -25.4399 -26.1209 -25.8292 -25.5216 -26.0866 -26.4725 -26.2586 -25.5986 -25.3495 -25.2907 -25.6509 -25.3489 -25.4795 -25.3660 -25.7628 -25.7572 -25.6763 -25.7273 -25.7893 -25.8524
0.5742 2100 0.2843 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.6016 2200 0.2821 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.6289 2300 0.2795 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.6563 2400 0.2808 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.6836 2500 0.2771 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.7110 2600 0.2745 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.7383 2700 0.272 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.7657 2800 0.2711 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.7930 2900 0.2685 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.8203 3000 0.267 0.2638 -24.2447 -22.3985 -22.7542 -21.5879 -22.5929 -23.2891 -22.6798 -23.7047 -23.1739 -22.7708 -23.5962 -24.2250 -23.9269 -22.8039 -22.2681 -22.3432 -22.9390 -22.5717 -22.8201 -22.4143 -23.1236 -23.1100 -22.9658 -23.0786 -23.2390 -23.3243
0.8477 3100 0.2718 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.8750 3200 0.2674 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.9024 3300 0.2662 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.9297 3400 0.2631 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.9571 3500 0.26 - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.9844 3600 0.2586 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.0118 3700 0.2575 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.0391 3800 0.2549 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.0664 3900 0.2529 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.0938 4000 0.2511 0.2469 -22.9347 -20.4196 -20.9011 -19.3762 -20.7242 -21.5322 -20.7711 -22.3208 -21.5176 -20.9047 -22.1008 -22.8701 -22.4827 -20.7383 -20.2571 -20.3842 -21.1960 -20.6791 -21.0474 -20.4460 -21.3999 -21.3937 -21.1382 -21.5265 -21.6918 -21.8791
1.1211 4100 0.2502 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.1485 4200 0.2491 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.1758 4300 0.248 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.2032 4400 0.2463 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.2305 4500 0.2445 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.2579 4600 0.2432 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.2852 4700 0.2419 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.3126 4800 0.2405 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.3399 4900 0.2404 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.3672 5000 0.2394 0.2354 -21.7963 -18.8622 -19.4636 -17.6703 -19.2473 -20.1437 -19.3378 -21.1200 -20.1560 -19.4587 -20.9473 -21.6343 -21.2979 -19.1964 -18.6653 -18.8517 -19.8565 -19.1500 -19.6760 -18.9243 -19.9718 -19.9191 -19.6695 -20.2707 -20.4090 -20.6846
1.3946 5100 0.2375 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.4219 5200 0.2374 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.4493 5300 0.236 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.4766 5400 0.2335 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.5040 5500 0.2346 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.5313 5600 0.2335 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.5587 5700 0.232 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.5860 5800 0.2314 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.6133 5900 0.2304 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.6407 6000 0.2303 0.2289 -21.1967 -17.9192 -18.5833 -16.6276 -18.3510 -19.2977 -18.4551 -20.3960 -19.3202 -18.5573 -20.1420 -20.9358 -20.6084 -18.2396 -17.7261 -17.9322 -19.0167 -18.2305 -18.8471 -17.9794 -19.1440 -19.0105 -18.7845 -19.4778 -19.6095 -19.9643
1.6680 6100 0.2294 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.6954 6200 0.229 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.7227 6300 0.2275 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.7501 6400 0.2285 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.7774 6500 0.2279 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.8048 6600 0.2275 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.8321 6700 0.2256 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.8594 6800 0.2259 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.8868 6900 0.2237 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.9141 7000 0.2232 0.2242 -20.6888 -17.2295 -17.9547 -15.8517 -17.7267 -18.6854 -17.8191 -19.8853 -18.7432 -17.9054 -19.5866 -20.4321 -20.1381 -17.5215 -16.9982 -17.2683 -18.4340 -17.5295 -18.2454 -17.3006 -18.5072 -18.3554 -18.1438 -18.9634 -19.0843 -19.4826
1.9415 7100 0.2231 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.9688 7200 0.2225 - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.9962 7300 0.2235 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.0235 7400 0.2224 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.0509 7500 0.2206 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.0782 7600 0.2205 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.1056 7700 0.2196 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.1329 7800 0.22 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.1602 7900 0.2188 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.1876 8000 0.2184 0.2209 -20.3380 -16.7285 -17.5078 -15.3142 -17.2366 -18.1903 -17.3419 -19.5057 -18.2970 -17.4283 -19.1880 -20.0709 -19.7478 -17.0291 -16.5125 -16.7629 -17.9586 -17.0487 -17.7907 -16.8237 -18.0585 -17.8714 -17.6527 -18.5499 -18.6504 -19.0688
2.2149 8100 0.2189 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.2423 8200 0.2178 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.2696 8300 0.2185 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.2970 8400 0.2175 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.3243 8500 0.2183 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.3517 8600 0.2176 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.3790 8700 0.2169 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.4063 8800 0.2172 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.4337 8900 0.2153 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.4610 9000 0.2162 0.2187 -20.1028 -16.4147 -17.2107 -14.9595 -16.9406 -17.9101 -17.0441 -19.2680 -18.0594 -17.1276 -18.9403 -19.8407 -19.5169 -16.6976 -16.1859 -16.4554 -17.6828 -16.7360 -17.5378 -16.5167 -17.7710 -17.5853 -17.3717 -18.3032 -18.3627 -18.8466
2.4884 9100 0.2159 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.5157 9200 0.2161 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.5431 9300 0.2148 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.5704 9400 0.2148 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.5978 9500 0.2154 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.6251 9600 0.2142 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.6524 9700 0.2144 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.6798 9800 0.215 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.7071 9900 0.2142 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.7345 10000 0.2139 0.2174 -19.9351 -16.2271 -17.0315 -14.7466 -16.7390 -17.6995 -16.8551 -19.1143 -17.8625 -16.9311 -18.7746 -19.6834 -19.3393 -16.4985 -15.9824 -16.2615 -17.5108 -16.5283 -17.3583 -16.3128 -17.5869 -17.3905 -17.1750 -18.1255 -18.1899 -18.6787
2.7618 10100 0.2134 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.7892 10200 0.2141 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.8165 10300 0.2147 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.8439 10400 0.2138 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.8712 10500 0.2133 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.8986 10600 0.2129 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.9259 10700 0.2129 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.9532 10800 0.2129 - - - - - - - - - - - - - - - - - - - - - - - - - - -
2.9806 10900 0.214 - - - - - - - - - - - - - - - - - - - - - - - - - - -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}