QueryRouter / README.md
metadata
base_model: sentence-transformers/all-MiniLM-L6-v2
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:724
  - loss:CoSENTLoss
widget:
  - source_sentence: Financials
    sentences:
      - What is the financial performance of ABC?
      - What companies operate in the same space as ABC?
      - What standards are used to evaluate the industry?
  - source_sentence: Research
    sentences:
      - What recent studies have been conducted on ABC?
      - What are the key factors considered in rating ABC?
      - How is the rating framework applied to the sector?
  - source_sentence: Criteria
    sentences:
      - >-
        What are the projected economic impacts of inflation on the technology
        industry?
      - What is the process for assessing the creditworthiness of ABC?
      - What are the primary ESG challenges faced by ABC?
  - source_sentence: Financials
    sentences:
      - Can you list the strengths and weaknesses of ABC?
      - What is understood by the term sovereign risk?
      - Can you provide the financial history of ABC?
  - source_sentence: Research
    sentences:
      - >-
        What macroeconomic trends are influencing the credit ratings of the
        automotive industry?
      - Who are the main rivals of ABC?
      - Can you provide the latest research insights on ABC?
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts dev
          type: sts-dev
        metrics:
          - type: pearson_cosine
            value: .nan
            name: Pearson Cosine
          - type: spearman_cosine
            value: .nan
            name: Spearman Cosine
          - type: pearson_manhattan
            value: .nan
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: .nan
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: .nan
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: .nan
            name: Spearman Euclidean
          - type: pearson_dot
            value: .nan
            name: Pearson Dot
          - type: spearman_dot
            value: .nan
            name: Spearman Dot
          - type: pearson_max
            value: .nan
            name: Pearson Max
          - type: spearman_max
            value: .nan
            name: Spearman Max

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
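The Pooling module above enables only `pooling_mode_mean_tokens`, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of masked mean pooling, not the library's code. The helper name `mean_pool` and the 2-dimensional toy vectors are illustrative assumptions; the real model works on 384-dimensional tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy example: three token slots with 2-dimensional vectors; the last slot is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padded slot is excluded from both the sum and the count, which is why its large values do not affect the result.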

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ManishThota/QueryRouter")
# Run inference
sentences = [
    'Research',
    'Can you provide the latest research insights on ABC?',
    'Who are the main rivals of ABC?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
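The repository name and the example sentences suggest that the model is meant to route a question to a short category label such as "Financials" or "Research". That reading is an assumption, not something the card states. Under it, routing reduces to picking the label whose embedding has the highest cosine similarity with the query embedding. The sketch below uses hypothetical helpers (`cosine`, `route`) and hand-written 3-dimensional vectors in place of real 384-dimensional embeddings, so it runs without downloading the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def route(query_vec, label_vecs):
    """Return the label whose embedding is most similar to the query."""
    return max(label_vecs, key=lambda label: cosine(query_vec, label_vecs[label]))


# Toy 3-dimensional stand-ins for real 384-dimensional embeddings.
label_vecs = {
    "Financials": [1.0, 0.1, 0.0],
    "Research": [0.0, 1.0, 0.1],
    "Criteria": [0.1, 0.0, 1.0],
}
query_vec = [0.2, 0.9, 0.1]  # hypothetical embedding of a research-style question
print(route(query_vec, label_vecs))  # Research
```

In practice the vectors would come from `model.encode(...)` applied to the label strings and the query.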

Evaluation

Metrics

Semantic Similarity

| Metric             | Value |
|:-------------------|:------|
| pearson_cosine     | nan   |
| spearman_cosine    | nan   |
| pearson_manhattan  | nan   |
| spearman_manhattan | nan   |
| pearson_euclidean  | nan   |
| spearman_euclidean | nan   |
| pearson_dot        | nan   |
| spearman_dot       | nan   |
| pearson_max        | nan   |
| spearman_max       | nan   |
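Every metric above is nan. A likely cause is that the evaluation dataset described below reports a score of 1.0 as its min, mean, and max. Constant gold labels have zero variance, and a correlation coefficient is undefined in that case. The sketch below shows the effect with a hand-rolled Pearson correlation. The `pearson` helper and the predicted similarities are illustrative assumptions, not values from this model.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else float("nan")


predicted = [0.91, 0.85, 0.97]   # hypothetical cosine similarities
gold = [1.0, 1.0, 1.0]           # every gold score is identical
print(pearson(predicted, gold))  # nan
```

An evaluation set with a range of scores, for example some mismatched label and question pairs scored below 1.0, would give the correlation metrics something to measure.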

Training Details

Training Dataset

Unnamed Dataset

  • Size: 724 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:

    |         | sentence1 | sentence2 | score |
    |:--------|:----------|:----------|:------|
    | type    | string    | string    | float |
    | details | min: 3 tokens, mean: 3.27 tokens, max: 4 tokens | min: 9 tokens, mean: 14.23 tokens, max: 29 tokens | min: 1.0, mean: 1.0, max: 1.0 |

  • Samples:

    | sentence1 | sentence2                          | score |
    |:----------|:-----------------------------------|:------|
    | Rating    | What rating does XYZ have?         | 1.0   |
    | Rating    | Can you provide the rating for XYZ? | 1.0   |
    | Rating    | How is XYZ rated?                  | 1.0   |

  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
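
CoSENTLoss is a ranking objective: whenever one pair has a higher gold score than another, it penalizes the model if the higher-scored pair does not also have the higher cosine similarity. The sketch below is a simplified pure-Python reading of that objective with the `scale` of 20.0 listed above. It is not the library's implementation, and `cosent_loss` is a hypothetical helper.

```python
import math


def cosent_loss(cos_sims, gold_scores, scale=20.0):
    """log(1 + sum of exp(scale * (cos_low - cos_high))) over ordered pairs.

    A term is added for every two pairs where the gold score of one
    (high) is strictly greater than the gold score of the other (low).
    """
    total = 1.0  # the "1 +" inside the logarithm
    for cos_high, gold_high in zip(cos_sims, gold_scores):
        for cos_low, gold_low in zip(cos_sims, gold_scores):
            if gold_high > gold_low:
                total += math.exp(scale * (cos_low - cos_high))
    return math.log(total)


# Mixed labels: the pair with gold 1.0 should out-score the pair with gold 0.0.
print(cosent_loss([0.9, 0.2], [1.0, 0.0]))
# Constant labels, as in the sample rows above: no ordered pairs, so the loss is 0.
print(cosent_loss([0.9, 0.2], [1.0, 1.0]))  # 0.0
```

When every score is 1.0, no pair outranks another and the sum is empty. That would be consistent with the 0.0 loss values reported in the training logs.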
    

Evaluation Dataset

Unnamed Dataset

  • Size: 60 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:

    |         | sentence1 | sentence2 | score |
    |:--------|:----------|:----------|:------|
    | type    | string    | string    | float |
    | details | min: 3 tokens, mean: 3.25 tokens, max: 4 tokens | min: 9 tokens, mean: 12.48 tokens, max: 20 tokens | min: 1.0, mean: 1.0, max: 1.0 |

  • Samples:

    | sentence1 | sentence2                           | score |
    |:----------|:------------------------------------|:------|
    | Rating    | What is the current rating of ABC?  | 1.0   |
    | Rating    | Can you tell me the rating for ABC? | 1.0   |
    | Rating    | What rating has ABC been assigned?  | 1.0   |

  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • save_only_model: True
  • seed: 33
  • fp16: True
  • load_best_model_at_end: True
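
To make `learning_rate`, `warmup_ratio`, and the linear scheduler concrete, the sketch below computes the step count and the learning rate at a few points. It assumes a single device, the batch size of 8 and gradient accumulation of 1 from the full hyperparameter list, and a warmup length of `ceil(total_steps * warmup_ratio)`. `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math


def linear_schedule_lr(step, base_lr, total_steps, warmup_ratio):
    """Learning rate after `step` optimizer updates with linear warmup and decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# 724 samples at batch size 8 gives 91 steps per epoch, 910 over 10 epochs.
steps_per_epoch = math.ceil(724 / 8)
total_steps = steps_per_epoch * 10
print(total_steps)  # 910

# Start of training, end of warmup, and final step.
for step in (0, 91, 910):
    print(step, linear_schedule_lr(step, 2e-05, total_steps, 0.1))
```

The total of 910 steps matches the last row of the training logs. Under these assumptions the rate rises from 0 to 2e-05 over the first 91 steps and then decays linearly to 0.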

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: True
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 33
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss loss sts-dev_spearman_cosine
0.0220 2 - 0.0 nan
0.0440 4 - 0.0 nan
0.0659 6 - 0.0 nan
0.0879 8 - 0.0 nan
0.1099 10 - 0.0 nan
0.1319 12 - 0.0 nan
0.1538 14 - 0.0 nan
0.1758 16 - 0.0 nan
0.1978 18 - 0.0 nan
0.2198 20 - 0.0 nan
0.2418 22 - 0.0 nan
0.2637 24 - 0.0 nan
0.2857 26 - 0.0 nan
0.3077 28 - 0.0 nan
0.3297 30 - 0.0 nan
0.3516 32 - 0.0 nan
0.3736 34 - 0.0 nan
0.3956 36 - 0.0 nan
0.4176 38 - 0.0 nan
0.4396 40 - 0.0 nan
0.4615 42 - 0.0 nan
0.4835 44 - 0.0 nan
0.5055 46 - 0.0 nan
0.5275 48 - 0.0 nan
0.5495 50 - 0.0 nan
0.5714 52 - 0.0 nan
0.5934 54 - 0.0 nan
0.6154 56 - 0.0 nan
0.6374 58 - 0.0 nan
0.6593 60 - 0.0 nan
0.6813 62 - 0.0 nan
0.7033 64 - 0.0 nan
0.7253 66 - 0.0 nan
0.7473 68 - 0.0 nan
0.7692 70 - 0.0 nan
0.7912 72 - 0.0 nan
0.8132 74 - 0.0 nan
0.8352 76 - 0.0 nan
0.8571 78 - 0.0 nan
0.8791 80 - 0.0 nan
0.9011 82 - 0.0 nan
0.9231 84 - 0.0 nan
0.9451 86 - 0.0 nan
0.9670 88 - 0.0 nan
0.9890 90 - 0.0 nan
1.0110 92 - 0.0 nan
1.0330 94 - 0.0 nan
1.0549 96 - 0.0 nan
1.0769 98 - 0.0 nan
1.0989 100 - 0.0 nan
1.1209 102 - 0.0 nan
1.1429 104 - 0.0 nan
1.1648 106 - 0.0 nan
1.1868 108 - 0.0 nan
1.2088 110 - 0.0 nan
1.2308 112 - 0.0 nan
1.2527 114 - 0.0 nan
1.2747 116 - 0.0 nan
1.2967 118 - 0.0 nan
1.3187 120 - 0.0 nan
1.3407 122 - 0.0 nan
1.3626 124 - 0.0 nan
1.3846 126 - 0.0 nan
1.4066 128 - 0.0 nan
1.4286 130 - 0.0 nan
1.4505 132 - 0.0 nan
1.4725 134 - 0.0 nan
1.4945 136 - 0.0 nan
1.5165 138 - 0.0 nan
1.5385 140 - 0.0 nan
1.5604 142 - 0.0 nan
1.5824 144 - 0.0 nan
1.6044 146 - 0.0 nan
1.6264 148 - 0.0 nan
1.6484 150 - 0.0 nan
1.6703 152 - 0.0 nan
1.6923 154 - 0.0 nan
1.7143 156 - 0.0 nan
1.7363 158 - 0.0 nan
1.7582 160 - 0.0 nan
1.7802 162 - 0.0 nan
1.8022 164 - 0.0 nan
1.8242 166 - 0.0 nan
1.8462 168 - 0.0 nan
1.8681 170 - 0.0 nan
1.8901 172 - 0.0 nan
1.9121 174 - 0.0 nan
1.9341 176 - 0.0 nan
1.9560 178 - 0.0 nan
1.9780 180 - 0.0 nan
2.0 182 - 0.0 nan
2.0220 184 - 0.0 nan
2.0440 186 - 0.0 nan
2.0659 188 - 0.0 nan
2.0879 190 - 0.0 nan
2.1099 192 - 0.0 nan
2.1319 194 - 0.0 nan
2.1538 196 - 0.0 nan
2.1758 198 - 0.0 nan
2.1978 200 - 0.0 nan
2.2198 202 - 0.0 nan
2.2418 204 - 0.0 nan
2.2637 206 - 0.0 nan
2.2857 208 - 0.0 nan
2.3077 210 - 0.0 nan
2.3297 212 - 0.0 nan
2.3516 214 - 0.0 nan
2.3736 216 - 0.0 nan
2.3956 218 - 0.0 nan
2.4176 220 - 0.0 nan
2.4396 222 - 0.0 nan
2.4615 224 - 0.0 nan
2.4835 226 - 0.0 nan
2.5055 228 - 0.0 nan
2.5275 230 - 0.0 nan
2.5495 232 - 0.0 nan
2.5714 234 - 0.0 nan
2.5934 236 - 0.0 nan
2.6154 238 - 0.0 nan
2.6374 240 - 0.0 nan
2.6593 242 - 0.0 nan
2.6813 244 - 0.0 nan
2.7033 246 - 0.0 nan
2.7253 248 - 0.0 nan
2.7473 250 - 0.0 nan
2.7692 252 - 0.0 nan
2.7912 254 - 0.0 nan
2.8132 256 - 0.0 nan
2.8352 258 - 0.0 nan
2.8571 260 - 0.0 nan
2.8791 262 - 0.0 nan
2.9011 264 - 0.0 nan
2.9231 266 - 0.0 nan
2.9451 268 - 0.0 nan
2.9670 270 - 0.0 nan
2.9890 272 - 0.0 nan
3.0110 274 - 0.0 nan
3.0330 276 - 0.0 nan
3.0549 278 - 0.0 nan
3.0769 280 - 0.0 nan
3.0989 282 - 0.0 nan
3.1209 284 - 0.0 nan
3.1429 286 - 0.0 nan
3.1648 288 - 0.0 nan
3.1868 290 - 0.0 nan
3.2088 292 - 0.0 nan
3.2308 294 - 0.0 nan
3.2527 296 - 0.0 nan
3.2747 298 - 0.0 nan
3.2967 300 - 0.0 nan
3.3187 302 - 0.0 nan
3.3407 304 - 0.0 nan
3.3626 306 - 0.0 nan
3.3846 308 - 0.0 nan
3.4066 310 - 0.0 nan
3.4286 312 - 0.0 nan
3.4505 314 - 0.0 nan
3.4725 316 - 0.0 nan
3.4945 318 - 0.0 nan
3.5165 320 - 0.0 nan
3.5385 322 - 0.0 nan
3.5604 324 - 0.0 nan
3.5824 326 - 0.0 nan
3.6044 328 - 0.0 nan
3.6264 330 - 0.0 nan
3.6484 332 - 0.0 nan
3.6703 334 - 0.0 nan
3.6923 336 - 0.0 nan
3.7143 338 - 0.0 nan
3.7363 340 - 0.0 nan
3.7582 342 - 0.0 nan
3.7802 344 - 0.0 nan
3.8022 346 - 0.0 nan
3.8242 348 - 0.0 nan
3.8462 350 - 0.0 nan
3.8681 352 - 0.0 nan
3.8901 354 - 0.0 nan
3.9121 356 - 0.0 nan
3.9341 358 - 0.0 nan
3.9560 360 - 0.0 nan
3.9780 362 - 0.0 nan
4.0 364 - 0.0 nan
4.0220 366 - 0.0 nan
4.0440 368 - 0.0 nan
4.0659 370 - 0.0 nan
4.0879 372 - 0.0 nan
4.1099 374 - 0.0 nan
4.1319 376 - 0.0 nan
4.1538 378 - 0.0 nan
4.1758 380 - 0.0 nan
4.1978 382 - 0.0 nan
4.2198 384 - 0.0 nan
4.2418 386 - 0.0 nan
4.2637 388 - 0.0 nan
4.2857 390 - 0.0 nan
4.3077 392 - 0.0 nan
4.3297 394 - 0.0 nan
4.3516 396 - 0.0 nan
4.3736 398 - 0.0 nan
4.3956 400 - 0.0 nan
4.4176 402 - 0.0 nan
4.4396 404 - 0.0 nan
4.4615 406 - 0.0 nan
4.4835 408 - 0.0 nan
4.5055 410 - 0.0 nan
4.5275 412 - 0.0 nan
4.5495 414 - 0.0 nan
4.5714 416 - 0.0 nan
4.5934 418 - 0.0 nan
4.6154 420 - 0.0 nan
4.6374 422 - 0.0 nan
4.6593 424 - 0.0 nan
4.6813 426 - 0.0 nan
4.7033 428 - 0.0 nan
4.7253 430 - 0.0 nan
4.7473 432 - 0.0 nan
4.7692 434 - 0.0 nan
4.7912 436 - 0.0 nan
4.8132 438 - 0.0 nan
4.8352 440 - 0.0 nan
4.8571 442 - 0.0 nan
4.8791 444 - 0.0 nan
4.9011 446 - 0.0 nan
4.9231 448 - 0.0 nan
4.9451 450 - 0.0 nan
4.9670 452 - 0.0 nan
4.9890 454 - 0.0 nan
5.0110 456 - 0.0 nan
5.0330 458 - 0.0 nan
5.0549 460 - 0.0 nan
5.0769 462 - 0.0 nan
5.0989 464 - 0.0 nan
5.1209 466 - 0.0 nan
5.1429 468 - 0.0 nan
5.1648 470 - 0.0 nan
5.1868 472 - 0.0 nan
5.2088 474 - 0.0 nan
5.2308 476 - 0.0 nan
5.2527 478 - 0.0 nan
5.2747 480 - 0.0 nan
5.2967 482 - 0.0 nan
5.3187 484 - 0.0 nan
5.3407 486 - 0.0 nan
5.3626 488 - 0.0 nan
5.3846 490 - 0.0 nan
5.4066 492 - 0.0 nan
5.4286 494 - 0.0 nan
5.4505 496 - 0.0 nan
5.4725 498 - 0.0 nan
5.4945 500 0.0 0.0 nan
5.5165 502 - 0.0 nan
5.5385 504 - 0.0 nan
5.5604 506 - 0.0 nan
5.5824 508 - 0.0 nan
5.6044 510 - 0.0 nan
5.6264 512 - 0.0 nan
5.6484 514 - 0.0 nan
5.6703 516 - 0.0 nan
5.6923 518 - 0.0 nan
5.7143 520 - 0.0 nan
5.7363 522 - 0.0 nan
5.7582 524 - 0.0 nan
5.7802 526 - 0.0 nan
5.8022 528 - 0.0 nan
5.8242 530 - 0.0 nan
5.8462 532 - 0.0 nan
5.8681 534 - 0.0 nan
5.8901 536 - 0.0 nan
5.9121 538 - 0.0 nan
5.9341 540 - 0.0 nan
5.9560 542 - 0.0 nan
5.9780 544 - 0.0 nan
6.0 546 - 0.0 nan
6.0220 548 - 0.0 nan
6.0440 550 - 0.0 nan
6.0659 552 - 0.0 nan
6.0879 554 - 0.0 nan
6.1099 556 - 0.0 nan
6.1319 558 - 0.0 nan
6.1538 560 - 0.0 nan
6.1758 562 - 0.0 nan
6.1978 564 - 0.0 nan
6.2198 566 - 0.0 nan
6.2418 568 - 0.0 nan
6.2637 570 - 0.0 nan
6.2857 572 - 0.0 nan
6.3077 574 - 0.0 nan
6.3297 576 - 0.0 nan
6.3516 578 - 0.0 nan
6.3736 580 - 0.0 nan
6.3956 582 - 0.0 nan
6.4176 584 - 0.0 nan
6.4396 586 - 0.0 nan
6.4615 588 - 0.0 nan
6.4835 590 - 0.0 nan
6.5055 592 - 0.0 nan
6.5275 594 - 0.0 nan
6.5495 596 - 0.0 nan
6.5714 598 - 0.0 nan
6.5934 600 - 0.0 nan
6.6154 602 - 0.0 nan
6.6374 604 - 0.0 nan
6.6593 606 - 0.0 nan
6.6813 608 - 0.0 nan
6.7033 610 - 0.0 nan
6.7253 612 - 0.0 nan
6.7473 614 - 0.0 nan
6.7692 616 - 0.0 nan
6.7912 618 - 0.0 nan
6.8132 620 - 0.0 nan
6.8352 622 - 0.0 nan
6.8571 624 - 0.0 nan
6.8791 626 - 0.0 nan
6.9011 628 - 0.0 nan
6.9231 630 - 0.0 nan
6.9451 632 - 0.0 nan
6.9670 634 - 0.0 nan
6.9890 636 - 0.0 nan
7.0110 638 - 0.0 nan
7.0330 640 - 0.0 nan
7.0549 642 - 0.0 nan
7.0769 644 - 0.0 nan
7.0989 646 - 0.0 nan
7.1209 648 - 0.0 nan
7.1429 650 - 0.0 nan
7.1648 652 - 0.0 nan
7.1868 654 - 0.0 nan
7.2088 656 - 0.0 nan
7.2308 658 - 0.0 nan
7.2527 660 - 0.0 nan
7.2747 662 - 0.0 nan
7.2967 664 - 0.0 nan
7.3187 666 - 0.0 nan
7.3407 668 - 0.0 nan
7.3626 670 - 0.0 nan
7.3846 672 - 0.0 nan
7.4066 674 - 0.0 nan
7.4286 676 - 0.0 nan
7.4505 678 - 0.0 nan
7.4725 680 - 0.0 nan
7.4945 682 - 0.0 nan
7.5165 684 - 0.0 nan
7.5385 686 - 0.0 nan
7.5604 688 - 0.0 nan
7.5824 690 - 0.0 nan
7.6044 692 - 0.0 nan
7.6264 694 - 0.0 nan
7.6484 696 - 0.0 nan
7.6703 698 - 0.0 nan
7.6923 700 - 0.0 nan
7.7143 702 - 0.0 nan
7.7363 704 - 0.0 nan
7.7582 706 - 0.0 nan
7.7802 708 - 0.0 nan
7.8022 710 - 0.0 nan
7.8242 712 - 0.0 nan
7.8462 714 - 0.0 nan
7.8681 716 - 0.0 nan
7.8901 718 - 0.0 nan
7.9121 720 - 0.0 nan
7.9341 722 - 0.0 nan
7.9560 724 - 0.0 nan
7.9780 726 - 0.0 nan
8.0 728 - 0.0 nan
8.0220 730 - 0.0 nan
8.0440 732 - 0.0 nan
8.0659 734 - 0.0 nan
8.0879 736 - 0.0 nan
8.1099 738 - 0.0 nan
8.1319 740 - 0.0 nan
8.1538 742 - 0.0 nan
8.1758 744 - 0.0 nan
8.1978 746 - 0.0 nan
8.2198 748 - 0.0 nan
8.2418 750 - 0.0 nan
8.2637 752 - 0.0 nan
8.2857 754 - 0.0 nan
8.3077 756 - 0.0 nan
8.3297 758 - 0.0 nan
8.3516 760 - 0.0 nan
8.3736 762 - 0.0 nan
8.3956 764 - 0.0 nan
8.4176 766 - 0.0 nan
8.4396 768 - 0.0 nan
8.4615 770 - 0.0 nan
8.4835 772 - 0.0 nan
8.5055 774 - 0.0 nan
8.5275 776 - 0.0 nan
8.5495 778 - 0.0 nan
8.5714 780 - 0.0 nan
8.5934 782 - 0.0 nan
8.6154 784 - 0.0 nan
8.6374 786 - 0.0 nan
8.6593 788 - 0.0 nan
8.6813 790 - 0.0 nan
8.7033 792 - 0.0 nan
8.7253 794 - 0.0 nan
8.7473 796 - 0.0 nan
8.7692 798 - 0.0 nan
8.7912 800 - 0.0 nan
8.8132 802 - 0.0 nan
8.8352 804 - 0.0 nan
8.8571 806 - 0.0 nan
8.8791 808 - 0.0 nan
8.9011 810 - 0.0 nan
8.9231 812 - 0.0 nan
8.9451 814 - 0.0 nan
8.9670 816 - 0.0 nan
8.9890 818 - 0.0 nan
9.0110 820 - 0.0 nan
9.0330 822 - 0.0 nan
9.0549 824 - 0.0 nan
9.0769 826 - 0.0 nan
9.0989 828 - 0.0 nan
9.1209 830 - 0.0 nan
9.1429 832 - 0.0 nan
9.1648 834 - 0.0 nan
9.1868 836 - 0.0 nan
9.2088 838 - 0.0 nan
9.2308 840 - 0.0 nan
9.2527 842 - 0.0 nan
9.2747 844 - 0.0 nan
9.2967 846 - 0.0 nan
9.3187 848 - 0.0 nan
9.3407 850 - 0.0 nan
9.3626 852 - 0.0 nan
9.3846 854 - 0.0 nan
9.4066 856 - 0.0 nan
9.4286 858 - 0.0 nan
9.4505 860 - 0.0 nan
9.4725 862 - 0.0 nan
9.4945 864 - 0.0 nan
9.5165 866 - 0.0 nan
9.5385 868 - 0.0 nan
9.5604 870 - 0.0 nan
9.5824 872 - 0.0 nan
9.6044 874 - 0.0 nan
9.6264 876 - 0.0 nan
9.6484 878 - 0.0 nan
9.6703 880 - 0.0 nan
9.6923 882 - 0.0 nan
9.7143 884 - 0.0 nan
9.7363 886 - 0.0 nan
9.7582 888 - 0.0 nan
9.7802 890 - 0.0 nan
9.8022 892 - 0.0 nan
9.8242 894 - 0.0 nan
9.8462 896 - 0.0 nan
9.8681 898 - 0.0 nan
9.8901 900 - 0.0 nan
9.9121 902 - 0.0 nan
9.9341 904 - 0.0 nan
9.9560 906 - 0.0 nan
9.9780 908 - 0.0 nan
10.0 910 - 0.0 nan
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.0.1+cu118
  • Accelerate: 0.31.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}