SentenceTransformer based on BAAI/bge-m3
This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-m3
- Maximum Sequence Length: 1536 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1536, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
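The architecture above uses CLS-token pooling followed by L2 normalisation. As a rough illustration of what those two stages do to the transformer output, here is a minimal pure-Python sketch. The function name `cls_pool_and_normalize` and the toy 4-dimensional vectors are made up for this example; the real model produces 1024-dimensional vectors and the library's own modules handle batching and padding.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Hypothetical helper: token_embeddings is a list of per-token vectors
    for one input, with the CLS token's vector first."""
    cls_vector = token_embeddings[0]  # CLS pooling keeps only the first token's vector
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0.0:
        return list(cls_vector)  # avoid division by zero for an all-zero vector
    return [x / norm for x in cls_vector]  # L2 normalisation to unit length

# Toy hidden states for three tokens (made-up numbers, 4 dimensions instead of 1024).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # CLS token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
sentence_vec = cls_pool_and_normalize(tokens)
print(sentence_vec)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the cosine similarity between two embeddings equals their dot product.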
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("seongil-dn/bge-m3-mrl-330")
# Run inference
sentences = [
'어떤 사람의 연금 수령액을 증가시키면 연금재정이 어려워져?',
'한편, 제19대국회에서는 소득대체율을 높이지 않는 대신, 연금급여산식의 기준이 되는 기준소득월액의 상ㆍ하한액을 인상함으로써 가입자 전체의 소득평균을 높여 보험급여를 인상하는 방안도 논의되었다. 이 방안은 소득재분배 부문에 해당하는 국민연금의 A값을 상향하여 소득재분배 기능을 강화하는 장점을 가진 반면, 보험료가 인상되는 저소득층 가입자와 영세사업장, 그리고 고소득 사업장가입자와 사업장의 연금보험료 부담이 증가하기 때문에, 경제 및 산업계의 반발로 이어질 가능성도 있다. 또한 고소득 가입자들의 연급수급액의 증가는 시간의 경과에 따라 연금재정에 추가적인 부담을 주게 된다는 것이다.',
'다. 재정<br>□ 저출산·고령화의 진전으로 세원이 되는 생산가능인구의 비중은 줄어들고, 연금급여 및 의료비 지출 등은 늘어남에 따라 재정수지 부담은 가중될 전망<br>― 출산율이 하락하면 전체 인구 중 생산가능인구의 비율이 감소하고 따라서 세수 감소로 이어질 가능성<br>― 반면, 고령화로 인해 연금수급자가 증가하면 연금 및 의료비 등의 재정지출 확대로 이어질 가능성<br>― 국민연금 가입자 중 노령연금 수급율은 인구감소 및 은퇴자 증가에 따라 2010년 13.3%, 2030년 41.9%, 2050년 88.5%로 급증할 전망<br>□ IMF에 따르면 GDP 대비 재정수지는 생산가능인구비율 1% 증가 시 0.06%p 개선되는 반면, 노인인구 1% 증가시 0.46%p 악화<br>― 또한, OECD는 고령화로 인해 노인관련 재정지출이 급증해 주요국의 2050년 재정수지가 적자를 기록할 것으로 전망',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
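The citation section lists MatryoshkaLoss, which suggests the embeddings may remain useful when truncated to a shorter prefix. The dimensions used during training are not stated in this card, so treat the following as a sketch of the general technique rather than a documented property of this model. The helper names and the toy 4-dimensional vectors are hypothetical; with the library itself, the `truncate_dim` argument of `SentenceTransformer` serves a similar purpose.

```python
import math

def truncate_and_normalize(vec, dim):
    """Hypothetical helper: keep the first `dim` values and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up unit-length vectors standing in for 1024-dimensional embeddings.
query = [0.5, 0.5, 0.5, 0.5]
doc = [0.5, 0.5, -0.5, 0.5]

full_score = cosine(query, doc)
short_score = cosine(truncate_and_normalize(query, 2), truncate_and_normalize(doc, 2))
print(full_score, short_score)
```

Truncated vectors must be re-normalised before comparison, since a prefix of a unit vector is generally not unit length. Scores at a shorter dimension can differ from the full-dimension scores, so retrieval quality should be checked at each size you intend to use.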
Training Details
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- gradient_accumulation_steps: 32
- learning_rate: 3e-05
- weight_decay: 0.01
- warmup_ratio: 0.05
- fp16: True
- gradient_checkpointing: True
- gradient_checkpointing_kwargs: {'use_reentrant': False}
- batch_sampler: no_duplicates
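To make these settings concrete, the sketch below works out the number of samples per optimizer update on one device and the learning rate under a linear schedule with warmup. It assumes 330 total optimizer steps (the last step in the training log) and mirrors the usual linear warmup and decay rule; the number of devices is not stated in this card, so the global batch size is left out. The function name `linear_schedule_lr` is made up for this example.

```python
import math

PER_DEVICE_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 32
LEARNING_RATE = 3e-05
WARMUP_RATIO = 0.05
TOTAL_STEPS = 330  # assumption: taken from the last row of the training log

# Samples contributing to one optimizer update on a single device.
samples_per_update_per_device = PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# Warmup length derived from the ratio, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def linear_schedule_lr(step):
    """Hypothetical helper: learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Linear decay from the peak down to 0 at the final step.
    remaining = (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * max(0.0, remaining)

print(samples_per_update_per_device)  # 1024
print(warmup_steps)                   # 17
for step in (0, 17, 165, 330):
    print(step, linear_schedule_lr(step))
```

With gradient accumulation, each optimizer step sees 1024 samples per device, although in-batch negatives for a contrastive loss are normally drawn within each forward batch of 32 rather than across the accumulated batches.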
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 32
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 3e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.05
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: True
- gradient_checkpointing_kwargs: {'use_reentrant': False}
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0091 | 1 | 15.81 |
0.0181 | 2 | 15.9499 |
0.0272 | 3 | 15.3393 |
0.0363 | 4 | 15.4563 |
0.0453 | 5 | 15.5322 |
0.0544 | 6 | 16.0348 |
0.0635 | 7 | 15.3445 |
0.0725 | 8 | 15.7129 |
0.0816 | 9 | 14.4393 |
0.0907 | 10 | 13.4846 |
0.0997 | 11 | 12.5233 |
0.1088 | 12 | 12.1728 |
0.1178 | 13 | 11.9232 |
0.1269 | 14 | 11.5308 |
0.1360 | 15 | 10.7525 |
0.1450 | 16 | 10.393 |
0.1541 | 17 | 9.7346 |
0.1632 | 18 | 9.4875 |
0.1722 | 19 | 9.2608 |
0.1813 | 20 | 8.7966 |
0.1904 | 21 | 8.5579 |
0.1994 | 22 | 8.4993 |
0.2085 | 23 | 8.1505 |
0.2176 | 24 | 8.5027 |
0.2266 | 25 | 7.9795 |
0.2357 | 26 | 7.5782 |
0.2448 | 27 | 7.68 |
0.2538 | 28 | 7.539 |
0.2629 | 29 | 7.5871 |
0.2720 | 30 | 7.2676 |
0.2810 | 31 | 6.9613 |
0.2901 | 32 | 6.89 |
0.2992 | 33 | 6.7585 |
0.3082 | 34 | 6.7286 |
0.3173 | 35 | 6.754 |
0.3263 | 36 | 6.7466 |
0.3354 | 37 | 6.6096 |
0.3445 | 38 | 6.5864 |
0.3535 | 39 | 6.5235 |
0.3626 | 40 | 6.5429 |
0.3717 | 41 | 6.4971 |
0.3807 | 42 | 6.4463 |
0.3898 | 43 | 6.332 |
0.3989 | 44 | 6.1275 |
0.4079 | 45 | 6.2551 |
0.4170 | 46 | 6.1372 |
0.4261 | 47 | 6.1075 |
0.4351 | 48 | 6.1408 |
0.4442 | 49 | 6.062 |
0.4533 | 50 | 5.9831 |
0.4623 | 51 | 5.9956 |
0.4714 | 52 | 5.8332 |
0.4805 | 53 | 5.7447 |
0.4895 | 54 | 5.9531 |
0.4986 | 55 | 5.911 |
0.5076 | 56 | 5.8576 |
0.5167 | 57 | 5.8116 |
0.5258 | 58 | 5.6564 |
0.5348 | 59 | 5.7289 |
0.5439 | 60 | 5.7514 |
0.5530 | 61 | 5.5991 |
0.5620 | 62 | 5.553 |
0.5711 | 63 | 5.4728 |
0.5802 | 64 | 5.6212 |
0.5892 | 65 | 5.6554 |
0.5983 | 66 | 5.4389 |
0.6074 | 67 | 5.3669 |
0.6164 | 68 | 5.5667 |
0.6255 | 69 | 5.4106 |
0.6346 | 70 | 5.3122 |
0.6436 | 71 | 5.4145 |
0.6527 | 72 | 5.3794 |
0.6618 | 73 | 5.269 |
0.6708 | 74 | 5.3583 |
0.6799 | 75 | 5.311 |
0.6890 | 76 | 5.2061 |
0.6980 | 77 | 5.133 |
0.7071 | 78 | 5.4036 |
0.7161 | 79 | 5.2761 |
0.7252 | 80 | 5.0696 |
0.7343 | 81 | 5.3648 |
0.7433 | 82 | 5.0591 |
0.7524 | 83 | 5.074 |
0.7615 | 84 | 5.1789 |
0.7705 | 85 | 5.0147 |
0.7796 | 86 | 5.251 |
0.7887 | 87 | 5.1282 |
0.7977 | 88 | 5.1111 |
0.8068 | 89 | 5.2096 |
0.8159 | 90 | 5.0734 |
0.8249 | 91 | 4.9202 |
0.8340 | 92 | 5.0058 |
0.8431 | 93 | 5.0928 |
0.8521 | 94 | 4.9845 |
0.8612 | 95 | 5.0683 |
0.8703 | 96 | 5.0267 |
0.8793 | 97 | 5.0821 |
0.8884 | 98 | 4.8806 |
0.8975 | 99 | 5.0043 |
0.9065 | 100 | 4.888 |
0.9156 | 101 | 5.0629 |
0.9246 | 102 | 5.0454 |
0.9337 | 103 | 4.9619 |
0.9428 | 104 | 4.9217 |
0.9518 | 105 | 4.7401 |
0.9609 | 106 | 4.8068 |
0.9700 | 107 | 4.8151 |
0.9790 | 108 | 4.8689 |
0.9881 | 109 | 5.0193 |
0.9972 | 110 | 4.706 |
1.0062 | 111 | 4.8057 |
1.0153 | 112 | 4.7279 |
1.0244 | 113 | 4.7721 |
1.0334 | 114 | 4.7767 |
1.0425 | 115 | 4.669 |
1.0516 | 116 | 4.8533 |
1.0606 | 117 | 4.8634 |
1.0697 | 118 | 4.9135 |
1.0788 | 119 | 4.7629 |
1.0878 | 120 | 4.7479 |
1.0969 | 121 | 4.743 |
1.1059 | 122 | 4.5606 |
1.1150 | 123 | 4.6933 |
1.1241 | 124 | 4.6659 |
1.1331 | 125 | 4.7131 |
1.1422 | 126 | 4.7059 |
1.1513 | 127 | 4.5701 |
1.1603 | 128 | 4.4892 |
1.1694 | 129 | 4.6497 |
1.1785 | 130 | 4.4814 |
1.1875 | 131 | 4.2669 |
1.1966 | 132 | 4.4983 |
1.2057 | 133 | 4.431 |
1.2147 | 134 | 4.414 |
1.2238 | 135 | 4.3975 |
1.2329 | 136 | 4.3101 |
1.2419 | 137 | 4.3422 |
1.2510 | 138 | 4.476 |
1.2601 | 139 | 4.6629 |
1.2691 | 140 | 4.3559 |
1.2782 | 141 | 4.2049 |
1.2873 | 142 | 4.303 |
1.2963 | 143 | 4.3053 |
1.3054 | 144 | 4.2366 |
1.3144 | 145 | 4.5165 |
1.3235 | 146 | 4.2634 |
1.3326 | 147 | 4.4295 |
1.3416 | 148 | 4.2595 |
1.3507 | 149 | 4.3753 |
1.3598 | 150 | 4.3454 |
1.3688 | 151 | 4.2618 |
1.3779 | 152 | 4.4016 |
1.3870 | 153 | 4.2672 |
1.3960 | 154 | 4.1824 |
1.4051 | 155 | 4.3268 |
1.4142 | 156 | 4.091 |
1.4232 | 157 | 4.3111 |
1.4323 | 158 | 4.2397 |
1.4414 | 159 | 4.1694 |
1.4504 | 160 | 4.2119 |
1.4595 | 161 | 4.1292 |
1.4686 | 162 | 4.1154 |
1.4776 | 163 | 4.1638 |
1.4867 | 164 | 4.3548 |
1.4958 | 165 | 4.2137 |
1.5048 | 166 | 4.1888 |
1.5139 | 167 | 4.2609 |
1.5229 | 168 | 4.2644 |
1.5320 | 169 | 4.2183 |
1.5411 | 170 | 4.2414 |
1.5501 | 171 | 4.242 |
1.5592 | 172 | 4.0547 |
1.5683 | 173 | 4.1509 |
1.5773 | 174 | 4.247 |
1.5864 | 175 | 4.3103 |
1.5955 | 176 | 4.0845 |
1.6045 | 177 | 4.0918 |
1.6136 | 178 | 4.1582 |
1.6227 | 179 | 4.2982 |
1.6317 | 180 | 4.0515 |
1.6408 | 181 | 4.0738 |
1.6499 | 182 | 4.2416 |
1.6589 | 183 | 4.1212 |
1.6680 | 184 | 4.174 |
1.6771 | 185 | 4.1369 |
1.6861 | 186 | 3.9908 |
1.6952 | 187 | 4.1155 |
1.7042 | 188 | 3.9893 |
1.7133 | 189 | 4.2362 |
1.7224 | 190 | 4.074 |
1.7314 | 191 | 4.0604 |
1.7405 | 192 | 4.0065 |
1.7496 | 193 | 4.0041 |
1.7586 | 194 | 4.0428 |
1.7677 | 195 | 4.0094 |
1.7768 | 196 | 3.962 |
1.7858 | 197 | 4.1932 |
1.7949 | 198 | 4.133 |
1.8040 | 199 | 4.1344 |
1.8130 | 200 | 4.1004 |
1.8221 | 201 | 4.0633 |
1.8312 | 202 | 4.0545 |
1.8402 | 203 | 4.0434 |
1.8493 | 204 | 4.0576 |
1.8584 | 205 | 4.0892 |
1.8674 | 206 | 4.1945 |
1.8765 | 207 | 4.0809 |
1.8856 | 208 | 4.0655 |
1.8946 | 209 | 4.155 |
1.9037 | 210 | 4.0801 |
1.9127 | 211 | 4.0837 |
1.9218 | 212 | 4.1487 |
1.9309 | 213 | 4.0574 |
1.9399 | 214 | 4.0952 |
1.9490 | 215 | 4.0414 |
1.9581 | 216 | 3.9645 |
1.9671 | 217 | 4.0327 |
1.9762 | 218 | 3.9183 |
1.9853 | 219 | 4.1204 |
1.9943 | 220 | 4.0043 |
2.0034 | 221 | 3.904 |
2.0125 | 222 | 4.0489 |
2.0215 | 223 | 4.0316 |
2.0306 | 224 | 3.9649 |
2.0397 | 225 | 3.891 |
2.0487 | 226 | 4.0352 |
2.0578 | 227 | 4.1811 |
2.0669 | 228 | 4.1212 |
2.0759 | 229 | 4.2356 |
2.0850 | 230 | 4.1295 |
2.0941 | 231 | 4.0231 |
2.1031 | 232 | 3.914 |
2.1122 | 233 | 3.916 |
2.1212 | 234 | 3.8657 |
2.1303 | 235 | 4.0986 |
2.1394 | 236 | 3.9774 |
2.1484 | 237 | 3.9112 |
2.1575 | 238 | 3.8232 |
2.1666 | 239 | 3.85 |
2.1756 | 240 | 3.8874 |
2.1847 | 241 | 3.6777 |
2.1938 | 242 | 3.7898 |
2.2028 | 243 | 3.8527 |
2.2119 | 244 | 3.7038 |
2.2210 | 245 | 3.9404 |
2.2300 | 246 | 3.7468 |
2.2391 | 247 | 3.7905 |
2.2482 | 248 | 3.8356 |
2.2572 | 249 | 3.9682 |
2.2663 | 250 | 3.9372 |
2.2754 | 251 | 3.7579 |
2.2844 | 252 | 3.6927 |
2.2935 | 253 | 3.7372 |
2.3025 | 254 | 3.6125 |
2.3116 | 255 | 4.0475 |
2.3207 | 256 | 3.7422 |
2.3297 | 257 | 3.8646 |
2.3388 | 258 | 3.6637 |
2.3479 | 259 | 3.8496 |
2.3569 | 260 | 3.753 |
2.3660 | 261 | 3.7632 |
2.3751 | 262 | 3.7097 |
2.3841 | 263 | 3.8584 |
2.3932 | 264 | 3.6547 |
2.4023 | 265 | 3.7595 |
2.4113 | 266 | 3.6346 |
2.4204 | 267 | 3.8937 |
2.4295 | 268 | 3.7423 |
2.4385 | 269 | 3.8051 |
2.4476 | 270 | 3.7131 |
2.4567 | 271 | 3.6623 |
2.4657 | 272 | 3.7444 |
2.4748 | 273 | 3.7229 |
2.4839 | 274 | 3.7874 |
2.4929 | 275 | 3.714 |
2.5020 | 276 | 3.6972 |
2.5110 | 277 | 3.7421 |
2.5201 | 278 | 3.8071 |
2.5292 | 279 | 3.7042 |
2.5382 | 280 | 3.7569 |
2.5473 | 281 | 3.8477 |
2.5564 | 282 | 3.7502 |
2.5654 | 283 | 3.7096 |
2.5745 | 284 | 3.7251 |
2.5836 | 285 | 3.8462 |
2.5926 | 286 | 3.747 |
2.6017 | 287 | 3.6436 |
2.6108 | 288 | 3.7176 |
2.6198 | 289 | 3.8406 |
2.6289 | 290 | 3.6416 |
2.6380 | 291 | 3.6793 |
2.6470 | 292 | 3.7892 |
2.6561 | 293 | 3.7827 |
2.6652 | 294 | 3.6192 |
2.6742 | 295 | 3.9168 |
2.6833 | 296 | 3.7271 |
2.6924 | 297 | 3.6852 |
2.7014 | 298 | 3.5507 |
2.7105 | 299 | 3.8567 |
2.7195 | 300 | 3.8098 |
2.7286 | 301 | 3.6685 |
2.7377 | 302 | 3.6163 |
2.7467 | 303 | 3.7439 |
2.7558 | 304 | 3.6212 |
2.7649 | 305 | 3.62 |
2.7739 | 306 | 3.6728 |
2.7830 | 307 | 3.7061 |
2.7921 | 308 | 3.8473 |
2.8011 | 309 | 3.7974 |
2.8102 | 310 | 3.6624 |
2.8193 | 311 | 3.7357 |
2.8283 | 312 | 3.7277 |
2.8374 | 313 | 3.6717 |
2.8465 | 314 | 3.7568 |
2.8555 | 315 | 3.6942 |
2.8646 | 316 | 3.7497 |
2.8737 | 317 | 3.7765 |
2.8827 | 318 | 3.709 |
2.8918 | 319 | 3.8016 |
2.9008 | 320 | 3.7998 |
2.9099 | 321 | 3.76 |
2.9190 | 322 | 3.748 |
2.9280 | 323 | 3.7235 |
2.9371 | 324 | 3.7455 |
2.9462 | 325 | 3.8345 |
2.9552 | 326 | 3.6403 |
2.9643 | 327 | 3.754 |
2.9734 | 328 | 3.6126 |
2.9824 | 329 | 3.7963 |
2.9915 | 330 | 3.8263 |
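A few rows from the table above are enough to estimate the number of optimizer steps per epoch and the overall drop in training loss. The sketch below copies four rows (the first step and the step closest to the end of each epoch) by hand; the variable names are made up for this example, and per-step loss is noisy, so the figures are only indicative.

```python
# (epoch, step, training loss) rows copied from the training log.
checkpoints = [
    (0.0091, 1, 15.81),
    (0.9972, 110, 4.706),
    (1.9943, 220, 4.0043),
    (2.9915, 330, 3.8263),
]

first_epoch, first_step, first_loss = checkpoints[0]
last_epoch, last_step, last_loss = checkpoints[-1]

# Rough number of optimizer steps in one epoch, from the final row.
steps_per_epoch = last_step / last_epoch

# Fraction by which the logged loss fell between the first and last step.
relative_drop = (first_loss - last_loss) / first_loss

print(f"about {steps_per_epoch:.1f} optimizer steps per epoch")
print(f"training loss fell by {relative_drop:.1%} over {last_step} steps")

# Change in logged loss between consecutive checkpoints.
for (e0, s0, l0), (e1, s1, l1) in zip(checkpoints, checkpoints[1:]):
    print(f"step {s0} -> {s1}: {l0} -> {l1} ({l1 - l0:+.4f})")
```

Most of the reduction happens in the first epoch; the later epochs show smaller changes. Since no evaluation was run during training (eval_strategy is "no"), the log alone does not show how the model generalises.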
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.44.2
- PyTorch: 2.3.1+cu121
- Accelerate: 1.1.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}