--- base_model: Snowflake/snowflake-arctic-embed-m library_name: sentence-transformers metrics: - cosine_accuracy@1 - cosine_accuracy@3 - cosine_accuracy@5 - cosine_accuracy@10 - cosine_precision@1 - cosine_precision@3 - cosine_precision@5 - cosine_precision@10 - cosine_recall@1 - cosine_recall@3 - cosine_recall@5 - cosine_recall@10 - cosine_ndcg@10 - cosine_mrr@10 - cosine_map@100 - dot_accuracy@1 - dot_accuracy@3 - dot_accuracy@5 - dot_accuracy@10 - dot_precision@1 - dot_precision@3 - dot_precision@5 - dot_precision@10 - dot_recall@1 - dot_recall@3 - dot_recall@5 - dot_recall@10 - dot_ndcg@10 - dot_mrr@10 - dot_map@100 pipeline_tag: sentence-similarity tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:600 - loss:MatryoshkaLoss - loss:MultipleNegativesRankingLoss widget: - source_sentence: How can high compute resource utilization in training GAI models affect ecosystems? sentences: - "should not be used in education, work, housing, or in other contexts where the\ \ use of such surveillance \ntechnologies is likely to limit rights, opportunities,\ \ or access. Whenever possible, you should have access to \nreporting that confirms\ \ your data decisions have been respected and provides an assessment of the \n\ potential impact of surveillance technologies on your rights, opportunities, or\ \ access. \nNOTICE AND EXPLANATION" - "Legal Disclaimer \nThe Blueprint for an AI Bill of Rights: Making Automated Systems\ \ Work for the American People is a white paper \npublished by the White House\ \ Office of Science and Technology Policy. It is intended to support the \ndevelopment\ \ of policies and practices that protect civil rights and promote democratic values\ \ in the building, \ndeployment, and governance of automated systems. \nThe Blueprint\ \ for an AI Bill of Rights is non-binding and does not constitute U.S. government\ \ policy. 
It \ndoes not supersede, modify, or direct an interpretation of any\ \ existing statute, regulation, policy, or \ninternational instrument. It does\ \ not constitute binding guidance for the public or Federal agencies and" - "or stereotyping content . \n4. Data Privacy: Impacts due to l eakage and unauthorized\ \ use, disclosure , or de -anonymization of \nbiometric, health, location , or\ \ other personally identifiable information or sensitive data .7 \n5. Environmental\ \ Impacts: Impacts due to high compute resource utilization in training or \n\ operating GAI models, and related outcomes that may adversely impact ecosystems.\ \ \n6. Harmful Bias or Homogenization: Amplification and exacerbation of historical,\ \ societal, and \nsystemic biases ; performance disparities8 between sub- groups\ \ or languages , possibly due to \nnon- representative training data , that result\ \ in discrimination, amplification of biases, or" - source_sentence: What are the potential risks associated with human-AI configuration in GAI systems? sentences: - "establish approved GAI technology and service provider lists. Value Chain and\ \ Component \nIntegration \nGV-6.1-0 08 Maintain records of changes to content\ \ made by third parties to promote content \nprovenance, including sources, timestamps,\ \ metadata . Information Integrity ; Value Chain \nand Component Integration;\ \ Intellectual Property \nGV-6.1-0 09 Update and integrate due diligence processes\ \ for GAI acquisition and \nprocurement vendor assessments to include intellectual\ \ property, data privacy, security, and other risks. For example, update p rocesses\ \ \nto: Address solutions that \nmay rely on embedded GAI technologies; Address\ \ ongoing monitoring , \nassessments, and alerting, dynamic risk assessments,\ \ and real -time reporting" - "could lead to homogenized outputs, including by amplifying any homogenization\ \ from the model used to \ngenerate the synthetic training data . 
\nTrustworthy\ \ AI Characteristics: Fair with Harmful Bias Managed, Valid and Reliable \n\ 2.7. Human -AI Configuration \nGAI system use can involve varying risks of misconfigurations\ \ and poor interactions between a system \nand a human who is interacti ng with\ \ it. Humans bring their unique perspectives , experiences , or domain -\nspecific\ \ expertise to interactions with AI systems but may not have detailed knowledge\ \ of AI systems and \nhow they work. As a result, h uman experts may be unnecessarily\ \ “averse ” to GAI systems , and thus \ndeprive themselves or others of GAI’s\ \ beneficial uses ." - "requests image features that are inconsistent with the stereotypes. Harmful\ \ b ias in GAI models , which \nmay stem from their training data , can also \ \ cause representational harm s or perpetuate or exacerbate \nbias based on\ \ race, gender, disability, or other protected classes . \nHarmful b ias in GAI\ \ systems can also lead to harms via disparities between how a model performs\ \ for \ndifferent subgroups or languages (e.g., an LLM may perform less well\ \ for non- English languages or \ncertain dialects ). Such disparities can contribute\ \ to discriminatory decision -making or amplification of \nexisting societal biases.\ \ In addition, GAI systems may be inappropriately trusted to perform similarly" - source_sentence: What types of content are considered harmful biases in the context of information security? sentences: - "MS-2.5-0 05 Verify GAI system training data and TEVV data provenance, and that\ \ fine -tuning \nor retrieval- augmented generation data is grounded. Information\ \ Integrity \nMS-2.5-0 06 Regularly review security and safety guardrails, especially\ \ if the GAI system is \nbeing operated in novel circumstances. This includes\ \ reviewing reasons why the \nGAI system was initially assessed as being safe\ \ to deploy. 
Information Security ; Dangerous , \nViolent, or Hateful Content\ \ \nAI Actor Tasks: Domain Experts, TEVV" - "to diminished transparency or accountability for downstream users. While this\ \ is a risk for traditional AI \nsystems and some other digital technologies\ \ , the risk is exacerbated for GAI due to the scale of the \ntraining data, which\ \ may be too large for humans to vet; the difficulty of training foundation models,\ \ \nwhich leads to extensive reuse of limited numbers of models; an d the extent\ \ to which GAI may be \nintegrat ed into other devices and services. As GAI\ \ systems often involve many distinct third -party \ncomponents and data sources\ \ , it may be difficult to attribute issues in a system’s behavior to any one of\ \ \nthese sources. \nErrors in t hird-party GAI components can also have downstream\ \ impacts on accuracy and robustness ." - "biases in the generated content. Information Security ; Harmful Bias \nand Homogenization\ \ \nMG-2.2-005 Engage in due diligence to analyze GAI output for harmful content,\ \ potential \nmisinformation , and CBRN -related or NCII content . CBRN Information\ \ or Capabilities ; \nObscene, Degrading, and/or \nAbusive Content ; Harmful Bias\ \ and \nHomogenization ; Dangerous , \nViolent, or Hateful Content" - source_sentence: What is the focus of the paper by Padmakumar et al (2024) regarding language models and content diversity? sentences: - "Content \nMS-2.12- 002 Document anticipated environmental impacts of model development,\ \ \nmaintenance, and deployment in product design decisions. Environmental \n\ MS-2.12- 003 Measure or estimate environmental impacts (e.g., energy and water\ \ \nconsumption) for training, fine tuning, and deploying models: Verify tradeoffs\ \ \nbetween resources used at inference time versus additional resources required\ \ at training time. 
Environmental \nMS-2.12- 004 Verify effectiveness of carbon\ \ capture or offset programs for GAI training and \napplications , and address\ \ green -washing concerns . Environmental \nAI Actor Tasks: AI Deployment, AI\ \ Impact Assessment, Domain Experts, Operation and Monitoring, TEVV" - "opportunities, undermine their privac y, or pervasively track their activity—often\ \ without their knowledge or \nconsent. \nThese outcomes are deeply harmful—but\ \ they are not inevitable. Automated systems have brought about extraor-\ndinary\ \ benefits, from technology that helps farmers grow food more efficiently and\ \ computers that predict storm \npaths, to algorithms that can identify diseases\ \ in patients. These tools now drive important decisions across \nsectors, while\ \ data is helping to revolutionize global industries. Fueled by the power of American\ \ innovation, \nthese tools hold the potential to redefine every part of our society\ \ and make life better for everyone." - "Publishing, Paris . https://doi.org/10.1787/d1a8d965- en \nOpenAI (2023) GPT-4\ \ System Card . https://cdn.openai.com/papers/gpt -4-system -card.pdf \nOpenAI\ \ (2024) GPT-4 Technical Report. https://arxiv.org/pdf/2303.08774 \nPadmakumar,\ \ V. et al. (2024) Does writing with language models reduce content diversity?\ \ ICLR . \nhttps://arxiv.org/pdf/2309.05196 \nPark, P. et. al. (2024) AI\ \ deception: A survey of examples, risks, and potential solutions. Patterns,\ \ 5(5). \narXiv . https://arxiv.org/pdf/2308.14752 \nPartnership on AI (2023)\ \ Building a Glossary for Synthetic Media Transparency Methods, Part 1: Indirect\ \ \nDisclosure . https://partnershiponai.org/glossary -for-synthetic -media- transparency\ \ -methods -part-1-\nindirect -disclosure/" - source_sentence: What are the key components involved in ensuring data quality and ethical considerations in AI systems? 
sentences: - "(such as where significant negative impacts are imminent, severe harms are actually\ \ occurring, or large -scale risks could occur); and broad GAI negative risks,\ \ \nincluding: Immature safety or risk cultures related to AI and GAI design,\ \ development and deployment, public information integrity risks, including impacts\ \ on democratic processes, unknown long -term performance characteristics of GAI.\ \ Information Integrity ; Dangerous , \nViolent, or Hateful Content ; CBRN \n\ Information or Capabilities \nGV-1.3-007 Devise a plan to halt development or\ \ deployment of a GAI system that poses unacceptable negative risk. CBRN Information\ \ and Capability ; \nInformation Security ; Information \nIntegrity \nAI Actor\ \ Tasks: Governance and Oversight" - "30 MEASURE 2.2: Evaluations involving human subjects meet applicable requirements\ \ (including human subject protection) and are \nrepresentative of the relevant\ \ population. \nAction ID Suggested Action GAI Risks \nMS-2.2-001 Assess and\ \ manage statistical biases related to GAI content provenance through \ntechniques\ \ such as re -sampling, re -weighting, or adversarial training. Information Integrity\ \ ; Information \nSecurity ; Harmful Bias and \nHomogenization \nMS-2.2-002 Document\ \ how content provenance data is tracked and how that data interact s \nwith\ \ privacy and security . Consider : Anonymiz ing data to protect the privacy\ \ of \nhuman subjects; Leverag ing privacy output filters; Remov ing any personally" - "Data quality; Model architecture (e.g., convolutional neural network, transformers,\ \ etc.); Optimizatio n objectives; Training algorithms; RLHF \napproaches; Fine\ \ -tuning or retrieval- augmented generation approaches; \nEvaluation data; Ethical\ \ considerations; Legal and regulatory requirements. 
Information Integrity ;\ \ Harmful Bias \nand Homogenization \nAI Actor Tasks: AI Deployment, AI Impact\ \ Assessment, Domain Experts, End -Users, Operation and Monitoring, TEVV \n \n\ MEASURE 2.10: Privacy risk of the AI system – as identified in the MAP function\ \ – is examined and documented. \nAction ID Suggested Action GAI Risks \n\ MS-2.10- 001 Conduct AI red -teaming to assess issues such as: Outputting of\ \ training data" model-index: - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m results: - task: type: information-retrieval name: Information Retrieval dataset: name: Unknown type: unknown metrics: - type: cosine_accuracy@1 value: 0.8 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.99 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.99 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 1.0 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.8 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.33000000000000007 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.19799999999999998 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09999999999999998 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.8 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.99 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.99 name: Cosine Recall@5 - type: cosine_recall@10 value: 1.0 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.9195108324425135 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.8916666666666667 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.8916666666666666 name: Cosine Map@100 - type: dot_accuracy@1 value: 0.8 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.99 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.99 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 1.0 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.8 name: Dot Precision@1 - type: dot_precision@3 value: 0.33000000000000007 name: Dot Precision@3 - type: 
dot_precision@5 value: 0.19799999999999998 name: Dot Precision@5 - type: dot_precision@10 value: 0.09999999999999998 name: Dot Precision@10 - type: dot_recall@1 value: 0.8 name: Dot Recall@1 - type: dot_recall@3 value: 0.99 name: Dot Recall@3 - type: dot_recall@5 value: 0.99 name: Dot Recall@5 - type: dot_recall@10 value: 1.0 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.9195108324425135 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.8916666666666667 name: Dot Mrr@10 - type: dot_map@100 value: 0.8916666666666666 name: Dot Map@100
---

# SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("XicoC/midterm-finetuned-arctic")
# Run inference
sentences = [
    'What are the key components involved in ensuring data quality and ethical considerations in AI systems?',
    'Data quality; Model architecture (e.g., convolutional neural network, transformers, etc.); Optimizatio n objectives; Training algorithms; RLHF \napproaches; Fine -tuning or retrieval- augmented generation approaches; \nEvaluation data; Ethical considerations; Legal and regulatory requirements. Information Integrity ; Harmful Bias \nand Homogenization \nAI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts, End -Users, Operation and Monitoring, TEVV \n \nMEASURE 2.10: Privacy risk of the AI system – as identified in the MAP function – is examined and documented. \nAction ID Suggested Action GAI Risks \nMS-2.10- 001 Conduct AI red -teaming to assess issues such as: Outputting of training data',
    '30 MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are \nrepresentative of the relevant population. \nAction ID Suggested Action GAI Risks \nMS-2.2-001 Assess and manage statistical biases related to GAI content provenance through \ntechniques such as re -sampling, re -weighting, or adversarial training. Information Integrity ; Information \nSecurity ; Harmful Bias and \nHomogenization \nMS-2.2-002 Document how content provenance data is tracked and how that data interact s \nwith privacy and security . Consider : Anonymiz ing data to protect the privacy of \nhuman subjects; Leverag ing privacy output filters; Remov ing any personally',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8        |
| cosine_accuracy@3   | 0.99       |
| cosine_accuracy@5   | 0.99       |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.8        |
| cosine_precision@3  | 0.33       |
| cosine_precision@5  | 0.198      |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.8        |
| cosine_recall@3     | 0.99       |
| cosine_recall@5     | 0.99       |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.9195     |
| cosine_mrr@10       | 0.8917     |
| **cosine_map@100**  | **0.8917** |
| dot_accuracy@1      | 0.8        |
| dot_accuracy@3      | 0.99       |
| dot_accuracy@5      | 0.99       |
| dot_accuracy@10     | 1.0        |
| dot_precision@1     | 0.8        |
| dot_precision@3     | 0.33       |
| dot_precision@5     | 0.198      |
| dot_precision@10    | 0.1        |
| dot_recall@1        | 0.8        |
| dot_recall@3        | 0.99       |
| dot_recall@5        | 0.99       |
| dot_recall@10       | 1.0        |
| dot_ndcg@10         | 0.9195     |
| dot_mrr@10          | 0.8917     |
| dot_map@100         | 0.8917     |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 600 training samples
* Columns: `sentence_0` and `sentence_1`
* Approximate statistics based on the first 600 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details |            |            |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | What is the title of the NIST publication related to Artificial Intelligence Risk Management? | NIST Trustworthy and Responsible AI<br>NIST AI 600 -1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600 -1 |
  | Where can the NIST AI 600 -1 publication be accessed for free? | NIST Trustworthy and Responsible AI<br>NIST AI 600 -1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600 -1 |
  | What is the title of the publication released by NIST in July 2024 regarding artificial intelligence? | NIST Trustworthy and Responsible AI<br>NIST AI 600 -1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600 -1<br><br>July 2024<br><br>U.S. Department of Commerce<br>Gina M. Raimondo, Secretary<br>National Institute of Standards and Technology<br>Laurie E. Locascio, NIST Director and Under Secretary of Commerce for Standards and Technology |
* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `num_train_epochs`: 5
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

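
As a rough check on how these settings interact, the sketch below derives the number of optimizer steps per epoch from the 600 training samples and the batch size of 20 listed in this card. It assumes training ran on a single device, so the effective batch size equals `per_device_train_batch_size`, and relies on `gradient_accumulation_steps` being 1 as listed. The helper names `steps_per_epoch` and `epoch_at_step` are illustrative only and are not part of Sentence Transformers or Transformers.

```python
def steps_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches (one optimizer step each) in a single pass over the data."""
    full_batches, remainder = divmod(num_samples, batch_size)
    # A trailing partial batch still counts as a step unless it is dropped.
    if drop_last or remainder == 0:
        return full_batches
    return full_batches + 1


def epoch_at_step(step: int, per_epoch: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / per_epoch, 4)


# Values taken from this card: 600 samples, batch size 20, 5 epochs.
# Assumption: one device, so no further division by a device count.
per_epoch = steps_per_epoch(num_samples=600, batch_size=20)
total_steps = per_epoch * 5

print(per_epoch)                      # 30
print(total_steps)                    # 150
print(epoch_at_step(50, per_epoch))   # 1.6667
print(epoch_at_step(100, per_epoch))  # 3.3333
```

Under these assumptions one epoch is 30 steps, so an evaluation at step 50 falls at epoch 1.6667 and one at step 100 at epoch 3.3333.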
### Training Logs
| Epoch  | Step | cosine_map@100 |
|:------:|:----:|:--------------:|
| 1.0    | 30   | 0.8722         |
| 1.6667 | 50   | 0.8817         |
| 2.0    | 60   | 0.8867         |
| 3.0    | 90   | 0.8867         |
| 3.3333 | 100  | 0.8917         |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.0
- Transformers: 4.44.2
- PyTorch: 2.4.0+cu121
- Accelerate: 0.34.2
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```