Add new SentenceTransformer model.

Files changed (11)

1_Pooling/config.json +10 -0
README.md +424 -0
config.json +32 -0
config_sentence_transformers.json +10 -0
model.safetensors +3 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +57 -0
vocab.txt +0 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 768,
+  "pooling_mode_cls_token": true,
+  "pooling_mode_mean_tokens": false,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
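
With `pooling_mode_cls_token` set to true and every other mode false, the sentence embedding is the hidden state of the first (`[CLS]`) token, which the model's `Normalize()` module then scales to unit length. Below is a minimal sketch of those two steps on random data. It is not the library's implementation, and the helper names `cls_pool` and `l2_normalize` are hypothetical.

```python
import math
import random


def cls_pool(token_embeddings):
    """Keep only the first token's vector ([CLS]) of each sequence."""
    return [sequence[0] for sequence in token_embeddings]


def l2_normalize(vectors, eps=1e-12):
    """Scale each vector to unit L2 norm (eps guards against division by zero)."""
    normalized = []
    for vec in vectors:
        norm = math.sqrt(sum(x * x for x in vec))
        normalized.append([x / max(norm, eps) for x in vec])
    return normalized


# Stand-in for transformer output: 3 sequences, 12 tokens each, 768 dimensions.
random.seed(0)
batch, seq_len, dim = 3, 12, 768
tokens = [
    [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(seq_len)]
    for _ in range(batch)
]

sentence_embeddings = l2_normalize(cls_pool(tokens))
print(len(sentence_embeddings), len(sentence_embeddings[0]))  # 3 768
```

Because the vectors are unit length, cosine similarity between two embeddings reduces to a plain dot product.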

README.md ADDED

	@@ -0,0 +1,424 @@

+---
+base_model: BAAI/bge-base-en-v1.5
+library_name: sentence-transformers
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:143
+- loss:MultipleNegativesRankingLoss
+widget:
+- source_sentence: 'JSON APIs: Node.js'
+  sentences:
+  - 'Prerequisite course required: RESTful APIs: Node.js'
+  - 'Course Name:JSON APIs: Node.js|Course Description:An introduction to JSON API,
+    using Node.js.|Course language: JavaScript|Prerequisite course required: RESTful
+    APIs: Node.js|Target Audience:Professionals who would like to learn the core concepts
+    of JSON API, using Node.js.'
+  - An introduction to JSON API, using Node.js.
+  - 'Course language: JavaScript'
+  - Professionals who would like to learn the core concepts of JSON API, using Node.js.
+- source_sentence: Enzyme
+  sentences:
+  - For anyone who has built an application in React and wants to test the React components
+  - A course that explores Enzyme, which is a JavaScript utility for React applications.
+    The course equips users to simulate runs and test React components' outputs.
+  - 'Prerequisite course required: React Testing Library'
+  - 'Course language: TBD'
+  - 'Course Name:Enzyme|Course Description:A course that explores Enzyme, which is
+    a JavaScript utility for React applications. The course equips users to simulate
+    runs and test React components'' outputs.|Course language: TBD|Prerequisite course
+    required: React Testing Library|Target Audience:For anyone who has built an application
+    in React and wants to test the React components'
+- source_sentence: 'React Ecosystem: State Management & Redux'
+  sentences:
+  - 'Course Name:React Ecosystem: State Management & Redux|Course Description:A course
+    that builds on the React Ecosystem. It explains how state management works in
+    React and goes over the Redux state management library|Course language: JavaScript|Prerequisite
+    course required: React Ecosystem: Forms|Target Audience:Professionals who would
+    like to learn about state management in React'
+  - 'Course language: JavaScript'
+  - 'Prerequisite course required: React Ecosystem: Forms'
+  - A course that builds on the React Ecosystem. It explains how state management
+    works in React and goes over the Redux state management library
+  - Professionals who would like to learn about state management in React
+- source_sentence: Ensemble Methods in Python
+  sentences:
+  - 'Course language: Python'
+  - 'Prerequisite course required: Decision Trees'
+  - This course covers an overview of ensemble learning methods like random forest
+    and boosting. At the end of this course, students will be able to implement and
+    compare random forest algorithm and boosting.
+  - Professionals with some experience in building basic algorithms who would like
+    to expand their skill set to more advanced Python classification techniques.
+  - 'Course Name:Ensemble Methods in Python|Course Description:This course covers
+    an overview of ensemble learning methods like random forest and boosting. At the
+    end of this course, students will be able to implement and compare random forest
+    algorithm and boosting.|Course language: Python|Prerequisite course required:
+    Decision Trees|Target Audience:Professionals with some experience in building
+    basic algorithms who would like to expand their skill set to more advanced Python
+    classification techniques.'
+- source_sentence: Visualizing Data with Matplotlib in Python
+  sentences:
+  - Professionals with basic Python experience who would like to expand their skill
+    set to more Python visualization techniques and tools.
+  - 'Prerequisite course required: Intro to Python'
+  - 'Course language: Python'
+  - 'Course Name:Visualizing Data with Matplotlib in Python|Course Description:This
+    course covers the basics of data visualization and exploratory data analysis.
+    It helps students learn different plots and their use cases.|Course language:
+    Python|Prerequisite course required: Intro to Python|Target Audience:Professionals
+    with basic Python experience who would like to expand their skill set to more
+    Python visualization techniques and tools.'
+  - This course covers the basics of data visualization and exploratory data analysis.
+    It helps students learn different plots and their use cases.
+---
+# SentenceTransformer based on BAAI/bge-base-en-v1.5
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
+- **Maximum Sequence Length:** 512 tokens
+- **Output Dimensionality:** 768 dimensions
+- **Similarity Function:** Cosine Similarity
+<!-- - **Training Dataset:** Unknown -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("datasocietyco/bge-base-en-v1.5-course-recommender-v2")
+# Run inference
+sentences = [
+    'Visualizing Data with Matplotlib in Python',
+    'This course covers the basics of data visualization and exploratory data analysis. It helps students learn different plots and their use cases.',
+    'Course language: Python',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 768)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### Unnamed Dataset
+* Size: 143 training samples
+* Columns: <code>name</code>, <code>description</code>, <code>languages</code>, <code>prerequisites</code>, <code>target_audience</code>, and <code>combined</code>
+* Approximate statistics based on the first 143 samples:
+  |         | name                                                                             | description                                                                         | languages                                                                        | prerequisites                                                                     | target_audience                                                                    | combined                                                                           |
+  |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                           | string                                                                              | string                                                                           | string                                                                            | string                                                                             | string                                                                             |
+  | details | <ul><li>min: 3 tokens</li><li>mean: 7.82 tokens</li><li>max: 17 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 39.24 tokens</li><li>max: 117 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 6.57 tokens</li><li>max: 10 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 12.85 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 23.02 tokens</li><li>max: 54 tokens</li></ul> | <ul><li>min: 58 tokens</li><li>mean: 94.5 tokens</li><li>max: 187 tokens</li></ul> |
+* Samples:
+  | name                                                            | description                                                                                                                                                                                                                                                                                                                            | languages                            | prerequisites                                                                              | target_audience                                                                                                                                                                                                                              | combined                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+  |:----------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------|:-------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Reinforcement Learning</code>                             | <code>This course covers the specialized branch of machine learning and deep learning called reinforcement learning (RL). By the end of this course students will be able to define RL use cases and real world scenarios where RL models are used, they will be able to create a simple RL model and evaluate its performance.</code> | <code>Course language: Python</code> | <code>Prerequisite course required: Working with Complex Pre-trained CNNs in Python</code> | <code>Professionals some Python experience who would like to expand their skillset to more advanced machine learning algorithms for reinforcement learning.</code>                                                                           | <code>Course Name:Reinforcement Learning|Course Description:This course covers the specialized branch of machine learning and deep learning called reinforcement learning (RL). By the end of this course students will be able to define RL use cases and real world scenarios where RL models are used, they will be able to create a simple RL model and evaluate its performance.|Course language: Python|Prerequisite course required: Working with Complex Pre-trained CNNs in Python|Target Audience:Professionals some Python experience who would like to expand their skillset to more advanced machine learning algorithms for reinforcement learning.</code> |
+  | <code>Optimizing Ensemble Methods in Python</code>              | <code>This course covers advanced topics in optimizing ensemble learning methods – specifically random forest and gradient boosting. Students will learn to implement base models and perform hyperparameter tuning to enhance the performance of models.</code>                                                                       | <code>Course language: Python</code> | <code>Prerequisite course required: Ensemble Methods in Python</code>                      | <code>Professionals experience in ensemble methods and who want to enhance their skill set in advanced Python classification techniques.</code>                                                                                              | <code>Course Name:Optimizing Ensemble Methods in Python|Course Description:This course covers advanced topics in optimizing ensemble learning methods – specifically random forest and gradient boosting. Students will learn to implement base models and perform hyperparameter tuning to enhance the performance of models.|Course language: Python|Prerequisite course required: Ensemble Methods in Python|Target Audience:Professionals experience in ensemble methods and who want to enhance their skill set in advanced Python classification techniques.</code>                                                                                                |
+  | <code>Fundamentals of Accelerated Computing with OpenACC</code> | <code>Find out how to write and configure code parallelization with OpenACC, optimize memory movements between the CPU and GPU accelerator, and apply the techniques to accelerate a CPU-only Laplace Heat Equation to achieve performance gains.</code>                                                                               | <code>Course language: Python</code> | <code>No prerequisite course required</code>                                               | <code>Professionals who want to learn how to write code, configure code parallelization with OpenACC, optimize memory movements between the CPU and GPU accelerator, and implement the workflow learnt for massive performance gains.</code> | <code>Course Name:Fundamentals of Accelerated Computing with OpenACC|Course Description:Find out how to write and configure code parallelization with OpenACC, optimize memory movements between the CPU and GPU accelerator, and apply the techniques to accelerate a CPU-only Laplace Heat Equation to achieve performance gains.|Course language: Python|No prerequisite course required|Target Audience:Professionals who want to learn how to write code, configure code parallelization with OpenACC, optimize memory movements between the CPU and GPU accelerator, and implement the workflow learnt for massive performance gains.</code>                       |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
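
The `scale` and `similarity_fct` parameters above can be read as follows: each anchor is scored against every positive in the batch by cosine similarity, the scores are multiplied by `scale`, and cross-entropy is applied with the anchor's own positive as the target. Below is a minimal pure-Python sketch of that idea for plain (anchor, positive) pairs. It is not the library's code, it makes no claim about how the six dataset columns are mapped onto anchors, positives, or negatives, and the function name `mnr_loss` and the toy vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive points toward its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive points toward the other anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned < loss_swapped)  # True
```

A larger `scale` sharpens the softmax, so a positive that is ranked below an in-batch negative is penalized more heavily.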
+### Evaluation Dataset
+#### Unnamed Dataset
+* Size: 36 evaluation samples
+* Columns: <code>name</code>, <code>description</code>, <code>languages</code>, <code>prerequisites</code>, <code>target_audience</code>, and <code>combined</code>
+* Approximate statistics based on the first 36 samples:
+  |         | name                                                                             | description                                                                        | languages                                                                        | prerequisites                                                                     | target_audience                                                                   | combined                                                                             |
+  |:--------|:---------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+  | type    | string                                                                           | string                                                                             | string                                                                           | string                                                                            | string                                                                            | string                                                                               |
+  | details | <ul><li>min: 3 tokens</li><li>mean: 7.92 tokens</li><li>max: 13 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 46.39 tokens</li><li>max: 92 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 6.75 tokens</li><li>max: 10 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 13.47 tokens</li><li>max: 20 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 23.75 tokens</li><li>max: 54 tokens</li></ul> | <ul><li>min: 61 tokens</li><li>mean: 103.28 tokens</li><li>max: 165 tokens</li></ul> |
+* Samples:
+  | name                                                            | description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | languages                               | prerequisites                                                               | target_audience                                                                                                                                                  | combined                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+  |:----------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------|:----------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Intro to CSS, Part 2</code>                               | <code>A course that continues to build on the foundational understanding of CSS syntax and allows students to work with responsive design and media queries.</code>                                                                                                                                                                                                                                                                                                                                              | <code>Course language: CSS, HTML</code> | <code>Prerequisite course required: Intro to CSS, Part 1</code>             | <code>Professionals who would like to continue learning the core concepts of CSS and be able to style simple web pages.</code>                                   | <code>Course Name:Intro to CSS, Part 2|Course Description:A course that continues to build on the foundational understanding of CSS syntax and allows students to work with responsive design and media queries.|Course language: CSS, HTML|Prerequisite course required: Intro to CSS, Part 1|Target Audience:Professionals who would like to continue learning the core concepts of CSS and be able to style simple web pages.</code>                                                                                                                                                                                                                                                                                                                                                                                              |
+  | <code>Foundations of Statistics in Python</code>                | <code>This course is designed for learners who would like to learn about statistics and apply it for decision-making. This course is a comprehensive review of statistical terms ranging from foundational (mean, median, mode, standard deviation, variance, covariance, correlation) to more complex concepts such as normality in data, confidence intervals, and p-values. Additional topics include how to calculate summary statistics and how to carry out hypothesis testing to inform decisions.</code> | <code>Course language: Python</code>    | <code>Prerequisite course required: Intro to Visualization in Python</code> | <code>Professionals some Python experience who would like to expand their skill set to more advanced Python visualization techniques and tools.</code>           | <code>Course Name:Foundations of Statistics in Python|Course Description:This course is designed for learners who would like to learn about statistics and apply it for decision-making. This course is a comprehensive review of statistical terms ranging from foundational (mean, median, mode, standard deviation, variance, covariance, correlation) to more complex concepts such as normality in data, confidence intervals, and p-values. Additional topics include how to calculate summary statistics and how to carry out hypothesis testing to inform decisions.|Course language: Python|Prerequisite course required: Intro to Visualization in Python|Target Audience:Professionals some Python experience who would like to expand their skill set to more advanced Python visualization techniques and tools.</code> |
+  | <code>Spherical k-Means and Hierarchical Clustering in R</code> | <code>This course covers the unsupervised learning method called clustering which is used to find patterns or groups in data without the need for labelled data. This course includes different methods of clustering on numerical data including density-based and hierarchical-based clustering and how to build, evaluate and interpret these models.</code>                                                                                                                                                  | <code>Course language: R</code>         | <code>Prerequisite course required: Intro to Clustering in R</code>         | <code>Professionals with some R experience who would like to expand their skillset to more clustering techniques like hierarchical clustering and DBSCAN.</code> | <code>Course Name:Spherical k-Means and Hierarchical Clustering in R|Course Description:This course covers the unsupervised learning method called clustering which is used to find patterns or groups in data without the need for labelled data. This course includes different methods of clustering on numerical data including density-based and hierarchical-based clustering and how to build, evaluate and interpret these models.|Course language: R|Prerequisite course required: Intro to Clustering in R|Target Audience:Professionals with some R experience who would like to expand their skillset to more clustering techniques like hierarchical clustering and DBSCAN.</code>                                                                                                                                      |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
+- `learning_rate`: 3e-06
+- `max_steps`: 64
+- `warmup_ratio`: 0.1
+- `batch_sampler`: no_duplicates
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 3e-06
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 3.0
+- `max_steps`: 64
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: False
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+</details>
+### Training Logs
+| Epoch  | Step | Training Loss | Validation Loss |
+|:------:|:----:|:-------------:|:---------------:|
+| 2.2222 | 20   | 1.5188        | 1.1718          |
+| 4.4444 | 40   | 1.0652        | 0.8327          |
+| 6.6667 | 60   | 0.6770        | 0.7192          |
+### Framework Versions
+- Python: 3.9.13
+- Sentence Transformers: 3.1.1
+- Transformers: 4.45.1
+- PyTorch: 2.2.2
+- Accelerate: 0.34.2
+- Datasets: 3.0.0
+- Tokenizers: 0.20.0
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
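
The README's hyperparameter list gives `num_train_epochs: 3.0` and `max_steps: 64`, while the training log runs to epoch 6.6667. In the Transformers `Trainer`, a positive `max_steps` takes precedence over `num_train_epochs`, which would explain the gap. The sketch below is a minimal illustration, not the trainer's own code: it infers steps per epoch from the first log row and evaluates a linear warmup-then-decay schedule from `learning_rate`, `warmup_ratio`, and `max_steps`. The helper name `linear_lr` is hypothetical, and rounding the warmup length up is an assumption about how the ratio is applied.

```python
import math

# Values copied from the README's hyperparameter list and training log.
LEARNING_RATE = 3e-06
MAX_STEPS = 64
WARMUP_RATIO = 0.1
LOGGED_STEP, LOGGED_EPOCH = 20, 2.2222

# Steps per epoch inferred from the log row "epoch 2.2222 at step 20".
steps_per_epoch = round(LOGGED_STEP / LOGGED_EPOCH)

# Epochs covered if training stops at MAX_STEPS.
epochs_covered = MAX_STEPS / steps_per_epoch

# warmup_steps is 0 in the list, so the ratio is used (assumed rounded up).
warmup_steps = math.ceil(MAX_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = (MAX_STEPS - step) / max(1, MAX_STEPS - warmup_steps)
    return LEARNING_RATE * max(0.0, remaining)


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("epochs covered by max_steps:", round(epochs_covered, 4))
    print("warmup steps:", warmup_steps)
    for step in (0, warmup_steps, 20, 40, 60, MAX_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the run covers roughly seven epochs rather than three, and the learning rate peaks at `3e-06` after the warmup steps before decaying to zero at step 64.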

config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "_name_or_path": "BAAI/bge-base-en-v1.5",
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.45.1",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}

config_sentence_transformers.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "__version__": {
+    "sentence_transformers": "3.1.1",
+    "transformers": "4.45.1",
+    "pytorch": "2.2.2"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": null
+}

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:79b4d26aaf77276af894a178bf468281f26c73ff30822e4261221540fe1a1991
+size 437951328
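
As a sanity check, the LFS pointer size can be compared with a parameter count derived from `config.json`. The sketch below assumes the standard BERT layout (word, position, and token-type embeddings, 12 encoder layers, and a pooler), with every weight stored as `float32` per `torch_dtype`. The layer breakdown is an assumption about the checkpoint's contents, and treating the leftover bytes as file header and metadata is likewise an assumption.

```python
# Values copied from config.json and the model.safetensors LFS pointer.
VOCAB_SIZE = 30522
HIDDEN = 768
NUM_LAYERS = 12
INTERMEDIATE = 3072
MAX_POSITIONS = 512
TYPE_VOCAB = 2
FILE_SIZE_BYTES = 437951328
BYTES_PER_FLOAT32 = 4


def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


LAYER_NORM = 2 * HIDDEN  # scale and shift vectors

# Three embedding tables followed by one LayerNorm.
embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + LAYER_NORM

# Query, key, value, and output projections, then the feed-forward block.
per_layer = (
    4 * linear(HIDDEN, HIDDEN)
    + LAYER_NORM
    + linear(HIDDEN, INTERMEDIATE)
    + linear(INTERMEDIATE, HIDDEN)
    + LAYER_NORM
)

pooler = linear(HIDDEN, HIDDEN)  # assumed present in the checkpoint

total_params = embeddings + NUM_LAYERS * per_layer + pooler
weight_bytes = total_params * BYTES_PER_FLOAT32
leftover_bytes = FILE_SIZE_BYTES - weight_bytes

if __name__ == "__main__":
    print("parameters:", total_params)
    print("float32 weight bytes:", weight_bytes)
    print("bytes not explained by weights:", leftover_bytes)
```

Under these assumptions the weights account for all but a few kilobytes of the file, which is consistent with a full-precision BERT-base checkpoint.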

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
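
`modules.json` chains three modules: a Transformer, a Pooling step, and a Normalize step, and `1_Pooling/config.json` enables only `pooling_mode_cls_token`. The sketch below illustrates what the last two stages compute, using plain Python lists and toy 2-dimensional vectors instead of the model's 768-dimensional outputs. The function names are hypothetical and this is not the library's implementation.

```python
import math
from typing import List, Sequence


def cls_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """CLS pooling: keep only the first token's vector."""
    return list(token_embeddings[0])


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors pass through)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Toy per-token outputs for one sentence; row 0 stands in for [CLS].
toy_token_embeddings = [[3.0, 4.0], [10.0, -2.0], [0.5, 0.5]]

sentence_embedding = l2_normalize(cls_pool(toy_token_embeddings))

if __name__ == "__main__":
    print(sentence_embedding)
    print(dot(sentence_embedding, sentence_embedding))
```

Because the final stage normalizes each embedding to unit length, a dot product between two embeddings gives the same value as their cosine similarity.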

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 512,
+  "do_lower_case": true
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,57 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "never_split": null,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "unk_token": "[UNK]"
+}
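
The special-token IDs in `added_tokens_decoder` (`[PAD]` = 0, `[CLS]` = 101, `[SEP]` = 102) and the 512-token limit determine how a single input sequence is framed before it reaches the encoder. The sketch below shows only that framing step, under the assumption that truncation drops tokens from the end. `frame_ids` is a hypothetical helper, the input IDs are placeholders, and the real `BertTokenizer` also performs lowercasing and WordPiece splitting, which are not reproduced here.

```python
from typing import Iterable, List, Optional, Tuple

# IDs and length limit copied from tokenizer_config.json.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102
MODEL_MAX_LENGTH = 512


def frame_ids(
    token_ids: Iterable[int], pad_to: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Wrap token IDs as [CLS] ... [SEP], truncating and optionally padding.

    Returns (input_ids, attention_mask). Hypothetical helper for illustration.
    """
    # Reserve two positions for [CLS] and [SEP].
    body = list(token_ids)[: MODEL_MAX_LENGTH - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and pad_to > len(input_ids):
        extra = pad_to - len(input_ids)
        input_ids += [PAD_ID] * extra
        attention_mask += [0] * extra  # padding positions are masked out
    return input_ids, attention_mask


if __name__ == "__main__":
    # Placeholder IDs standing in for two WordPiece tokens.
    print(frame_ids([7592, 2088], pad_to=6))
    # An over-long input is cut so the framed length is exactly 512.
    print(len(frame_ids(range(1000, 2000))[0]))
```

With CLS pooling, the vector at the first position (the `[CLS]` token) is the one that becomes the sentence embedding.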

vocab.txt ADDED

The diff for this file is too large to render. See raw diff