---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:2382
- loss:MultipleNegativesRankingLoss
base_model: nomic-ai/nomic-embed-text-v1
widget:
- source_sentence: Collect the details that are associated with product '- Com espessura
    constante de' '- 0,04 m', with quantity 1900, unit M2
  sentences:
  - 'Item Description: UNKNOWN PRODUCT, priced at 949.00 EUR, Origin: National'
  - 'Product: UNKNOWN PRODUCT, Estimated Value: 514.00 EUR'
  - "Details for 'MacBook Pro 14\" Processador M2/3 16GB/18GB RAM | SSD 512GB | Teclado\
    \ Es-Es', with quantity 1, unit UN:\n - LOTE 31\n - Price: 656.00 EUR"
- source_sentence: Collect the details that are associated with Lot 14 product ''
    'Monitor de Sinais Vitais ', with quantity 2, unit Subcontracting Unit
  sentences:
  - "Details for 'Monitor de Sinais Vitais ', with quantity 2, unit Subcontracting\
    \ Unit:\n - LOTE 60\n - Price: 564.00 EUR"
  - "Details for UNKNOWN PRODUCT:\n - LOTE 90\n - Price: 658.00 EUR"
  - 'Item Description: UNKNOWN PRODUCT, priced at 90.00 EUR, Origin: National'
- source_sentence: Collect the details that are associated with product '' '2202000270
    - FIO SUT. AC. POLIGLIC. ABS. RÁPIDA 4/0 MULTIF AG. CILIND. 17 MM 1/2 C (UNID)',
    with quantity 288, unit UN
  sentences:
  - 'Item Description: ''2202000270 - FIO SUT. AC. POLIGLIC. ABS. RÁPIDA 4/0 MULTIF
    AG. CILIND. 17 MM 1/2 C (UNID)'', with quantity 288, unit UN, priced at 66.00
    EUR, Origin: National'
  - 'Product: ''2202000285 - FIO SUT. POLIPROPI. NÃO ABS. 4/0 MONOF. AG. LANC. 16
    MM 3/8 (UNID)'', with quantity 468, unit UN, Estimated Value: 619.00 EUR'
  - 'Item Description: ''Carro transporte de roupa limpa/roupa suja'', with quantity
    1, unit Subcontracting Unit, priced at 574.00 EUR, Origin: National'
- source_sentence: Collect the details that are associated with product '' '2202000006
    - FIO SUT. SEDA NÃO ABS. 0 MULTIF. SEM AGULHA (CART.)', with quantity 72, unit
    UN
  sentences:
  - 'Item Description: ''2202000309 - FIO SUT. ABS. MÉDIO PRAZO 2/0 MONOF. BARBADO,
    C/ AG. CILIND. 30MM 1/2C, 23CM (CART.)'', with quantity 24, unit UN, priced at
    206.00 EUR, Origin: National'
  - "Details for '2202000006 - FIO SUT. SEDA NÃO ABS. 0 MULTIF. SEM AGULHA (CART.)',\
    \ with quantity 72, unit UN:\n - LOTE 82\n - Price: 854.00 EUR"
  - 'LOTE 10

    Description: ''Mesas apoio (anestesia e circulante)'', with quantity 4, unit Subcontracting
    Unit

    Price: 117.00 EUR'
- source_sentence: Collect the details that are associated with product '' '2202000251
    - FIO SUT. ABS. LONGA 1 MONOF. AG. CILIND. 48 MM 1/2C 90CM (CART.)', with quantity
    144, unit UN
  sentences:
  - "Details for UNKNOWN PRODUCT:\n - LOTE 34\n - Price: 477.00 EUR"
  - "Details for '2202000251 - FIO SUT. ABS. LONGA 1 MONOF. AG. CILIND. 48 MM 1/2C\
    \ 90CM (CART.)', with quantity 144, unit UN:\n - LOTE 73\n - Price: 644.00 EUR"
  - 'Item Description: ''Mesas de Mayo'', with quantity 2, unit Subcontracting Unit,
    priced at 651.00 EUR, Origin: National'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on nomic-ai/nomic-embed-text-v1
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: pearson_cosine
      value: .nan
      name: Pearson Cosine
    - type: spearman_cosine
      value: .nan
      name: Spearman Cosine
---
# SentenceTransformer based on nomic-ai/nomic-embed-text-v1
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [nomic-ai/nomic-embed-text-v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [nomic-ai/nomic-embed-text-v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
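
Modules (1) and (2) in the architecture listing (mean pooling over tokens, then normalization) can be sketched in plain Python. This is only an illustration of masked mean pooling followed by L2 normalization, not the library's implementation; `mean_pool` and `l2_normalize` are hypothetical helper names, and the toy token vectors are made up.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the final vector has unit length, cosine similarity between two embeddings reduces to a dot product.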
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub.
# The architecture lists NomicBertModel, which the base model ships as custom code,
# so loading is expected to need trust_remote_code=True.
model = SentenceTransformer("ptpedroVortal/nomic_vortal_v3.4", trust_remote_code=True)

# Run inference
sentences = [
    "Collect the details that are associated with product '' '2202000251 - FIO SUT. ABS. LONGA 1 MONOF. AG. CILIND. 48 MM 1/2C 90CM (CART.)', with quantity 144, unit UN",
    "Details for '2202000251 - FIO SUT. ABS. LONGA 1 MONOF. AG. CILIND. 48 MM 1/2C 90CM (CART.)', with quantity 144, unit UN:\n - LOTE 73\n - Price: 644.00 EUR",
    "Item Description: 'Mesas de Mayo', with quantity 2, unit Subcontracting Unit, priced at 651.00 EUR, Origin: National",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
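
The widget examples pair one query with several candidate nodes, which suggests a retrieval-style use: rank the candidates by similarity to the query. The sketch below shows that ranking step with made-up unit vectors in place of real model output; `rank_candidates` is a hypothetical helper, and it assumes the vectors are already L2-normalized (as the `Normalize()` module produces), so the dot product is the cosine similarity.

```python
def rank_candidates(query_embedding, candidate_embeddings):
    """Return (index, score) pairs sorted by descending dot product.

    Assumes every vector is already L2-normalized, so the dot product
    is the cosine similarity.
    """
    scores = [
        sum(q * c for q, c in zip(query_embedding, candidate))
        for candidate in candidate_embeddings
    ]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Made-up unit vectors standing in for model.encode(...) output.
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.6, 0.8, 0.0],  # partially aligned
    [1.0, 0.0, 0.0],  # identical direction
]

ranking = rank_candidates(query, candidates)
print(ranking)
```

With the real model, `embeddings[0]` from the usage snippet would play the role of `query` and `embeddings[1:]` the role of `candidates`.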
## Evaluation
### Metrics
#### Semantic Similarity
* Evaluated with `__main__.CustomEvaluator`

| Metric | Value |
|:--------------------|:--------|
| pearson_cosine | nan |
| **spearman_cosine** | **nan** |
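
A `nan` correlation is what Pearson's formula yields when one of its two inputs has zero variance. The dataset previews further down show `score` equal to 1 in every sampled row, so constant gold labels are one possible explanation, although this card does not say what `CustomEvaluator` actually computes. The sketch below uses a hand-written `pearson` helper (hypothetical, not the evaluator's code) and made-up similarity values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # 0/0: the correlation is undefined
    return cov / math.sqrt(var_x * var_y)


predicted = [0.91, 0.42, 0.77]  # made-up cosine similarities
constant_labels = [1, 1, 1]     # every gold score is 1
varied_labels = [1, 0, 1]       # labels with some variance

print(pearson(predicted, constant_labels))
print(pearson(predicted, varied_labels))
```

If constant labels are indeed the cause, a ranking metric such as accuracy@k over candidate nodes would be more informative than a correlation for this data.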
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 2,382 training samples
* Columns: `query`, `correct_node`, and `score`
* Approximate statistics based on the first 1000 samples:

| | query | correct_node | score |
|:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-----------------------------|
| type | string | string | int |
| details | | | |

* Samples:

| query | correct_node | score |
|:------|:-------------|:------|
| Collect the details that are associated with product '' '2202000275 - FIO SUT. POLIAMIDA NÃO ABS. 2/0 MONOF AG. CILIND. 30MM 1/2 LOOP (UNID)', with quantity 216, unit UN | LOTE 98<br>Description: '2202000275 - FIO SUT. POLIAMIDA NÃO ABS. 2/0 MONOF AG. CILIND. 30MM 1/2 LOOP (UNID)', with quantity 216, unit UN<br>Price: 940.00 EUR | 1 |
| Collect the details that are associated with product '' '2202000294 - FIO SUT. AC. POLIGLIC. ABS. 2/0 MULTIF SEM AGULHA PRÉ CORTADO (UNID)', with quantity 324, unit UN | Product: '2202000294 - FIO SUT. AC. POLIGLIC. ABS. 2/0 MULTIF SEM AGULHA PRÉ CORTADO (UNID)', with quantity 324, unit UN, Estimated Value: 696.00 EUR | 1 |
| Collect the details that are associated with Lot 4 product '' 'Mesas de Mayo', with quantity 2, unit Subcontracting Unit | LOTE 44<br>Description: 'Mesas de Mayo', with quantity 2, unit Subcontracting Unit<br>Price: 542.00 EUR | 1 |

* Loss: [`MultipleNegativesRankingLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
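
As a rough illustration of what these parameters control: `MultipleNegativesRankingLoss` treats each `(query, correct_node)` pair as a positive and uses the other `correct_node` entries in the same batch as negatives. Cosine similarities are multiplied by `scale` and passed to a cross-entropy whose target for row `i` is column `i`. The plain-Python sketch below is a simplified version of that idea, not the library code; `mnr_loss` is a hypothetical name and the 2-dimensional vectors are made up.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(queries, positives, scale=20.0):
    """Mean cross-entropy where row i should pick column i (its own positive)."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, pos) for pos in positives]
        # log-sum-exp with the max subtracted for numerical stability
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(value - top) for value in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive is close to its own query
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive is close to the other query

print(mnr_loss(queries, aligned))
print(mnr_loss(queries, swapped))
```

The loss is near zero when each query is closest to its own positive and large when the pairs are swapped. This also explains why the `no_duplicates` batch sampler listed under the hyperparameters suits this loss: a repeated text within a batch would otherwise be counted as a negative for its own pair.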
### Evaluation Dataset
#### Unnamed Dataset
* Size: 297 evaluation samples
* Columns: `query`, `correct_node`, and `score`
* Approximate statistics based on the first 297 samples:

| | query | correct_node | score |
|:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------|
| type | string | string | int |
| details | | | |

* Samples:

| query | correct_node | score |
|:------|:-------------|:------|
| Collect the details that are associated with Lot 7 product '' 'Carro transporte de roupa suja', with quantity 1, unit Subcontracting Unit | Item Description: 'Carro transporte de roupa suja', with quantity 1, unit Subcontracting Unit, priced at 628.00 EUR, Origin: National | 1 |
| Collect the details that are associated with Lot 10 product '' 'Mesas para cirurgia', with quantity 2, unit Subcontracting Unit | Details for 'Mesas para cirurgia', with quantity 2, unit Subcontracting Unit:<br> - LOTE 83<br> - Price: 940.00 EUR | 1 |
| Collect the details that are associated with Lot 1 product '' 'PAINEL MULTIPLO ALERGENOS RESPIRATORIOS ', with quantity 1152, unit UND | Product: 'PAINEL MULTIPLO ALERGENOS RESPIRATORIOS ', with quantity 1152, unit UND, Estimated Value: 714.00 EUR | 1 |

* Loss: [`MultipleNegativesRankingLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 10
- `warmup_ratio`: 0.1
- `bf16`: True
- `load_best_model_at_end`: True
- `batch_sampler`: no_duplicates
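
For a rough sense of scale, the values above imply the step counts computed below. The calculation assumes a single device, no gradient accumulation, and that every epoch yields `ceil(2382 / 16)` batches; the `no_duplicates` sampler can change the batch count slightly, so treat the numbers as estimates rather than the logged values.

```python
import math

train_samples = 2382  # from the training dataset section
batch_size = 16       # per_device_train_batch_size
epochs = 10           # num_train_epochs
warmup_ratio = 0.1    # fraction of all steps spent warming up the learning rate

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the warmup phase covers roughly the first epoch.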
#### All Hyperparameters