SetFit with intfloat/e5-small-v2

This is a SetFit model for text classification. It uses intfloat/e5-small-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
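
The first step can be illustrated with a small sketch of how labeled sentences might be turned into contrastive pairs. This is an illustration under assumptions, not SetFit's actual sampler: the helper name `make_pairs` is hypothetical, and the targets of 1.0 for same-label pairs and 0.0 for different-label pairs follow the usual convention for a cosine similarity loss.

```python
from itertools import combinations

def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled sentences.

    Same-label pairs get target 1.0, different-label pairs get 0.0.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Three sentences taken from the label examples in this card.
texts = [
    "query: Jeg har det godt, tak. Hvad med dig?",
    "query: Måske. Skal vi tale om det senere?",
    "query: Combinado! Vamos marcar um dia. Até mais!",
]
labels = [0, 1, 1]

pairs = make_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

With three sentences this yields three pairs, of which one is a same-label pair.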

Model Details

Model Description

Model Sources

Model Labels

Label 0 examples:
  • 'query: Oi Pedro, você viu o novo filme que estreou semana passada?'
  • 'query: Também gostei muito. Quem sabe podemos assistir juntos na próxima vez.'
  • 'query: Jeg har det godt, tak. Hvad med dig?'
Label 1 examples:
  • 'query: Combinado! Vamos marcar um dia. Até mais!'
  • 'query: Måske. Skal vi tale om det senere?'
  • 'query: Absolument. On se voit ce soir pour fêter ça. À plus tard!'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

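# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("thegenerativegeneration/stay_or_go_conversation_classifier_xs")
# Run inference; keep the "query: " prefix used in the training examples
preds = model("query: 好的,那就先这样,李先生,再见。")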
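
Every label example in this card, and the inference call, starts with `query: `, the input prefix used by E5 embedding models. A small hypothetical helper (`with_query_prefix` is not part of SetFit) can make sure raw text carries that prefix before it is passed to the model:

```python
QUERY_PREFIX = "query: "

def with_query_prefix(text: str) -> str:
    """Return text with the 'query: ' prefix, adding it only if missing."""
    text = text.strip()
    if text.startswith(QUERY_PREFIX):
        return text
    return QUERY_PREFIX + text

inputs = [with_query_prefix(t) for t in ["See you later!", "query: How are you?"]]
print(inputs)
```

The resulting list of strings can then be passed to the loaded model in a single call, for example `model(inputs)`.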

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     2     6.2674   18

Label   Training Sample Count
0       85
1       87
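
Metrics of this kind could be reproduced along the following lines. The sketch assumes word counts come from whitespace splitting and uses a three-sentence toy list taken from the label examples rather than the real training set. A median of 172 integer counts would be a whole or half number, so the reported 6.2674 is likely a mean; the sketch computes both.

```python
from collections import Counter
from statistics import mean, median

# Toy data for illustration; the real training set has 172 samples.
texts = [
    "query: Jeg har det godt, tak. Hvad med dig?",
    "query: Måske. Skal vi tale om det senere?",
    "query: Combinado! Vamos marcar um dia. Até mais!",
]
labels = [0, 1, 1]

# Assumption: a "word" is any whitespace-separated token, prefix included.
word_counts = [len(text.split()) for text in texts]

stats = {
    "min": min(word_counts),
    "median": median(word_counts),
    "mean": mean(word_counts),
    "max": max(word_counts),
}
label_counts = Counter(labels)

print(stats)
print(dict(label_counts))
```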

Training Hyperparameters

  • batch_size: (4, 1)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (1e-06, 1e-06)
  • head_learning_rate: 8e-06
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.05
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • run_name: intfloat/e5-small-v2
  • eval_max_steps: -1
  • load_best_model_at_end: True
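
Several of the values above are tuples. In SetFit, a tuple sets the value for the embedding fine-tuning phase and the classifier training phase respectively. The sketch below records the listed values in a plain dictionary and splits them by phase. `split_phases` is a hypothetical helper, and the loss and distance metric are kept as strings here, whereas SetFit's `TrainingArguments` expects the corresponding class and function objects.

```python
hyperparameters = {
    "batch_size": (4, 1),
    "num_epochs": (1, 1),
    "max_steps": -1,
    "sampling_strategy": "undersampling",
    "body_learning_rate": (1e-06, 1e-06),
    "head_learning_rate": 8e-06,
    "loss": "CosineSimilarityLoss",        # name only, not the class object
    "distance_metric": "cosine_distance",  # name only, not the function
    "margin": 0.05,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "run_name": "intfloat/e5-small-v2",
    "eval_max_steps": -1,
    "load_best_model_at_end": True,
}

def split_phases(value):
    """Return (embedding_phase, classifier_phase) for a scalar or 2-tuple."""
    if isinstance(value, tuple):
        embedding_value, classifier_value = value
        return embedding_value, classifier_value
    return value, value

embedding_batch, classifier_batch = split_phases(hyperparameters["batch_size"])
print(embedding_batch, classifier_batch)
```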

Training Results

Epoch  Step  Training Loss  Validation Loss
0.0003 1 0.3851 -
0.0135 50 0.3455 -
0.0270 100 0.3359 0.3522
0.0406 150 0.3459 -
0.0541 200 0.3645 0.3221
0.0676 250 0.3264 -
0.0811 300 0.2955 0.2759
0.0946 350 0.2546 -
0.1082 400 0.2253 0.2373
0.1217 450 0.2004 -
0.1352 500 0.3578 0.2318
0.1487 550 0.2628 -
0.1622 600 0.2614 0.2222
0.1758 650 0.2095 -
0.1893 700 0.2345 0.2196
0.2028 750 0.1842 -
0.2163 800 0.1942 0.2326
0.2299 850 0.218 -
0.2434 900 0.3134 0.2422
0.2569 950 0.1639 -
0.2704 1000 0.2138 0.23
0.2839 1050 0.3102 -
0.2975 1100 0.1347 0.2348
0.3110 1150 0.1698 -
0.3245 1200 0.2467 0.2547
0.3380 1250 0.1064 -
0.3515 1300 0.1757 0.2383
0.3651 1350 0.1093 -
0.3786 1400 0.2869 0.2393
0.3921 1450 0.2519 -
0.4056 1500 0.2344 0.2323
0.4191 1550 0.2804 -
0.4327 1600 0.1082 0.2403
0.4462 1650 0.2025 -
0.4597 1700 0.2213 0.2547
0.4732 1750 0.1302 -
0.4867 1800 0.1517 0.2345
0.5003 1850 0.2779 -
0.5138 1900 0.1918 0.2339
0.5273 1950 0.1132 -
0.5408 2000 0.2075 0.253
0.5544 2050 0.2488 -
0.5679 2100 0.0579 0.2526
0.5814 2150 0.3789 -
0.5949 2200 0.167 0.2573
0.6084 2250 0.199 -
0.6220 2300 0.0824 0.2258
0.6355 2350 0.1396 -
0.6490 2400 0.3674 0.2527
0.6625 2450 0.2448 -
0.6760 2500 0.1623 0.249
0.6896 2550 0.2198 -
0.7031 2600 0.118 0.2613
0.7166 2650 0.1511 -
0.7301 2700 0.1162 0.2351
0.7436 2750 0.1393 -
0.7572 2800 0.1845 0.2418
0.7707 2850 0.1821 -
0.7842 2900 0.1762 0.254
0.7977 2950 0.0477 -
0.8112 3000 0.1928 0.2633
0.8248 3050 0.1363 -
0.8383 3100 0.0811 0.261
0.8518 3150 0.0734 -
0.8653 3200 0.0917 0.2202
0.8789 3250 0.3027 -
0.8924 3300 0.1528 0.2767
0.9059 3350 0.2234 -
0.9194 3400 0.1048 0.2667
0.9329 3450 0.1865 -
0.9465 3500 0.051 0.2612
0.9600 3550 0.0218 -
0.9735 3600 0.1524 0.243
0.9870 3650 0.1759 -
  • The bold row denotes the saved checkpoint.
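
Because `load_best_model_at_end` is enabled, one of the evaluated steps was kept as the final model. The source does not state the selection metric, so the sketch below simply assumes the lowest validation loss and scans a handful of rows copied from the table. `best_step` is a hypothetical helper.

```python
# (step, validation_loss) rows copied from the table above; a subset only.
rows = [
    (100, 0.3522),
    (600, 0.2222),
    (700, 0.2196),
    (2300, 0.2258),
    (3200, 0.2202),
    (3600, 0.2430),
]

def best_step(rows):
    """Return the (step, loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])

step, loss = best_step(rows)
print(f"lowest validation loss {loss} at step {step}")
```

Across the full table, the minimum validation loss is likewise 0.2196 at step 700.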

Framework Versions

  • Python: 3.10.11
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.39.0
  • PyTorch: 2.3.1
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 33.4M params (Safetensors, tensor type F32)

Model repository: thegenerativegeneration/stay_or_go_conversation_classifier_xs