SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
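
As a rough illustration of step 1, the sketch below builds the kind of labeled sentence pairs that contrastive fine-tuning consumes: pairs with the same label get a target of 1.0, pairs with different labels get 0.0. It covers pair construction only, not the fine-tuning itself or the classification head. The function name `build_contrastive_pairs` and the four example texts are made up for illustration and are not taken from the training data.

```python
from itertools import combinations
from typing import List, Tuple


def build_contrastive_pairs(
    texts: List[str], labels: List[str]
) -> List[Tuple[str, str, float]]:
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label, otherwise 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data, two examples per label.
texts = ["love it", "works great", "keeps crashing", "totally broken"]
labels = ["peak", "peak", "pit", "pit"]

pairs = build_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

Even this small set of four texts yields six pairs, which is why a few labeled examples per class can provide a useful amount of training signal for the embedding model.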

Model Details

Model Description

Model Sources

Model Labels

Label Examples
neither
  • 'product cloud fails to cash in on product - as enterprises optimize cloud spending, product has registered its slowest growth in three years.'
  • 'what do those things have to do with product? and its funny youre trying to argue facts by bringing your god into this.'
  • 'your question didn't mean what you think it meant. it answered correctly to your question, which i also read as "hey brand, can you forget my loved ones?"'
peak
  • 'chatbrandandme product brand product dang, my product msftadvertising experience is already so smooth and satisfying wow. they even gave me a free landing page for my product and product. i love msftadvertising and product for buying out brand and making gpt my best friend even more'
  • 'i asked my physics teacher for help on a question i didnt understand on a test and she sent me back a 5 slide product with audio explaining each part of the question. she 100% is my fav teacher now.'
  • 'brand!! it helped me finish my resume. i just asked it if it could write my resume based on horribly written descriptions i came up with. and it made it all pretty:)'
pit
  • 'do not upgrade to product, it is a complete joke of an operating system. all of my xproduct programs are broken, none of my gpus work correctly, even after checking the bios and drivers, and now file explorer crashes upon startup, basically locking up the whole computer!'
  • 'yes, and it would be great if product stops changing the format of data from other sources automatically, that is really annoying when 10-1-2 becomes "magically and wrongly" 2010/01/02. we are in the age of data and product just cannot handle them well..'
  • 'it's a pity that the product doesn't work such as the "normal chat" does, but with 18,000 chars lim. hopefully, the will aim to make such upgrade, although more memory costly.'

Evaluation

Metrics

Label: all
  • Accuracy: 0.7876
  • F1: [0.3720930232558139, 0.4528301886792453, 0.8720379146919431]
  • Precision: [0.23529411764705882, 0.3, 0.9945945945945946]
  • Recall: [0.8888888888888888, 0.9230769230769231, 0.7763713080168776]
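
The F1, precision, and recall entries each hold three values, one per class. The card does not state which class each position refers to, so the sketch below makes no assumption about the order. It only checks that each reported F1 is the harmonic mean of the precision and recall at the same position.

```python
import math

# Values copied from the metrics above; the class order is not stated in the card.
precision = [0.23529411764705882, 0.3, 0.9945945945945946]
recall = [0.8888888888888888, 0.9230769230769231, 0.7763713080168776]
reported_f1 = [0.3720930232558139, 0.4528301886792453, 0.8720379146919431]


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


computed_f1 = [f1_score(p, r) for p, r in zip(precision, recall)]

for position, (computed, reported) in enumerate(zip(computed_f1, reported_f1)):
    matches = math.isclose(computed, reported, abs_tol=1e-9)
    print(f"class {position}: computed={computed:.4f} reported={reported:.4f} match={matches}")
```

Two of the three classes combine high recall with low precision (0.24 and 0.30), so a prediction of either of those labels is wrong more often than it is right, even though overall accuracy is 0.7876.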

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("jamiehudson/725_32batch_150_sample")
# Run inference
preds = model("product the way it shows the sources is so fucking cool, this new ai is amazing")
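
To classify several texts at once, the loaded model can be called with a list of strings instead of a single string. The helper below is a minimal sketch of that pattern that also tallies the predicted labels. It assumes the model returns one label per input. `count_predictions` and `fake_predict` are hypothetical names, and `fake_predict` is only a stand-in so the sketch runs without downloading the model.

```python
from collections import Counter
from typing import Callable, Dict, Sequence


def count_predictions(
    predict: Callable[[list], Sequence], texts: Sequence[str]
) -> Dict[str, int]:
    """Classify all texts in one batch call and tally the predicted labels."""
    predictions = predict(list(texts))
    return dict(Counter(str(label) for label in predictions))


# Stand-in for the real model (hypothetical keyword rule, for offline use only).
def fake_predict(texts):
    return ["pit" if "crash" in text else "neither" for text in texts]


counts = count_predictions(
    fake_predict,
    ["it crashes on startup", "release notes are out"],
)
print(counts)

# With the real model loaded as shown above, pass it in place of fake_predict:
# counts = count_predictions(model, ["first text", "second text"])
```

If the model returns tensors rather than label strings, the `str(label)` conversion would need to be replaced with a lookup into the model's label names.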

Training Details

Training Set Metrics

Training set    Min    Median     Max
Word count      9      37.1711    98

Label      Training Sample Count
pit        150
peak       150
neither    150

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
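
The sample counts and hyperparameters above are enough to estimate how many optimizer steps one epoch of contrastive fine-tuning takes. The sketch below assumes that the oversampling strategy draws equally many positive (same label) and negative (different label) pairs, repeating the smaller set until it matches the larger one, and that pairs are unordered. This is an interpretation of the setting, not code taken from SetFit.

```python
import math

# Values from the training set metrics and hyperparameters above.
samples_per_label = {"pit": 150, "peak": 150, "neither": 150}
batch_size = 32

# Unordered pairs of texts that share a label.
positive_pairs = sum(math.comb(n, 2) for n in samples_per_label.values())

# All remaining unordered pairs mix two different labels.
total_samples = sum(samples_per_label.values())
negative_pairs = math.comb(total_samples, 2) - positive_pairs

# Assumption: oversampling repeats the smaller set to match the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)

print(f"positive pairs:  {positive_pairs}")
print(f"negative pairs:  {negative_pairs}")
print(f"pairs per epoch: {pairs_per_epoch}")
print(f"steps per epoch: {steps_per_epoch}")
```

Under these assumptions one epoch takes 4219 steps, which is consistent with the training results, where step 4200 is logged at epoch 0.9955.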

Training Results

Epoch Step Training Loss Validation Loss
0.0000 1 0.2383 -
0.0119 50 0.2395 -
0.0237 100 0.2129 -
0.0356 150 0.1317 -
0.0474 200 0.0695 -
0.0593 250 0.01 -
0.0711 300 0.0063 -
0.0830 350 0.0028 -
0.0948 400 0.0026 -
0.1067 450 0.0021 -
0.1185 500 0.0018 -
0.1304 550 0.0016 -
0.1422 600 0.0014 -
0.1541 650 0.0015 -
0.1659 700 0.0013 -
0.1778 750 0.0012 -
0.1896 800 0.0012 -
0.2015 850 0.0012 -
0.2133 900 0.0011 -
0.2252 950 0.0011 -
0.2370 1000 0.0009 -
0.2489 1050 0.001 -
0.2607 1100 0.0009 -
0.2726 1150 0.0008 -
0.2844 1200 0.0008 -
0.2963 1250 0.0009 -
0.3081 1300 0.0008 -
0.3200 1350 0.0007 -
0.3318 1400 0.0007 -
0.3437 1450 0.0007 -
0.3555 1500 0.0006 -
0.3674 1550 0.0007 -
0.3792 1600 0.0007 -
0.3911 1650 0.0008 -
0.4029 1700 0.0006 -
0.4148 1750 0.0006 -
0.4266 1800 0.0006 -
0.4385 1850 0.0006 -
0.4503 1900 0.0006 -
0.4622 1950 0.0006 -
0.4740 2000 0.0006 -
0.4859 2050 0.0005 -
0.4977 2100 0.0006 -
0.5096 2150 0.0006 -
0.5215 2200 0.0005 -
0.5333 2250 0.0005 -
0.5452 2300 0.0005 -
0.5570 2350 0.0006 -
0.5689 2400 0.0005 -
0.5807 2450 0.0005 -
0.5926 2500 0.0006 -
0.6044 2550 0.0006 -
0.6163 2600 0.0005 -
0.6281 2650 0.0005 -
0.6400 2700 0.0005 -
0.6518 2750 0.0005 -
0.6637 2800 0.0005 -
0.6755 2850 0.0005 -
0.6874 2900 0.0005 -
0.6992 2950 0.0004 -
0.7111 3000 0.0004 -
0.7229 3050 0.0004 -
0.7348 3100 0.0005 -
0.7466 3150 0.0005 -
0.7585 3200 0.0005 -
0.7703 3250 0.0004 -
0.7822 3300 0.0004 -
0.7940 3350 0.0004 -
0.8059 3400 0.0004 -
0.8177 3450 0.0004 -
0.8296 3500 0.0004 -
0.8414 3550 0.0004 -
0.8533 3600 0.0004 -
0.8651 3650 0.0004 -
0.8770 3700 0.0004 -
0.8888 3750 0.0004 -
0.9007 3800 0.0004 -
0.9125 3850 0.0004 -
0.9244 3900 0.0005 -
0.9362 3950 0.0004 -
0.9481 4000 0.0004 -
0.9599 4050 0.0004 -
0.9718 4100 0.0004 -
0.9836 4150 0.0004 -
0.9955 4200 0.0004 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.5.1
  • Transformers: 4.38.1
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 109M params (Safetensors, tensor type F32)