SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
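
As a rough illustration of step 1, the sketch below builds labelled sentence pairs from a handful of examples: pairs that share a label get a target similarity of 1.0, all others 0.0. The function name `make_contrastive_pairs` is hypothetical and this is not SetFit's own sampler, only a minimal picture of what data the contrastive stage trains on.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    Hypothetical helper: target is 1.0 when both texts share a label,
    0.0 otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Three examples taken from the label table of this model card.
texts = [
    "What is 8 times 9? It's 72.",
    "How do you find the area of a rectangle? Multiply the length by the width.",
    "Who painted the Mona Lisa? Leonardo da Vinci painted it.",
]
labels = ["Math", "Math", "Art"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a[:30], "|", text_b[:30])
```

In step 2, the fine-tuned embedding model encodes each training text once, and the classification head is fitted on those embeddings.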

Model Details

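Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-small-en-v1.5
  • Classification head: a LogisticRegression instance
  • Number of Classes: 7

Model Sources

  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)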

Model Labels

Label Examples
English
  • "Can you tell me about your favorite book? I love 'Harry Potter' because it's full of magic and adventure."
  • 'What did you learn about poems today? We learned about rhymes and how they create a rhythm in poems.'
  • "Can you make a sentence using the word 'enigmatic'? The old man's smile was enigmatic, making me wonder what secrets he hid."
Math
  • "What is 8 times 9? It's 72."
  • 'How do you find the area of a rectangle? Multiply the length by the width.'
  • "What's the difference between a prime number and a composite number? A prime number has only two factors, 1 and itself, while a composite number has more than two factors."
Art
  • 'What colors do you mix to make green? Yellow and blue make green.'
  • 'Who painted the Mona Lisa? Leonardo da Vinci painted it.'
  • "What's the difference between sculpture and pottery? Sculpture is the art of making figures while pottery is specifically making vessels from clay."
Science
  • "What is photosynthesis? It's the process by which plants make their food using sunlight."
  • 'Can you name the planets in our solar system? Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.'
  • "What's the difference between a solid and a liquid? A solid has a fixed shape while a liquid takes the shape of its container."
History
  • 'Who was the first president of the United States? George Washington was the first president.'
  • 'Can you tell me about the Egyptian pyramids? They were massive tombs built for pharaohs, the biggest is the Pyramid of Giza.'
  • 'What was the Renaissance? It was a period of great cultural and scientific advancement in Europe.'
Technology
  • "What is the Internet? It's a global network of computers that can share information."
  • 'Can you name a famous computer scientist? Alan Turing is known as one of the fathers of computer science.'
  • "What does 'AI' stand for? It stands for Artificial Intelligence."
NONE
  • 'What did you have for lunch today? I had a sandwich and some fruit.'
  • 'Do you like playing outside? Yes, I love playing soccer with my friends.'
  • "What's your favorite TV show? I love watching 'SpongeBob SquarePants'."

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bew/setfit-subject-model-basic")
# Run inference
preds = model("Who was Cleopatra? She was a queen of ancient Egypt.")
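
The label examples in this card all look like a question followed by its answer in a single string. Assuming that is the expected input format, a batch of question/answer pairs could be classified along these lines. `format_example` and `classify` are hypothetical helper names; `SetFitModel.from_pretrained` and `predict` are the SetFit calls.

```python
def format_example(question, answer):
    """Join a question and its answer into one input string.

    Assumption: this mirrors the "question answer" shape of the label
    examples shown in this card.
    """
    return f"{question.strip()} {answer.strip()}"


def classify(pairs, model_id="bew/setfit-subject-model-basic"):
    """Predict a subject label for each (question, answer) pair."""
    # Imported lazily so the formatting helper works without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained(model_id)
    texts = [format_example(question, answer) for question, answer in pairs]
    return list(zip(texts, model.predict(texts)))


example = format_example(" Who was Cleopatra? ", "She was a queen of ancient Egypt.")
print(example)
```

Calling `classify([("Who was Cleopatra?", "She was a queen of ancient Egypt.")])` downloads the model from the Hub on first use.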

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     6     14.1333   30

Label        Training Sample Count
Art          10
English      10
History      10
Math         10
NONE         15
Science      10
Technology   10

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
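
For anyone wanting to reproduce a similar run, the values above could be passed to SetFit's `TrainingArguments` roughly as follows. This is a sketch, not the author's training script: the `HYPERPARAMETERS` dict and `build_training_arguments` function are hypothetical names, and `distance_metric` is left at its default because it is only used by triplet-style losses. Where a value is a tuple, the first element applies to the embedding fine-tuning phase and the second to the classifier phase.

```python
# Values copied from the hyperparameter list in this card.
HYPERPARAMETERS = {
    "batch_size": (32, 32),
    "num_epochs": (10, 10),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create a setfit.TrainingArguments object from the values above."""
    # Imported lazily so the dict can be inspected without setfit installed.
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    return TrainingArguments(loss=CosineSimilarityLoss, **HYPERPARAMETERS)
```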

Training Results

Epoch Step Training Loss Validation Loss
0.0067 1 0.1987 -
0.3333 50 0.1814 -
0.6667 100 0.128 -
1.0 150 0.0146 -
1.3333 200 0.006 -
1.6667 250 0.0037 -
2.0 300 0.0031 -
2.3333 350 0.0027 -
2.6667 400 0.0024 -
3.0 450 0.0024 -
3.3333 500 0.002 -
3.6667 550 0.002 -
4.0 600 0.0017 -
4.3333 650 0.0019 -
4.6667 700 0.0018 -
5.0 750 0.0014 -
5.3333 800 0.0013 -
5.6667 850 0.0014 -
6.0 900 0.0014 -
6.3333 950 0.0014 -
6.6667 1000 0.0016 -
7.0 1050 0.0013 -
7.3333 1100 0.0013 -
7.6667 1150 0.0012 -
8.0 1200 0.0014 -
8.3333 1250 0.001 -
8.6667 1300 0.0012 -
9.0 1350 0.0014 -
9.3333 1400 0.0012 -
9.6667 1450 0.0012 -
10.0 1500 0.0011 -
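
The step counts in this table can be cross-checked against the training set sizes and hyperparameters reported above. The sketch below assumes that contrastive pairs are unordered pairs of distinct training samples and that the oversampling strategy repeats the smaller group (positive or negative pairs) until both groups are the same size. Under those assumptions the arithmetic gives 150 steps per epoch, matching the table.

```python
from math import comb

# Per-label sample counts and settings as reported in this card.
label_counts = {
    "Art": 10, "English": 10, "History": 10, "Math": 10,
    "NONE": 15, "Science": 10, "Technology": 10,
}
batch_size = 32
num_epochs = 10

num_samples = sum(label_counts.values())

# Positive pairs: both samples share a label.
positive_pairs = sum(comb(count, 2) for count in label_counts.values())
# Negative pairs: every remaining unordered pair.
negative_pairs = comb(num_samples, 2) - positive_pairs

# Assumed oversampling rule: balance both groups to the larger size.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)

# Ceiling division: a partial final batch still counts as a step.
steps_per_epoch = -(-pairs_per_epoch // batch_size)
total_steps = steps_per_epoch * num_epochs

print(num_samples, positive_pairs, negative_pairs)
print(pairs_per_epoch, steps_per_epoch, total_steps)
```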

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.3.1
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.17.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}