SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head on features from the fine-tuned Sentence Transformer.
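To make the contrastive fine-tuning step concrete, here is a minimal sketch of how a handful of labeled examples can be turned into sentence pairs: two texts with the same label get a target similarity of 1.0, and two texts with different labels get 0.0. The function name `make_pairs` and the toy data are illustrative assumptions, not SetFit's internal code; the library's own sampler also takes care of balancing and shuffling.

```python
from itertools import combinations

def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled examples."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        # Same label -> similar (1.0); different labels -> dissimilar (0.0).
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Toy data borrowed from the label examples in this card.
texts = [
    "Go one floor up",
    "Take me up two floors",
    "Stop the elevator.",
    "No, not that floor.",
]
labels = ["RequestMoveUp", "RequestMoveUp", "Stop", "Stop"]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))  # 2 4
```

Even four examples yield six training pairs, which is why this kind of fine-tuning can work with only a few labeled samples per class.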

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 8

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
RequestMoveToFloor	'Please go to the 3rd floor.' 'Can you take me to floor 5?' 'I need to go to the 8th floor.'
RequestMoveUp	'Go one floor up' 'Take me up two floors' 'Go up three floors, please'
RequestMoveDown	'Move me down one level' 'Can you take me down two floors?' 'Go down three levels'
Confirm	"Yes, that's right." 'Sure.' 'Exactly.'
RequestEmployeeLocation	'Where is Erik Velldal’s office?' 'Which floor is Andreas Austeng on?' 'Can you tell me where Birthe Soppe’s office is?'
CurrentFloor	'Which floor are we on?' 'What floor is this?' 'Are we on the 5th floor?'
Stop	'Stop the elevator.' "Wait, don't go to that floor." 'No, not that floor.'
OutOfCoverage	"What's the capital of France?" 'How many floors does this building have?' 'Can you make a phone call for me?'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier-3")
# Run inference
preds = model("Okay, go ahead.")
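In an application, the predicted label usually decides what happens next. The sketch below shows one possible way to route a prediction to a handler. The names `route`, `HANDLERS`, `fallback`, and `stub_classify` are hypothetical, and the stub stands in for the loaded model so that the example runs without downloading anything. It assumes the classifier returns label names as strings, as listed under Model Labels.

```python
def route(text, classify, handlers, fallback):
    """Classify `text` and call the handler registered for the predicted label."""
    label = str(classify(text))
    handler = handlers.get(label, fallback)
    return handler(text)

# Hypothetical handlers; a real system would talk to the elevator controller.
HANDLERS = {
    "Confirm": lambda text: "confirmed",
    "Stop": lambda text: "stopping",
    "CurrentFloor": lambda text: "reporting current floor",
}

def fallback(text):
    """Used for labels without a handler, such as OutOfCoverage."""
    return "sorry, I cannot help with that"

def stub_classify(text):
    """Demonstration stub only; replace with the loaded SetFit model."""
    return "Stop" if "stop" in text.lower() else "OutOfCoverage"

print(route("Stop the elevator.", stub_classify, HANDLERS, fallback))
print(route("What's the capital of France?", stub_classify, HANDLERS, fallback))
```

With the real model, `classify` could be the `model` object itself, since the inference snippet above calls it directly on a string.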

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	5.2118	9

Label	Training Sample Count
Confirm	22
CurrentFloor	21
OutOfCoverage	22
RequestEmployeeLocation	22
RequestMoveDown	20
RequestMoveToFloor	23
RequestMoveUp	20
Stop	20
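These counts determine how many sentence pairs are available for contrastive fine-tuning. The sketch below counts unique unordered pairs and then estimates the number of steps per epoch at batch size 32 (the value listed under Training Hyperparameters). It assumes that the `oversampling` strategy repeats the rarer pair type until both types are equally represented; that rule is an assumption about the sampler, not something this card states, and the variable names are illustrative.

```python
from math import ceil, comb

counts = {
    "Confirm": 22,
    "CurrentFloor": 21,
    "OutOfCoverage": 22,
    "RequestEmployeeLocation": 22,
    "RequestMoveDown": 20,
    "RequestMoveToFloor": 23,
    "RequestMoveUp": 20,
    "Stop": 20,
}

total = sum(counts.values())
# Pairs of two different samples that share a label.
positive_pairs = sum(comb(n, 2) for n in counts.values())
# Every other pair of samples has two different labels.
negative_pairs = comb(total, 2) - positive_pairs

# Assumed oversampling rule: draw as many of the rarer type as of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / 32)

print(total, positive_pairs, negative_pairs, steps_per_epoch)  # 170 1726 12639 790
```

Under these assumptions the estimate is 790 steps per epoch, which agrees with the Training Results table, where step 7900 corresponds to epoch 10.0.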

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
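Several of the values above are pairs. In SetFit's `TrainingArguments`, a tuple gives one value for the embedding fine-tuning phase and one for the classifier training phase. The sketch below splits a subset of the listed values by phase; the dictionary and the helper `for_phase` are illustrative names, not part of the SetFit API.

```python
hyperparameters = {
    "batch_size": (32, 32),
    "num_epochs": (10, 10),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
    "seed": 42,
}

def for_phase(params, phase):
    """Return the values that apply to `phase` ("embedding" or "classifier")."""
    index = {"embedding": 0, "classifier": 1}[phase]
    return {
        # Tuples hold (embedding value, classifier value); scalars apply as-is.
        name: value[index] if isinstance(value, tuple) else value
        for name, value in params.items()
    }

print(for_phase(hyperparameters, "embedding"))
print(for_phase(hyperparameters, "classifier"))
```

Because this model uses a LogisticRegression head, settings that only affect a differentiable PyTorch head, such as `head_learning_rate`, most likely had no effect on this training run.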

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0013	1	0.195	-
0.0633	50	0.1877	-
0.1266	100	0.1592	-
0.1899	150	0.1141	-
0.2532	200	0.0603	-
0.3165	250	0.0283	-
0.3797	300	0.0104	-
0.4430	350	0.0043	-
0.5063	400	0.0027	-
0.5696	450	0.0021	-
0.6329	500	0.0017	-
0.6962	550	0.0015	-
0.7595	600	0.0011	-
0.8228	650	0.001	-
0.8861	700	0.0011	-
0.9494	750	0.0008	-
1.0127	800	0.0007	-
1.0759	850	0.0006	-
1.1392	900	0.0006	-
1.2025	950	0.0005	-
1.2658	1000	0.0005	-
1.3291	1050	0.0005	-
1.3924	1100	0.0004	-
1.4557	1150	0.0004	-
1.5190	1200	0.0004	-
1.5823	1250	0.0004	-
1.6456	1300	0.0004	-
1.7089	1350	0.0003	-
1.7722	1400	0.0003	-
1.8354	1450	0.0003	-
1.8987	1500	0.0003	-
1.9620	1550	0.0003	-
2.0253	1600	0.0003	-
2.0886	1650	0.0003	-
2.1519	1700	0.0003	-
2.2152	1750	0.0003	-
2.2785	1800	0.0003	-
2.3418	1850	0.0002	-
2.4051	1900	0.0002	-
2.4684	1950	0.0002	-
2.5316	2000	0.0002	-
2.5949	2050	0.0002	-
2.6582	2100	0.0002	-
2.7215	2150	0.0002	-
2.7848	2200	0.0002	-
2.8481	2250	0.0002	-
2.9114	2300	0.0002	-
2.9747	2350	0.0002	-
3.0380	2400	0.0002	-
3.1013	2450	0.0009	-
3.1646	2500	0.0003	-
3.2278	2550	0.0002	-
3.2911	2600	0.0002	-
3.3544	2650	0.0002	-
3.4177	2700	0.0002	-
3.4810	2750	0.0002	-
3.5443	2800	0.0002	-
3.6076	2850	0.0002	-
3.6709	2900	0.0002	-
3.7342	2950	0.0002	-
3.7975	3000	0.0002	-
3.8608	3050	0.0002	-
3.9241	3100	0.0001	-
3.9873	3150	0.0002	-
4.0506	3200	0.0001	-
4.1139	3250	0.0001	-
4.1772	3300	0.0001	-
4.2405	3350	0.0001	-
4.3038	3400	0.0001	-
4.3671	3450	0.0001	-
4.4304	3500	0.0005	-
4.4937	3550	0.0001	-
4.5570	3600	0.0001	-
4.6203	3650	0.0001	-
4.6835	3700	0.0001	-
4.7468	3750	0.0001	-
4.8101	3800	0.0001	-
4.8734	3850	0.0001	-
4.9367	3900	0.0001	-
5.0	3950	0.0001	-
5.0633	4000	0.0001	-
5.1266	4050	0.0001	-
5.1899	4100	0.0001	-
5.2532	4150	0.0001	-
5.3165	4200	0.0001	-
5.3797	4250	0.0001	-
5.4430	4300	0.0001	-
5.5063	4350	0.0001	-
5.5696	4400	0.0001	-
5.6329	4450	0.0001	-
5.6962	4500	0.0001	-
5.7595	4550	0.0001	-
5.8228	4600	0.0001	-
5.8861	4650	0.0001	-
5.9494	4700	0.0001	-
6.0127	4750	0.0001	-
6.0759	4800	0.0001	-
6.1392	4850	0.0001	-
6.2025	4900	0.0001	-
6.2658	4950	0.0001	-
6.3291	5000	0.0001	-
6.3924	5050	0.0001	-
6.4557	5100	0.0001	-
6.5190	5150	0.0001	-
6.5823	5200	0.0001	-
6.6456	5250	0.0001	-
6.7089	5300	0.0001	-
6.7722	5350	0.0001	-
6.8354	5400	0.0001	-
6.8987	5450	0.0001	-
6.9620	5500	0.0001	-
7.0253	5550	0.0001	-
7.0886	5600	0.0001	-
7.1519	5650	0.0001	-
7.2152	5700	0.0001	-
7.2785	5750	0.0001	-
7.3418	5800	0.0001	-
7.4051	5850	0.0001	-
7.4684	5900	0.0001	-
7.5316	5950	0.0001	-
7.5949	6000	0.0001	-
7.6582	6050	0.0001	-
7.7215	6100	0.0001	-
7.7848	6150	0.0001	-
7.8481	6200	0.0001	-
7.9114	6250	0.0001	-
7.9747	6300	0.0001	-
8.0380	6350	0.0001	-
8.1013	6400	0.0001	-
8.1646	6450	0.0001	-
8.2278	6500	0.0001	-
8.2911	6550	0.0001	-
8.3544	6600	0.0001	-
8.4177	6650	0.0001	-
8.4810	6700	0.0001	-
8.5443	6750	0.0001	-
8.6076	6800	0.0001	-
8.6709	6850	0.0001	-
8.7342	6900	0.0001	-
8.7975	6950	0.0001	-
8.8608	7000	0.0001	-
8.9241	7050	0.0001	-
8.9873	7100	0.0001	-
9.0506	7150	0.0001	-
9.1139	7200	0.0001	-
9.1772	7250	0.0001	-
9.2405	7300	0.0001	-
9.3038	7350	0.0001	-
9.3671	7400	0.0001	-
9.4304	7450	0.0001	-
9.4937	7500	0.0001	-
9.5570	7550	0.0001	-
9.6203	7600	0.0001	-
9.6835	7650	0.0001	-
9.7468	7700	0.0001	-
9.8101	7750	0.0001	-
9.8734	7800	0.0001	-
9.9367	7850	0.0001	-
10.0	7900	0.0001	-
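The Epoch column is the step number divided by the number of steps per epoch. The short check below reproduces a few rows. It assumes 790 steps per epoch, inferred from step 7900 falling at epoch 10.0, and it assumes that `warmup_proportion` is applied to the total number of steps; the names are illustrative.

```python
total_steps = 7900
num_epochs = 10
steps_per_epoch = total_steps // num_epochs  # 790

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

# Assumed: warmup covers warmup_proportion (0.1) of all steps.
warmup_steps = round(0.1 * total_steps)

print(epoch_at(1), epoch_at(50), epoch_at(800), warmup_steps)
```

Under that assumption, the learning rate would warm up over roughly the first 790 steps, which is about the first epoch and also the stretch in which the training loss drops from 0.195 to below 0.001.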

Framework Versions

Python: 3.10.8
SetFit: 1.1.0
Sentence Transformers: 3.1.1
Transformers: 4.38.2
PyTorch: 2.1.2
Datasets: 2.17.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
