SetFit with klue/roberta-base

This is a SetFit model for text classification. It uses klue/roberta-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
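
As a rough illustration of the first step, contrastive fine-tuning trains on sentence pairs rather than single examples: pairs that share a label get a target similarity of 1.0, and pairs with different labels get 0.0. The sketch below builds such pairs in plain Python. `make_contrastive_pairs` is a hypothetical helper written for illustration, not SetFit's actual sampler, and the toy texts and labels are made up.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations, seed=42):
    """Return (text_a, text_b, target) triples for contrastive training.

    For every text and every iteration, draw one partner with the same label
    (target 1.0) and one partner with a different label (target 0.0).
    Hypothetical helper for illustration; SetFit's own sampler differs in detail.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, label in enumerate(labels):
            same = [j for j, other in enumerate(labels) if other == label and j != i]
            diff = [j for j, other in enumerate(labels) if other != label]
            if same:
                pairs.append((texts[i], texts[rng.choice(same)], 1.0))
            if diff:
                pairs.append((texts[i], texts[rng.choice(diff)], 0.0))
    return pairs


# Made-up toy data: two classes with two texts each.
toy_texts = ["men's running shirt", "men's sleeveless top", "wool cardigan", "cotton knit"]
toy_labels = [0, 0, 1, 1]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels, num_iterations=2)
print(len(toy_pairs))  # 4 texts * 2 iterations * 2 pairs each = 16
```

The Sentence Transformer body is then tuned so that the cosine similarity of each pair's embeddings moves toward its target, and the classification head is fitted afterwards on the resulting embeddings.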

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: klue/roberta-base
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 4

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1.0	'갤러리아 GUESS Jeans S/S [공용] NO1D0023 M톤 슬림 와이드 미디엄블루_28 갤러리아백화점' '[현대백화점][헤지스남성] 케이블 울 하프 집업 니트 HZSW3D326G2 [00004] 그레이(G2)/110 (주)현대홈쇼핑' '데일리 플랩 항공 점퍼BK BK_110 (주) 패션플러스'
2.0	'스파오 산리오캐릭터즈 수면잠옷BLACKSPPPD4TU03 SPPPD4TU03 19 BLACK_L 100 시그마인터내셔널' 'BYC여성 순면내복내의 베이직여상하2호 BYT6656 베이직여상하_인디안핑크_90 세종유통' 'BYT3842 BYC 데오니아 심플 순면 여자 끈 나시 런닝 검정색_100 에이치앤비 주식회사'
3.0	'[켄지 24SS 최신상] ○ 24SS 오가닉 코튼 100 니트 4종 105 ' '[갤러리아] 울 아가일 배색 가디건(한화갤러리아㈜ 센터시티) 라이트그레이LG82020_66 한화갤러리아(주)' '오우오벨리SET / W3F91ST03 핑크_FR 주식회사 에스에스지닷컴'
0.0	'[현대백화점]엘르이너웨어_ EBMRN713BK 모달에어로웜와플 남런닝BK 95 (주)현대백화점' '비너스(정상) 비너스 면 80수 이합 지그재그 나염 남성 런닝 트렁크 세트_A VMV41 블루(BU)/100_필수선택 (주) 패션플러스' 'JHMRU007 제임스딘 순면 V넥 남성 민소매 머슬 런닝 2_110 도도shop'

Evaluation

Metrics

Label	Metric
all	0.8999

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_item_ap")
# Run inference
preds = model("언더아머 야구 점퍼 1375292-400 S 슈즈스타11")
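
For more than one input, it can help to look at class probabilities rather than only the predicted label. The sketch below assumes the loaded model exposes SetFit's `predict_proba` method. `top_class` is a hypothetical helper, and `StubModel` exists only so the snippet runs without downloading anything. The returned index is a column position in the probability matrix; mapping it to the label values listed under Model Labels is left to the caller.

```python
def top_class(model, texts):
    """Return (column_index, probability) of the most likely class per text.

    `model` is anything with a predict_proba(texts) method that returns one
    row of class probabilities per text, such as a loaded SetFitModel.
    """
    results = []
    for row in model.predict_proba(texts):
        probs = [float(p) for p in row]
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results


class StubModel:
    """Stand-in with fixed, made-up probabilities (not real model output)."""

    def predict_proba(self, texts):
        return [[0.1, 0.7, 0.1, 0.1] for _ in texts]


print(top_class(StubModel(), ["first product title", "second product title"]))

# With the real model (requires `pip install setfit` and network access):
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("mini1013/master_item_ap")
# print(top_class(model, ["언더아머 야구 점퍼 1375292-400 S 슈즈스타11"]))
```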

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	3	9.6403	24

Label	Training Sample Count
0.0	300
1.0	809
2.0	457
3.0	1050

Training Hyperparameters

batch_size: (512, 512)
num_epochs: (20, 20)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 40
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
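
To set up a comparable run, the values above can be passed to SetFit's `TrainingArguments`. The following is a sketch, not the author's training script: `HYPERPARAMETERS` and `build_training_arguments` are names introduced here, and `loss` and `distance_metric` are left out on the assumption that the library defaults match the `CosineSimilarityLoss` and `cosine_distance` listed above. Tuple values are (embedding phase, classifier phase).

```python
# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "batch_size": (512, 512),
    "num_epochs": (20, 20),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 40,
    "body_learning_rate": (2e-05, 2e-05),
    "head_learning_rate": 2e-05,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create a setfit.TrainingArguments from the values above.

    The import is deferred so the dictionary can be inspected without
    having setfit installed.
    """
    from setfit import TrainingArguments

    return TrainingArguments(**HYPERPARAMETERS)
```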

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0024	1	0.4029	-
0.1222	50	0.3584	-
0.2445	100	0.2822	-
0.3667	150	0.2453	-
0.4890	200	0.1961	-
0.6112	250	0.1677	-
0.7335	300	0.1175	-
0.8557	350	0.0615	-
0.9780	400	0.0308	-
1.1002	450	0.0218	-
1.2225	500	0.0133	-
1.3447	550	0.0058	-
1.4670	600	0.0101	-
1.5892	650	0.002	-
1.7115	700	0.0022	-
1.8337	750	0.0023	-
1.9560	800	0.0041	-
2.0782	850	0.0057	-
2.2005	900	0.0001	-
2.3227	950	0.0029	-
2.4450	1000	0.0032	-
2.5672	1050	0.004	-
2.6895	1100	0.0021	-
2.8117	1150	0.0033	-
2.9340	1200	0.002	-
3.0562	1250	0.002	-
3.1785	1300	0.0019	-
3.3007	1350	0.0	-
3.4230	1400	0.0019	-
3.5452	1450	0.0	-
3.6675	1500	0.0039	-
3.7897	1550	0.0	-
3.9120	1600	0.0	-
4.0342	1650	0.0002	-
4.1565	1700	0.0049	-
4.2787	1750	0.002	-
4.4010	1800	0.0	-
4.5232	1850	0.0026	-
4.6455	1900	0.0	-
4.7677	1950	0.0	-
4.8900	2000	0.0001	-
5.0122	2050	0.002	-
5.1345	2100	0.002	-
5.2567	2150	0.0	-
5.3790	2200	0.0	-
5.5012	2250	0.0	-
5.6235	2300	0.0	-
5.7457	2350	0.0004	-
5.8680	2400	0.0019	-
5.9902	2450	0.0018	-
6.1125	2500	0.0	-
6.2347	2550	0.0	-
6.3570	2600	0.0	-
6.4792	2650	0.0	-
6.6015	2700	0.002	-
6.7237	2750	0.0009	-
6.8460	2800	0.0	-
6.9682	2850	0.0015	-
7.0905	2900	0.0001	-
7.2127	2950	0.0001	-
7.3350	3000	0.002	-
7.4572	3050	0.0001	-
7.5795	3100	0.0001	-
7.7017	3150	0.0019	-
7.8240	3200	0.0019	-
7.9462	3250	0.0	-
8.0685	3300	0.0001	-
8.1907	3350	0.0038	-
8.3130	3400	0.0	-
8.4352	3450	0.0018	-
8.5575	3500	0.0	-
8.6797	3550	0.0019	-
8.8020	3600	0.0	-
8.9242	3650	0.0	-
9.0465	3700	0.0	-
9.1687	3750	0.0	-
9.2910	3800	0.0	-
9.4132	3850	0.0001	-
9.5355	3900	0.0	-
9.6577	3950	0.0019	-
9.7800	4000	0.0019	-
9.9022	4050	0.0	-
10.0244	4100	0.0001	-
10.1467	4150	0.0	-
10.2689	4200	0.002	-
10.3912	4250	0.0	-
10.5134	4300	0.0	-
10.6357	4350	0.0	-
10.7579	4400	0.0	-
10.8802	4450	0.0	-
11.0024	4500	0.0	-
11.1247	4550	0.0018	-
11.2469	4600	0.0	-
11.3692	4650	0.0	-
11.4914	4700	0.0	-
11.6137	4750	0.0	-
11.7359	4800	0.0019	-
11.8582	4850	0.001	-
11.9804	4900	0.0	-
12.1027	4950	0.0001	-
12.2249	5000	0.0	-
12.3472	5050	0.0	-
12.4694	5100	0.0	-
12.5917	5150	0.0	-
12.7139	5200	0.0	-
12.8362	5250	0.0	-
12.9584	5300	0.0	-
13.0807	5350	0.0001	-
13.2029	5400	0.0001	-
13.3252	5450	0.0	-
13.4474	5500	0.0001	-
13.5697	5550	0.0	-
13.6919	5600	0.0	-
13.8142	5650	0.0	-
13.9364	5700	0.0	-
14.0587	5750	0.0001	-
14.1809	5800	0.0	-
14.3032	5850	0.0	-
14.4254	5900	0.0	-
14.5477	5950	0.0	-
14.6699	6000	0.0	-
14.7922	6050	0.0	-
14.9144	6100	0.0	-
15.0367	6150	0.0	-
15.1589	6200	0.0	-
15.2812	6250	0.0	-
15.4034	6300	0.0	-
15.5257	6350	0.0	-
15.6479	6400	0.0	-
15.7702	6450	0.0	-
15.8924	6500	0.0	-
16.0147	6550	0.0	-
16.1369	6600	0.0	-
16.2592	6650	0.0	-
16.3814	6700	0.0	-
16.5037	6750	0.0	-
16.6259	6800	0.0	-
16.7482	6850	0.0	-
16.8704	6900	0.0	-
16.9927	6950	0.0	-
17.1149	7000	0.0	-
17.2372	7050	0.0	-
17.3594	7100	0.0	-
17.4817	7150	0.0	-
17.6039	7200	0.0	-
17.7262	7250	0.0	-
17.8484	7300	0.0	-
17.9707	7350	0.0	-
18.0929	7400	0.0	-
18.2152	7450	0.0	-
18.3374	7500	0.0	-
18.4597	7550	0.0	-
18.5819	7600	0.0	-
18.7042	7650	0.0	-
18.8264	7700	0.0	-
18.9487	7750	0.0	-
19.0709	7800	0.0	-
19.1932	7850	0.0	-
19.3154	7900	0.0	-
19.4377	7950	0.0	-
19.5599	8000	0.0	-
19.6822	8050	0.0	-
19.8044	8100	0.0	-
19.9267	8150	0.0	-
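
The Epoch and Step columns are consistent with the hyperparameters and sample counts above, under the assumption that each training sample yields `num_iterations` positive and `num_iterations` negative pairs per epoch. The arithmetic below is a sanity check under that assumption, not a statement of how the trainer computes its schedule.

```python
import math

# Sample counts from the Training Sample Count table.
samples_per_label = {0.0: 300, 1.0: 809, 2.0: 457, 3.0: 1050}
num_iterations = 40
batch_size = 512
num_epochs = 20

num_samples = sum(samples_per_label.values())
# Assumed: one positive and one negative pair per sample per iteration.
pairs_per_epoch = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

print(num_samples, pairs_per_epoch, steps_per_epoch, total_steps)

# Epoch value at a given step, rounded to four decimals as in the table.
for step in (1, 50, 8150):
    print(step, round(step / steps_per_epoch, 4))
```

This gives 2,616 samples, 209,280 pairs and 409 steps per epoch, for 8,180 steps over 20 epochs, which agrees with the logged rows (step 50 at epoch 0.1222, step 8150 at epoch 19.9267).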

Framework Versions

Python: 3.10.12
SetFit: 1.1.0.dev0
Sentence Transformers: 3.1.1
Transformers: 4.46.1
PyTorch: 2.4.0+cu121
Datasets: 2.20.0
Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
