distilbert-base-uncased-banking77-classification
This model is a fine-tuned version of distilbert-base-uncased on the banking77 dataset. It achieves the following results on the evaluation set:
- Loss: 0.3152
- Accuracy: 0.9240
- F1 Score: 0.9243
Model description
This is my first fine-tuning experiment using Hugging Face. Using DistilBERT as the pretrained model, I trained a classifier for online banking queries. It could be useful for routing customer support tickets by intent.
Intended uses & limitations
The model can be used for text classification. It is fine-tuned specifically on banking-domain queries, so its behavior on other domains is not covered by the results reported here.
Training and evaluation data
The dataset used is banking77.
The 77 labels are:
label | intent |
---|---|
0 | activate_my_card |
1 | age_limit |
2 | apple_pay_or_google_pay |
3 | atm_support |
4 | automatic_top_up |
5 | balance_not_updated_after_bank_transfer |
6 | balance_not_updated_after_cheque_or_cash_deposit |
7 | beneficiary_not_allowed |
8 | cancel_transfer |
9 | card_about_to_expire |
10 | card_acceptance |
11 | card_arrival |
12 | card_delivery_estimate |
13 | card_linking |
14 | card_not_working |
15 | card_payment_fee_charged |
16 | card_payment_not_recognised |
17 | card_payment_wrong_exchange_rate |
18 | card_swallowed |
19 | cash_withdrawal_charge |
20 | cash_withdrawal_not_recognised |
21 | change_pin |
22 | compromised_card |
23 | contactless_not_working |
24 | country_support |
25 | declined_card_payment |
26 | declined_cash_withdrawal |
27 | declined_transfer |
28 | direct_debit_payment_not_recognised |
29 | disposable_card_limits |
30 | edit_personal_details |
31 | exchange_charge |
32 | exchange_rate |
33 | exchange_via_app |
34 | extra_charge_on_statement |
35 | failed_transfer |
36 | fiat_currency_support |
37 | get_disposable_virtual_card |
38 | get_physical_card |
39 | getting_spare_card |
40 | getting_virtual_card |
41 | lost_or_stolen_card |
42 | lost_or_stolen_phone |
43 | order_physical_card |
44 | passcode_forgotten |
45 | pending_card_payment |
46 | pending_cash_withdrawal |
47 | pending_top_up |
48 | pending_transfer |
49 | pin_blocked |
50 | receiving_money |
51 | Refund_not_showing_up |
52 | request_refund |
53 | reverted_card_payment? |
54 | supported_cards_and_currencies |
55 | terminate_account |
56 | top_up_by_bank_transfer_charge |
57 | top_up_by_card_charge |
58 | top_up_by_cash_or_cheque |
59 | top_up_failed |
60 | top_up_limits |
61 | top_up_reverted |
62 | topping_up_by_card |
63 | transaction_charged_twice |
64 | transfer_fee_charged |
65 | transfer_into_account |
66 | transfer_not_received_by_recipient |
67 | transfer_timing |
68 | unable_to_verify_identity |
69 | verify_my_identity |
70 | verify_source_of_funds |
71 | verify_top_up |
72 | virtual_card_not_working |
73 | visa_or_mastercard |
74 | why_verify_identity |
75 | wrong_amount_of_cash_received |
76 | wrong_exchange_rate_for_cash_withdrawal |
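A minimal sketch of turning rows of the table above into the `id2label` / `label2id` dictionaries that Transformers model configs conventionally carry. Only an excerpt of the table is embedded here, `parse_label_table` is a hypothetical helper rather than a library function, and whether this checkpoint's config already stores these names is not stated in this card.

```python
# Excerpt of the label table above; the full table has 77 rows (ids 0-76).
TABLE_EXCERPT = """\
0 | activate_my_card |
1 | age_limit |
14 | card_not_working |
25 | declined_card_payment |
76 | wrong_exchange_rate_for_cash_withdrawal |
"""


def parse_label_table(text):
    """Hypothetical helper: parse 'id | intent |' rows into {id: intent}."""
    id2label = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip blank lines, the header row, and the '---' separator row.
        if len(cells) < 2 or not cells[0].isdigit():
            continue
        id2label[int(cells[0])] = cells[1]
    return id2label


id2label = parse_label_table(TABLE_EXCERPT)
label2id = {intent: idx for idx, intent in id2label.items()}

print(id2label[14])
print(label2id["declined_card_payment"])
```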
Training procedure
```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub and classify one query.
pipe = pipeline("text-classification", model="nickprock/distilbert-base-uncased-banking77-classification")
pipe("I can't pay by my credit card")
```
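The text-classification pipeline returns a list of dictionaries with `label` and `score` keys. As a rough sketch of the ticket-handling use mentioned in the model description, the function below routes a prediction to a queue and falls back to manual review when confidence is low. The queue names, the intent-to-queue mapping, the 0.5 threshold, and the sample prediction are all illustrative assumptions. It also assumes the labels are the intent names from the table; if the pipeline emits generic `LABEL_<id>` strings instead, map them through the label table first.

```python
# Hypothetical mapping from a few banking77 intents to support queues.
INTENT_TO_QUEUE = {
    "declined_card_payment": "cards",
    "card_not_working": "cards",
    "lost_or_stolen_card": "fraud",
    "compromised_card": "fraud",
    "failed_transfer": "transfers",
}


def route_ticket(predictions, threshold=0.5):
    """Pick a queue from pipeline-style output: [{'label': str, 'score': float}, ...]."""
    if not predictions:
        return "manual_review"
    best = max(predictions, key=lambda p: p["score"])
    # Low-confidence or unmapped intents go to a human instead of a queue.
    if best["score"] < threshold:
        return "manual_review"
    return INTENT_TO_QUEUE.get(best["label"], "manual_review")


# Made-up prediction in the pipeline's output format, for illustration only.
sample = [{"label": "declined_card_payment", "score": 0.91}]
print(route_ticket(sample))
```

In practice, `predictions` would be the return value of `pipe(text)` for a single input string.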
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
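To make the `linear` scheduler concrete, here is a small sketch of the learning rate over the 3140 optimizer steps reported in the training results (157 steps per epoch for 20 epochs). It assumes zero warmup steps, since no warmup setting is listed above, and a decay to zero at the final step. `linear_lr` is an illustrative helper, not a library function.

```python
BASE_LR = 2e-05        # learning_rate from the list above
STEPS_PER_EPOCH = 157  # from the training results table
NUM_EPOCHS = 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 3140


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (assumed to be 0 steps here) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 5, 10, 15, 20):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate is halved to 1e-05 at the midpoint (step 1570, end of epoch 10) and reaches zero at step 3140.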
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Score |
---|---|---|---|---|---|
3.8732 | 1.0 | 157 | 3.1476 | 0.5370 | 0.4881 |
2.5598 | 2.0 | 314 | 1.9780 | 0.6916 | 0.6585 |
1.5863 | 3.0 | 471 | 1.2239 | 0.8042 | 0.7864 |
0.9829 | 4.0 | 628 | 0.8067 | 0.8565 | 0.8487 |
0.6274 | 5.0 | 785 | 0.5837 | 0.8799 | 0.8752 |
0.4304 | 6.0 | 942 | 0.4630 | 0.9042 | 0.9040 |
0.3106 | 7.0 | 1099 | 0.3982 | 0.9088 | 0.9087 |
0.2238 | 8.0 | 1256 | 0.3587 | 0.9110 | 0.9113 |
0.1708 | 9.0 | 1413 | 0.3351 | 0.9208 | 0.9208 |
0.1256 | 10.0 | 1570 | 0.3242 | 0.9179 | 0.9182 |
0.0981 | 11.0 | 1727 | 0.3136 | 0.9211 | 0.9214 |
0.0745 | 12.0 | 1884 | 0.3151 | 0.9211 | 0.9213 |
0.0601 | 13.0 | 2041 | 0.3089 | 0.9218 | 0.9220 |
0.0482 | 14.0 | 2198 | 0.3158 | 0.9214 | 0.9216 |
0.0402 | 15.0 | 2355 | 0.3126 | 0.9224 | 0.9226 |
0.0344 | 16.0 | 2512 | 0.3143 | 0.9231 | 0.9233 |
0.0298 | 17.0 | 2669 | 0.3156 | 0.9231 | 0.9233 |
0.0272 | 18.0 | 2826 | 0.3134 | 0.9244 | 0.9247 |
0.0237 | 19.0 | 2983 | 0.3156 | 0.9244 | 0.9246 |
0.0229 | 20.0 | 3140 | 0.3152 | 0.9240 | 0.9243 |
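The table shows validation loss flattening around 0.31 from epoch 11 onward while training loss keeps falling. As a quick sketch for reading it, the snippet below finds the epoch with the lowest validation loss and shows where a simple early-stopping rule would have halted. The patience of 3 epochs is an assumption chosen for illustration; the run described here trained for the full 20 epochs.

```python
# Validation loss per epoch, copied from the table above (epochs 1-20).
VAL_LOSS = [
    3.1476, 1.9780, 1.2239, 0.8067, 0.5837, 0.4630, 0.3982, 0.3587, 0.3351, 0.3242,
    0.3136, 0.3151, 0.3089, 0.3158, 0.3126, 0.3143, 0.3156, 0.3134, 0.3156, 0.3152,
]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(losses)), key=losses.__getitem__) + 1


def early_stop_epoch(losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None if never triggered."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


print(best_epoch(VAL_LOSS))        # 13
print(early_stop_epoch(VAL_LOSS))  # 16
```

Accuracy peaks slightly later than validation loss bottoms out (0.9244 at epochs 18 and 19), so the choice of monitored metric affects which checkpoint looks best.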
Framework versions
- Transformers 4.20.1
- Pytorch 1.12.0+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1
Evaluation results
- Accuracy on banking77 (self-reported): 0.924
- Accuracy on banking77 test set (self-reported): 0.924
- Precision Macro on banking77 test set (self-reported): 0.928
- Precision Micro on banking77 test set (self-reported): 0.924
- Precision Weighted on banking77 test set (self-reported): 0.928
- Recall Macro on banking77 test set (self-reported): 0.924
- Recall Micro on banking77 test set (self-reported): 0.924
- Recall Weighted on banking77 test set (self-reported): 0.924
- F1 Macro on banking77 test set (self-reported): 0.924
- F1 Micro on banking77 test set (self-reported): 0.924
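For readers unfamiliar with the averaging modes listed above, the following toy sketch computes macro, micro, and weighted precision from scratch. The labels and predictions are invented for illustration and are unrelated to banking77. In single-label multiclass classification, micro-averaged precision equals accuracy, which is consistent with both being reported as 0.924 above.

```python
from collections import Counter

# Toy data, invented for illustration (not from banking77).
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]

labels = sorted(set(y_true) | set(y_pred))
true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
pred_count = Counter(y_pred)  # TP + FP per class
support = Counter(y_true)     # number of true examples per class

# Per-class precision: TP / (TP + FP), taken as 0 when a class is never predicted.
precision = {
    c: (true_pos[c] / pred_count[c] if pred_count[c] else 0.0) for c in labels
}

# Macro: unweighted mean over classes.
macro = sum(precision.values()) / len(labels)
# Weighted: mean over classes, weighted by each class's support.
weighted = sum(precision[c] * support[c] for c in labels) / len(y_true)
# Micro: pooled true positives over pooled predictions.
micro = sum(true_pos.values()) / sum(pred_count.values())
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"macro={macro:.3f} weighted={weighted:.3f} micro={micro:.3f} accuracy={accuracy:.3f}")
```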