bert-large-uncased-sst-2-32-13

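This model is a fine-tuned version of bert-large-uncased. The card does not record the training dataset; the model name suggests SST-2, but that is not confirmed here. It achieves the following results on the evaluation set, which match the final epoch (150) of the training log: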

  • Loss: 0.6858
  • Accuracy: 0.8906

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 150
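
To make the schedule concrete, the sketch below computes the learning rate at a given optimizer step for a linear schedule with warmup, using the values listed above. It assumes 300 total optimizer steps (the last step reported in the training log) and a rate that ramps linearly from 0 to the peak over the warmup steps, then decays linearly to 0. The function name `linear_lr` is hypothetical and not part of any library; this is an illustration of the schedule's shape, not the training code itself.

```python
PEAK_LR = 1e-05      # learning_rate from the list above
WARMUP_STEPS = 50    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 300    # assumption: final step reported in the training log


def linear_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale grows from 0 to 1.
        scale = step / max(1, warmup_steps)
    else:
        # Decay phase: scale shrinks from 1 to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * scale


if __name__ == "__main__":
    for step in (0, 25, 50, 175, 300):
        print(f"step {step:>3}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the peak rate is reached at step 50, which the training log lists as epoch 25, so roughly the first sixth of training runs below the configured learning rate.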

Training results

Columns, in order: Training Loss, Epoch, Step, Validation Loss, Accuracy. "No log" means no training loss had been logged yet at that step.
No log 1.0 2 0.7579 0.4844
No log 2.0 4 0.7556 0.4844
No log 3.0 6 0.7502 0.4844
No log 4.0 8 0.7421 0.4844
0.7491 5.0 10 0.7348 0.5
0.7491 6.0 12 0.7286 0.5
0.7491 7.0 14 0.7228 0.5
0.7491 8.0 16 0.7177 0.5
0.7491 9.0 18 0.7136 0.5
0.7136 10.0 20 0.7095 0.5
0.7136 11.0 22 0.7049 0.5
0.7136 12.0 24 0.6993 0.5
0.7136 13.0 26 0.6926 0.5
0.7136 14.0 28 0.6860 0.5
0.6781 15.0 30 0.6777 0.5
0.6781 16.0 32 0.6691 0.5
0.6781 17.0 34 0.6605 0.5312
0.6781 18.0 36 0.6524 0.5156
0.6781 19.0 38 0.6428 0.5469
0.584 20.0 40 0.6239 0.6562
0.584 21.0 42 0.6124 0.6719
0.584 22.0 44 0.6053 0.6562
0.584 23.0 46 0.5981 0.6875
0.584 24.0 48 0.5707 0.7031
0.439 25.0 50 0.5284 0.7656
0.439 26.0 52 0.5125 0.7812
0.439 27.0 54 0.5117 0.75
0.439 28.0 56 0.4922 0.7656
0.439 29.0 58 0.4698 0.7812
0.2661 30.0 60 0.4417 0.7656
0.2661 31.0 62 0.4234 0.7812
0.2661 32.0 64 0.4309 0.7656
0.2661 33.0 66 0.4503 0.7812
0.2661 34.0 68 0.4344 0.8125
0.1 35.0 70 0.3772 0.8281
0.1 36.0 72 0.3475 0.875
0.1 37.0 74 0.3404 0.875
0.1 38.0 76 0.3334 0.8906
0.1 39.0 78 0.3313 0.9062
0.033 40.0 80 0.3315 0.9062
0.033 41.0 82 0.3340 0.9062
0.033 42.0 84 0.3364 0.9062
0.033 43.0 86 0.3412 0.9062
0.033 44.0 88 0.3509 0.8906
0.0142 45.0 90 0.3588 0.875
0.0142 46.0 92 0.3675 0.875
0.0142 47.0 94 0.3788 0.875
0.0142 48.0 96 0.3957 0.875
0.0142 49.0 98 0.4137 0.875
0.0081 50.0 100 0.4338 0.875
0.0081 51.0 102 0.4507 0.875
0.0081 52.0 104 0.4645 0.8906
0.0081 53.0 106 0.4767 0.8906
0.0081 54.0 108 0.4875 0.8906
0.0048 55.0 110 0.4977 0.8906
0.0048 56.0 112 0.5052 0.8906
0.0048 57.0 114 0.5082 0.8906
0.0048 58.0 116 0.5095 0.8906
0.0048 59.0 118 0.4912 0.875
0.0032 60.0 120 0.4782 0.875
0.0032 61.0 122 0.4720 0.875
0.0032 62.0 124 0.4713 0.875
0.0032 63.0 126 0.4757 0.875
0.0032 64.0 128 0.4820 0.875
0.0021 65.0 130 0.4919 0.875
0.0021 66.0 132 0.5045 0.875
0.0021 67.0 134 0.5175 0.875
0.0021 68.0 136 0.5308 0.875
0.0021 69.0 138 0.5430 0.875
0.0014 70.0 140 0.5544 0.875
0.0014 71.0 142 0.5643 0.8906
0.0014 72.0 144 0.5735 0.8906
0.0014 73.0 146 0.5810 0.8906
0.0014 74.0 148 0.5871 0.8906
0.0011 75.0 150 0.6019 0.8906
0.0011 76.0 152 0.6149 0.8906
0.0011 77.0 154 0.6262 0.8906
0.0011 78.0 156 0.6356 0.8906
0.0011 79.0 158 0.6435 0.8906
0.0007 80.0 160 0.6504 0.8906
0.0007 81.0 162 0.6568 0.8906
0.0007 82.0 164 0.6606 0.8906
0.0007 83.0 166 0.6625 0.8906
0.0007 84.0 168 0.6645 0.8906
0.0006 85.0 170 0.6663 0.8906
0.0006 86.0 172 0.6676 0.8906
0.0006 87.0 174 0.6691 0.8906
0.0006 88.0 176 0.6705 0.8906
0.0006 89.0 178 0.6717 0.8906
0.0006 90.0 180 0.6726 0.8906
0.0006 91.0 182 0.6735 0.8906
0.0006 92.0 184 0.6745 0.8906
0.0006 93.0 186 0.6756 0.8906
0.0006 94.0 188 0.6768 0.8906
0.0005 95.0 190 0.6781 0.8906
0.0005 96.0 192 0.6788 0.8906
0.0005 97.0 194 0.6791 0.8906
0.0005 98.0 196 0.6794 0.8906
0.0005 99.0 198 0.6798 0.8906
0.0004 100.0 200 0.6801 0.8906
0.0004 101.0 202 0.6805 0.8906
0.0004 102.0 204 0.6810 0.8906
0.0004 103.0 206 0.6817 0.8906
0.0004 104.0 208 0.6826 0.8906
0.0004 105.0 210 0.6833 0.8906
0.0004 106.0 212 0.6841 0.8906
0.0004 107.0 214 0.6850 0.8906
0.0004 108.0 216 0.6857 0.8906
0.0004 109.0 218 0.6866 0.8906
0.0004 110.0 220 0.6874 0.8906
0.0004 111.0 222 0.6881 0.8906
0.0004 112.0 224 0.6886 0.8906
0.0004 113.0 226 0.6889 0.8906
0.0004 114.0 228 0.6890 0.8906
0.0003 115.0 230 0.6889 0.8906
0.0003 116.0 232 0.6888 0.8906
0.0003 117.0 234 0.6886 0.8906
0.0003 118.0 236 0.6885 0.8906
0.0003 119.0 238 0.6874 0.8906
0.0003 120.0 240 0.6866 0.8906
0.0003 121.0 242 0.6860 0.8906
0.0003 122.0 244 0.6857 0.8906
0.0003 123.0 246 0.6855 0.8906
0.0003 124.0 248 0.6852 0.8906
0.0003 125.0 250 0.6850 0.8906
0.0003 126.0 252 0.6847 0.8906
0.0003 127.0 254 0.6846 0.8906
0.0003 128.0 256 0.6846 0.8906
0.0003 129.0 258 0.6846 0.8906
0.0003 130.0 260 0.6846 0.8906
0.0003 131.0 262 0.6847 0.8906
0.0003 132.0 264 0.6847 0.8906
0.0003 133.0 266 0.6848 0.8906
0.0003 134.0 268 0.6846 0.8906
0.0003 135.0 270 0.6846 0.8906
0.0003 136.0 272 0.6846 0.8906
0.0003 137.0 274 0.6846 0.8906
0.0003 138.0 276 0.6847 0.8906
0.0003 139.0 278 0.6848 0.8906
0.0003 140.0 280 0.6849 0.8906
0.0003 141.0 282 0.6851 0.8906
0.0003 142.0 284 0.6852 0.8906
0.0003 143.0 286 0.6854 0.8906
0.0003 144.0 288 0.6855 0.8906
0.0003 145.0 290 0.6855 0.8906
0.0003 146.0 292 0.6856 0.8906
0.0003 147.0 294 0.6857 0.8906
0.0003 148.0 296 0.6857 0.8906
0.0003 149.0 298 0.6858 0.8906
0.0003 150.0 300 0.6858 0.8906
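
The log shows validation loss falling to a minimum and then rising while training loss keeps shrinking, which is a typical overfitting pattern. As a rough sketch, the snippet below parses a few rows copied from the table and picks the row with the lowest validation loss. The helper names `parse_row` and `best_by_val_loss` are hypothetical, and only an excerpt of the rows is included for brevity.

```python
# Excerpt of rows copied from the training log above
# (columns: training loss, epoch, step, validation loss, accuracy).
EXCERPT = """
No log 1.0 2 0.7579 0.4844
0.1 38.0 76 0.3334 0.8906
0.1 39.0 78 0.3313 0.9062
0.033 40.0 80 0.3315 0.9062
0.0003 150.0 300 0.6858 0.8906
"""


def parse_row(line):
    """Parse one log row; 'No log' becomes a training loss of None."""
    parts = line.split()
    if parts[:2] == ["No", "log"]:
        train_loss, rest = None, parts[2:]
    else:
        train_loss, rest = float(parts[0]), parts[1:]
    epoch, step, val_loss, accuracy = rest
    return {
        "train_loss": train_loss,
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "accuracy": float(accuracy),
    }


def best_by_val_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


ROWS = [parse_row(line) for line in EXCERPT.strip().splitlines()]

if __name__ == "__main__":
    best, final = best_by_val_loss(ROWS), ROWS[-1]
    print("best :", best)
    print("final:", final)
```

In the full table the minimum is also at epoch 39 (validation loss 0.3313, accuracy 0.9062), whereas the final checkpoint at epoch 150 has validation loss 0.6858 and accuracy 0.8906. Selecting a checkpoint by validation loss, or stopping early, would likely have been worth considering for this run.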

Framework versions

  • Transformers 4.32.0.dev0
  • PyTorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3
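
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The sketch below does this with the standard library's `importlib.metadata`. The package names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI distribution names, and the helper `compare_versions` is hypothetical. A development version such as 4.32.0.dev0 may not be installable from a package index, so a mismatch there is to be expected.

```python
from importlib import metadata

# Versions as listed in this card; distribution names are assumptions.
EXPECTED = {
    "transformers": "4.32.0.dev0",
    "torch": "2.0.1+cu118",
    "datasets": "2.4.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected, lookup=metadata.version):
    """Map each package to (wanted, found, matches); found is None if missing."""
    report = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = None
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {found} ({status})")
```

The `lookup` parameter exists only so the comparison can be exercised without the packages installed; by default it queries the current environment.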