vit-large

This model is a fine-tuned version of google/vit-large-patch16-224-in21k on the cifar100 dataset. After the final (100th) training epoch, it achieves the following results on the evaluation set:

Loss: 0.3301
Accuracy: 0.9309

Model description

More information needed

Intended uses & limitations

More information needed
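Until this section is filled in, the following is a minimal inference sketch rather than documented usage. It assumes the Hub repository jialicheng/cifar100-vit-large ships an image processor config alongside the weights, and that the `transformers` image-classification pipeline can load it. The helper names `classify` and `format_predictions` are hypothetical and introduced only for illustration.

```python
# Sketch only: assumes the Hub repository below contains both the model
# weights and an image processor config (not confirmed by this card).
MODEL_ID = "jialicheng/cifar100-vit-large"


def format_predictions(predictions, digits=4):
    """Turn pipeline output (a list of {"label", "score"} dicts) into text lines."""
    return [f"{p['label']}: {p['score']:.{digits}f}" for p in predictions]


def classify(image, top_k=5):
    """Classify one image (file path, URL, or PIL image) with the fine-tuned ViT."""
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    # The pipeline returns the top_k classes sorted by descending score.
    return classifier(image, top_k=top_k)
```

Calling `format_predictions(classify("path/to/image.png"))` (the path is a placeholder) would download the checkpoint on first use and return one line per predicted class. The label strings depend on the `id2label` mapping stored with the checkpoint, which this card does not describe.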

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 64
eval_batch_size: 256
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
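
To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate it implies at a given optimizer step. The total of 66500 steps comes from the training results table (665 steps per epoch over 100 epochs). The card lists no warmup setting, so `warmup_steps=0` is an assumption, and `linear_lr` is a hypothetical helper rather than a library function.

```python
BASE_LR = 1e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 665    # step count after epoch 1 in the training results
NUM_EPOCHS = 100         # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Assumption: no warmup was used, so this branch is inactive by default.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d} (step {step:5d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 1e-05, is halved by epoch 50, and reaches zero at step 66500.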

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.2884	1.0	665	0.8752	0.8834
0.7958	2.0	1330	0.4724	0.9142
0.743	3.0	1995	0.3750	0.9207
0.6935	4.0	2660	0.3198	0.9236
0.6159	5.0	3325	0.2945	0.9289
0.4423	6.0	3990	0.2876	0.925
0.5506	7.0	4655	0.2617	0.9302
0.5673	8.0	5320	0.2576	0.9324
0.4613	9.0	5985	0.2586	0.9311
0.4179	10.0	6650	0.2555	0.9285
0.4438	11.0	7315	0.2554	0.9316
0.4869	12.0	7980	0.2564	0.9298
0.4289	13.0	8645	0.2713	0.9288
0.4003	14.0	9310	0.2617	0.932
0.3227	15.0	9975	0.2567	0.9335
0.386	16.0	10640	0.2571	0.931
0.3688	17.0	11305	0.2576	0.9346
0.3985	18.0	11970	0.2532	0.9356
0.3213	19.0	12635	0.2728	0.9321
0.3046	20.0	13300	0.2702	0.9334
0.3676	21.0	13965	0.2700	0.9319
0.3329	22.0	14630	0.2720	0.9333
0.4089	23.0	15295	0.2764	0.9325
0.3196	24.0	15960	0.2735	0.9305
0.2982	25.0	16625	0.2771	0.9312
0.1884	26.0	17290	0.2943	0.9304
0.3624	27.0	17955	0.2866	0.9316
0.2957	28.0	18620	0.2708	0.932
0.3013	29.0	19285	0.2881	0.932
0.2811	30.0	19950	0.2940	0.9304
0.2031	31.0	20615	0.2802	0.9335
0.3268	32.0	21280	0.2803	0.9312
0.218	33.0	21945	0.2883	0.9307
0.217	34.0	22610	0.2866	0.9356
0.2032	35.0	23275	0.2905	0.9317
0.2539	36.0	23940	0.2818	0.9313
0.2104	37.0	24605	0.2907	0.9329
0.264	38.0	25270	0.3030	0.9298
0.3343	39.0	25935	0.3030	0.9299
0.2252	40.0	26600	0.2960	0.9313
0.2453	41.0	27265	0.2977	0.9302
0.2467	42.0	27930	0.3034	0.9293
0.2208	43.0	28595	0.3022	0.9316
0.1808	44.0	29260	0.3067	0.9304
0.2477	45.0	29925	0.3073	0.9289
0.2059	46.0	30590	0.3010	0.931
0.2156	47.0	31255	0.2920	0.9318
0.2719	48.0	31920	0.3057	0.9311
0.2156	49.0	32585	0.3127	0.9292
0.2562	50.0	33250	0.3115	0.93
0.1847	51.0	33915	0.3058	0.9311
0.2453	52.0	34580	0.3180	0.9308
0.2763	53.0	35245	0.3076	0.932
0.1876	54.0	35910	0.3097	0.9318
0.1774	55.0	36575	0.3105	0.9321
0.2011	56.0	37240	0.3108	0.9337
0.2142	57.0	37905	0.3191	0.9312
0.1931	58.0	38570	0.3219	0.9299
0.2328	59.0	39235	0.3155	0.9316
0.145	60.0	39900	0.3216	0.9295
0.2804	61.0	40565	0.3253	0.9298
0.1696	62.0	41230	0.3086	0.9315
0.2194	63.0	41895	0.3170	0.9313
0.2297	64.0	42560	0.3231	0.9293
0.2108	65.0	43225	0.3161	0.9313
0.1696	66.0	43890	0.3269	0.929
0.1946	67.0	44555	0.3307	0.9302
0.1492	68.0	45220	0.3248	0.9296
0.223	69.0	45885	0.3316	0.9293
0.1738	70.0	46550	0.3248	0.9295
0.2251	71.0	47215	0.3297	0.9305
0.1518	72.0	47880	0.3322	0.9311
0.1914	73.0	48545	0.3263	0.931
0.2097	74.0	49210	0.3367	0.9294
0.1423	75.0	49875	0.3286	0.9299
0.1953	76.0	50540	0.3337	0.9307
0.1599	77.0	51205	0.3295	0.9313
0.2077	78.0	51870	0.3285	0.9312
0.2053	79.0	52535	0.3278	0.9309
0.1846	80.0	53200	0.3291	0.9307
0.1909	81.0	53865	0.3417	0.9291
0.1971	82.0	54530	0.3323	0.9289
0.1739	83.0	55195	0.3266	0.9323
0.1537	84.0	55860	0.3313	0.9294
0.1706	85.0	56525	0.3395	0.928
0.199	86.0	57190	0.3344	0.9303
0.2013	87.0	57855	0.3360	0.9294
0.1495	88.0	58520	0.3371	0.9307
0.1042	89.0	59185	0.3302	0.9316
0.1681	90.0	59850	0.3304	0.9295
0.1802	91.0	60515	0.3351	0.9298
0.268	92.0	61180	0.3332	0.9305
0.1807	93.0	61845	0.3300	0.9307
0.1855	94.0	62510	0.3315	0.9303
0.1747	95.0	63175	0.3324	0.9295
0.1783	96.0	63840	0.3313	0.9315
0.1256	97.0	64505	0.3327	0.9308
0.0984	98.0	65170	0.3291	0.9317
0.1525	99.0	65835	0.3307	0.9311
0.1471	100.0	66500	0.3301	0.9309
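
The headline numbers at the top of this card match the epoch-100 row, but the lowest validation loss in the table (0.2532) occurs at epoch 18, where accuracy is also at its maximum of 0.9356 (tied with epoch 34). The sketch below shows one way to pick such a row programmatically. It uses a handful of rows copied from the table; `EvalRow`, `parse_rows`, and the two selection helpers are hypothetical names, and the card does not state that any checkpoint other than the final one was saved.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# A few rows copied from the table above (whitespace-separated).
SAMPLE_ROWS = """
1.2884 1.0 665 0.8752 0.8834
0.4438 11.0 7315 0.2554 0.9316
0.3985 18.0 11970 0.2532 0.9356
0.217 34.0 22610 0.2866 0.9356
0.1471 100.0 66500 0.3301 0.9309
"""


def parse_rows(text):
    """Parse whitespace-separated table rows into EvalRow records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(EvalRow(float(train_loss), float(epoch), int(step),
                            float(val_loss), float(accuracy)))
    return rows


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_accuracy(rows):
    """Row with the highest accuracy; ties go to the lower validation loss."""
    return max(rows, key=lambda r: (r.accuracy, -r.val_loss))


if __name__ == "__main__":
    rows = parse_rows(SAMPLE_ROWS)
    print("lowest validation loss:", best_by_val_loss(rows))
    print("highest accuracy:", best_by_accuracy(rows))
```

Pasting the full table into `SAMPLE_ROWS` works the same way, since `str.split()` handles tabs as well as spaces.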

Framework versions

Transformers 4.39.3
PyTorch 2.2.2+cu118
Datasets 2.18.0
Tokenizers 0.15.2
