hoa-1b4_model_nmt_test

This model is a fine-tuned version of vlsp-2023-vllm/hoa-1b4. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

Loss: 0.0045

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
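
To make the schedule concrete, the following is a minimal sketch of how a linear learning-rate schedule decays over this run. It uses the learning rate and epoch count listed above and the final step count (2100) reported under Training results, which works out to 21 optimizer steps per epoch. The card lists no warmup, so the sketch assumes zero warmup steps. The function name `linear_lr` is hypothetical and is not taken from the training script.

```python
BASE_LR = 4e-05      # learning_rate from the card
NUM_EPOCHS = 100     # num_epochs from the card
TOTAL_STEPS = 2100   # final step reported in the training results
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS  # 21 optimizer steps per epoch


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few points in the run.
for epoch in (0, 25, 50, 75, 100):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:3d}  step {step:4d}  lr {linear_lr(step):.2e}")
```

With train_batch_size 1, 21 steps per epoch would correspond to 21 training examples per epoch, but only if training ran on a single device without gradient accumulation. The card confirms neither.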

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	21	2.8255
No log	2.0	42	2.3028
No log	3.0	63	1.8727
No log	4.0	84	1.5161
No log	5.0	105	1.2181
No log	6.0	126	0.9991
No log	7.0	147	0.7980
No log	8.0	168	0.6372
No log	9.0	189	0.5075
No log	10.0	210	0.4042
No log	11.0	231	0.3321
No log	12.0	252	0.2716
No log	13.0	273	0.2143
No log	14.0	294	0.1740
No log	15.0	315	0.1397
No log	16.0	336	0.1263
No log	17.0	357	0.0990
No log	18.0	378	0.0853
No log	19.0	399	0.0678
No log	20.0	420	0.0546
No log	21.0	441	0.0476
No log	22.0	462	0.0441
No log	23.0	483	0.0367
0.7202	24.0	504	0.0292
0.7202	25.0	525	0.0241
0.7202	26.0	546	0.0227
0.7202	27.0	567	0.0207
0.7202	28.0	588	0.0186
0.7202	29.0	609	0.0168
0.7202	30.0	630	0.0139
0.7202	31.0	651	0.0126
0.7202	32.0	672	0.0113
0.7202	33.0	693	0.0113
0.7202	34.0	714	0.0107
0.7202	35.0	735	0.0099
0.7202	36.0	756	0.0087
0.7202	37.0	777	0.0085
0.7202	38.0	798	0.0080
0.7202	39.0	819	0.0077
0.7202	40.0	840	0.0072
0.7202	41.0	861	0.0071
0.7202	42.0	882	0.0070
0.7202	43.0	903	0.0068
0.7202	44.0	924	0.0064
0.7202	45.0	945	0.0063
0.7202	46.0	966	0.0061
0.7202	47.0	987	0.0061
0.0146	48.0	1008	0.0060
0.0146	49.0	1029	0.0058
0.0146	50.0	1050	0.0059
0.0146	51.0	1071	0.0067
0.0146	52.0	1092	0.0056
0.0146	53.0	1113	0.0055
0.0146	54.0	1134	0.0055
0.0146	55.0	1155	0.0053
0.0146	56.0	1176	0.0055
0.0146	57.0	1197	0.0055
0.0146	58.0	1218	0.0057
0.0146	59.0	1239	0.0053
0.0146	60.0	1260	0.0052
0.0146	61.0	1281	0.0052
0.0146	62.0	1302	0.0051
0.0146	63.0	1323	0.0050
0.0146	64.0	1344	0.0049
0.0146	65.0	1365	0.0050
0.0146	66.0	1386	0.0049
0.0146	67.0	1407	0.0049
0.0146	68.0	1428	0.0050
0.0146	69.0	1449	0.0049
0.0146	70.0	1470	0.0049
0.0146	71.0	1491	0.0048
0.0064	72.0	1512	0.0048
0.0064	73.0	1533	0.0047
0.0064	74.0	1554	0.0048
0.0064	75.0	1575	0.0048
0.0064	76.0	1596	0.0047
0.0064	77.0	1617	0.0047
0.0064	78.0	1638	0.0047
0.0064	79.0	1659	0.0047
0.0064	80.0	1680	0.0048
0.0064	81.0	1701	0.0046
0.0064	82.0	1722	0.0046
0.0064	83.0	1743	0.0046
0.0064	84.0	1764	0.0046
0.0064	85.0	1785	0.0046
0.0064	86.0	1806	0.0046
0.0064	87.0	1827	0.0046
0.0064	88.0	1848	0.0046
0.0064	89.0	1869	0.0046
0.0064	90.0	1890	0.0046
0.0064	91.0	1911	0.0045
0.0064	92.0	1932	0.0045
0.0064	93.0	1953	0.0045
0.0064	94.0	1974	0.0045
0.0064	95.0	1995	0.0045
0.0052	96.0	2016	0.0045
0.0052	97.0	2037	0.0045
0.0052	98.0	2058	0.0045
0.0052	99.0	2079	0.0045
0.0052	100.0	2100	0.0045
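
The table above can also be inspected programmatically. The sketch below parses a handful of rows copied from it (tab-separated, with "No log" marking steps before the first training-loss report) and finds the first sampled epoch whose validation loss falls below a threshold. The helpers `parse_log` and `first_epoch_below` are hypothetical names introduced for illustration. The sample is deliberately a subset, so the result describes the sample, not the full table.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    train_loss: Optional[float]  # None where the card shows "No log"
    epoch: float
    step: int
    val_loss: float


# A few rows copied from the training results table (tab-separated).
SAMPLE = """No log\t1.0\t21\t2.8255
No log\t10.0\t210\t0.4042
0.7202\t24.0\t504\t0.0292
0.0146\t48.0\t1008\t0.0060
0.0064\t72.0\t1512\t0.0048
0.0052\t100.0\t2100\t0.0045"""


def parse_log(text: str) -> List[Row]:
    """Turn tab-separated log lines into typed rows."""
    rows = []
    for line in text.strip().splitlines():
        train, epoch, step, val = line.split("\t")
        train_loss = None if train == "No log" else float(train)
        rows.append(Row(train_loss, float(epoch), int(step), float(val)))
    return rows


def first_epoch_below(rows: List[Row], threshold: float) -> Optional[float]:
    """Return the first epoch whose validation loss is below threshold."""
    for row in rows:
        if row.val_loss < threshold:
            return row.epoch
    return None


rows = parse_log(SAMPLE)
print(first_epoch_below(rows, 0.01))
```

The training-loss column changes only at steps 504, 1008, 1512, and 2016. That pattern is consistent with training loss being logged every 500 steps and evaluation running once per epoch, though the card does not state the logging interval.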

Framework versions

PEFT 0.8.2
Transformers 4.37.2
Pytorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.15.2
