Training information

The model was trained on a reworked dataset.

The following were removed from the dataset:

  • numeric digits, so the model writes out spoken numbers as words
  • special characters, so these are also written out phonetically
  • acronyms
  • uppercase letters

Because of these changes the raw WER got somewhat worse, while the normalized WER improved further. A hyper-normalized WER (where the test data were also corrected according to the rules above) would presumably be even better.
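
To make the distinction concrete, here is a minimal sketch of how a "normalized" WER can be computed with the Hugging Face evaluate library. The normalization rules shown (lowercasing, stripping special characters) only mirror the dataset changes listed above and are assumptions, not the exact normalizer used for the reported scores.

```python
# Minimal sketch: raw vs. normalized WER. Not the exact evaluation pipeline
# used for this model; the normalization rules are illustrative assumptions.
import re
import evaluate  # Hugging Face evaluation library

wer_metric = evaluate.load("wer")

def normalize(text: str) -> str:
    text = text.lower()                   # training data contains no uppercase letters
    text = re.sub(r"[^\w\s]", " ", text)  # drop special characters / punctuation
    # Spoken numbers appear as words in the training data; converting any
    # remaining digits to Hungarian words is omitted here (it would need a
    # separate number-to-words step).
    return re.sub(r"\s+", " ", text).strip()

refs = ["A találkozó tizenöt órakor kezdődik."]
hyps = ["a találkozó tizenöt órakor kezdődik"]

raw_wer = wer_metric.compute(predictions=hyps, references=refs)
norm_wer = wer_metric.compute(
    predictions=[normalize(h) for h in hyps],
    references=[normalize(r) for r in refs],
)
print(f"raw WER: {raw_wer:.4f}, normalized WER: {norm_wer:.4f}")
```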

In this case, training was done with the example script from the Transformers library (https://github.com/huggingface/transformers/tree/main/examples/pytorch/speech-recognition#whisper-model) on a custom 2000-hour dataset, which this time also included the CV17 train+validate splits.
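
For orientation, data of this shape could be assembled with the datasets library roughly as follows. This is only a sketch under assumptions: the Common Voice 17 Hub id, the audiofolder layout and the column names are illustrative, not the exact preparation used for this model.

```python
# Sketch: combine the CV17 Hungarian train+validate splits with a custom
# audio corpus. Identifiers, paths and column names are assumptions.
from datasets import Audio, concatenate_datasets, load_dataset

cv17 = load_dataset(
    "mozilla-foundation/common_voice_17_0", "hu",  # gated dataset, requires accepted terms
    split="train+validation",
)
custom = load_dataset("audiofolder", data_dir="my_hu_corpus", split="train")

# Whisper feature extraction expects 16 kHz audio.
cv17 = cv17.cast_column("audio", Audio(sampling_rate=16_000))
custom = custom.cast_column("audio", Audio(sampling_rate=16_000))

# Keep only the columns the training script needs, then merge.
cv17 = cv17.select_columns(["audio", "sentence"])
custom = custom.rename_column("transcription", "sentence").select_columns(["audio", "sentence"])
train_data = concatenate_datasets([cv17, custom])
```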

whisper-tiny-hu-2

This model is a fine-tuned version of openai/whisper-tiny on the custom ~2000-hour Hungarian dataset described above. It achieves the following results on the evaluation set:

  • Loss: 0.1076
  • Wer: 0.1195

Model description

More information needed

Intended uses & limitations

More information needed
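
As a basic usage sketch (not an official example from the author), the model can be run through the Transformers ASR pipeline; the audio path below is a placeholder.

```python
# Minimal transcription sketch with the Transformers ASR pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="sarpba/whisper-hu-tiny-finetuned-V2",
)
result = asr(
    "sample.wav",                # placeholder audio file
    chunk_length_s=30,           # chunk long recordings
    generate_kwargs={"language": "hungarian", "task": "transcribe"},
)
print(result["text"])
```

Given the training-data changes described above, the output is expected to be lowercase, with numbers and special characters written out as words.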

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto the Trainer configuration follows the list):

  • learning_rate: 7e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • total_eval_batch_size: 64
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 3.0
  • mixed_precision_training: Native AMP
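
This is not the exact command that was run; output_dir is a placeholder, and the per-device values assume the 2-GPU setup listed above.

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# With 2 GPUs and gradient_accumulation_steps=2, the effective train batch
# size is 32 * 2 * 2 = 128, matching total_train_batch_size above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-tiny-hu-2",   # placeholder output directory
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,
    learning_rate=7e-5,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=3.0,
    seed=42,
    fp16=True,                        # "Native AMP" mixed precision
    predict_with_generate=True,       # needed to compute WER during evaluation
)
```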

Training results

Training Loss   Epoch    Step    Validation Loss   WER
0.7141          0.0904   1000    0.3530            0.3369
0.5144          0.1807   2000    0.2570            0.2605
0.4386          0.2711   3000    0.2171            0.2269
0.3989          0.3614   4000    0.1997            0.2098
0.371           0.4518   5000    0.1867            0.1955
0.3478          0.5421   6000    0.1761            0.1844
0.3345          0.6325   7000    0.1674            0.1742
0.3275          0.7228   8000    0.1614            0.1723
0.3116          0.8132   9000    0.1547            0.1643
0.2982          0.9035   10000   0.1510            0.1599
0.2881          0.9939   11000   0.1456            0.1586
0.243           1.0842   12000   0.1433            0.1558
0.2407          1.1746   13000   0.1384            0.1493
0.2393          1.2649   14000   0.1367            0.1491
0.2384          1.3553   15000   0.1339            0.1466
0.2327          1.4456   16000   0.1305            0.1429
0.2275          1.5360   17000   0.1286            0.1422
0.226           1.6263   18000   0.1256            0.1395
0.2175          1.7167   19000   0.1239            0.1362
0.2164          1.8070   20000   0.1224            0.1346
0.2098          1.8974   21000   0.1201            0.1346
0.2062          1.9878   22000   0.1174            0.1338
0.1648          2.0781   23000   0.1179            0.1310
0.1675          2.1684   24000   0.1179            0.1305
0.1634          2.2588   25000   0.1165            0.1272
0.1632          2.3491   26000   0.1143            0.1243
0.1587          2.4395   27000   0.1139            0.1241
0.1581          2.5298   28000   0.1124            0.1239
0.1571          2.6202   29000   0.1114            0.1222
0.1579          2.7105   30000   0.1106            0.1219
0.1503          2.8009   31000   0.1091            0.1225
0.1549          2.8913   32000   0.1080            0.1195
0.152           2.9816   33000   0.1076            0.1191

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0