---
language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_duodevicesimus_working_noaudiobooks_7e5_v2
    results: []
---

# scream_duodevicesimus_working_noaudiobooks_7e5_v2

This model is a fine-tuned version of openai/whisper-small on the NbAiLab/ncc_speech dataset.
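
The snippet below is a minimal transcription sketch using the Hugging Face Transformers `automatic-speech-recognition` pipeline. The repository id and the audio file path are assumptions made for illustration (this card does not state the final repository location), so substitute the values that apply to your setup.

```python
# Minimal transcription sketch; repo id and audio path are assumed, not taken from this card.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/scream_duodevicesimus_working_noaudiobooks_7e5_v2",  # assumed repo id
)

# Whisper operates on 30-second windows; chunking lets the pipeline handle longer recordings.
result = asr(
    "sample_norwegian.wav",  # hypothetical 16 kHz mono audio file
    chunk_length_s=30,
    generate_kwargs={"task": "transcribe", "language": "no"},
)
print(result["text"])
```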

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 7e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 32
- total_train_batch_size_per_node: 128
- total_train_batch_size: 1024
- total_optimization_steps: 20,000
- starting_optimization_step: None
- finishing_optimization_step: 20,000
- num_train_dataset_workers: 32
- num_hosts: 8
- total_num_training_examples: 20,480,000
- steps_per_epoch: To be computed after first epoch
- num_beams: 5
- dropout: True
- bpe_dropout_probability: 0.1
- activation_dropout_probability: 0.1
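
The batch-size figures above are mutually consistent; the short sanity check below spells out the arithmetic. The per-host device count (4) is inferred from the listed numbers rather than stated in this card.

```python
# Sanity check of the batch-size arithmetic; devices_per_host is inferred, not stated in the card.
per_device_train_batch_size = 32
total_train_batch_size_per_node = 128
total_train_batch_size = 1024
num_hosts = 8
total_optimization_steps = 20_000

devices_per_host = total_train_batch_size_per_node // per_device_train_batch_size
assert devices_per_host == 4                                                   # 128 / 32
assert total_train_batch_size == total_train_batch_size_per_node * num_hosts   # 128 * 8 = 1024
assert total_train_batch_size * total_optimization_steps == 20_480_000         # matches total_num_training_examples
```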

### Training results

| step | validation_fleurs_loss | train_loss | validation_fleurs_wer | validation_fleurs_cer | validation_fleurs_exact_wer | validation_fleurs_exact_cer | validation_stortinget_loss | validation_stortinget_wer | validation_stortinget_cer | validation_stortinget_exact_wer | validation_stortinget_exact_cer | validation_nrk_tv_loss | validation_nrk_tv_wer | validation_nrk_tv_cer | validation_nrk_tv_exact_wer | validation_nrk_tv_exact_cer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.3211 | 3.0189 | 110.1725 | 80.3659 | 196.8041 | 131.4230 | 1.5012 | 76.6096 | 51.2561 | 82.1890 | 54.4126 | 1.8187 | 259.8656 | 217.2117 | 269.5665 | 222.7746 |
| 1000 | 0.6977 | 1.1353 | 13.4444 | 4.3105 | 17.5926 | 5.2863 | 0.4717 | 21.7105 | 13.9604 | 25.3783 | 14.6687 | 0.9934 | 86.4845 | 70.4142 | 93.7677 | 73.6462 |
| 2000 | 0.3926 | 0.8912 | 10.5889 | 3.7088 | 14.7849 | 4.6968 | 0.3930 | 18.7212 | 12.5960 | 22.2213 | 13.2354 | 0.8926 | 49.9691 | 39.8385 | 57.6635 | 41.2514 |
| 3000 | 0.3620 | 0.8106 | 10.7674 | 4.3007 | 15.0836 | 5.2573 | 0.3632 | 17.5019 | 11.9674 | 21.0430 | 12.5977 | 0.8606 | 44.9157 | 34.5510 | 52.6419 | 35.8510 |
| 4000 | 0.3363 | 0.8043 | 10.3807 | 3.8518 | 14.0980 | 4.7886 | 0.3443 | 16.1694 | 11.2786 | 19.6917 | 11.8983 | 0.8431 | 44.9487 | 34.0425 | 52.5379 | 35.4061 |
| 5000 | 0.3060 | 0.7682 | 9.6074 | 3.6694 | 13.8590 | 4.5808 | 0.3329 | 16.0903 | 11.1667 | 19.5724 | 11.7732 | 0.8154 | 45.4598 | 35.0224 | 52.7292 | 36.3997 |
| 6000 | 0.3477 | 0.7510 | 9.2207 | 3.5510 | 13.3214 | 4.5083 | 0.3246 | 15.9711 | 11.2829 | 19.4232 | 11.8775 | 0.8097 | 43.0897 | 33.1321 | 50.5325 | 34.3331 |
| 7000 | 0.3152 | 0.7608 | 9.6074 | 4.1034 | 13.7395 | 5.0834 | 0.3217 | 15.1188 | 10.6651 | 18.5510 | 11.2540 | 0.7959 | 42.0139 | 32.2852 | 49.4716 | 33.4915 |
| 8000 | 0.3232 | 0.7680 | 9.8453 | 3.9258 | 13.7993 | 4.8128 | 0.3161 | 15.1877 | 10.7202 | 18.5356 | 11.2959 | 0.7938 | 42.1829 | 32.6832 | 49.6256 | 34.2256 |
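
The WER and CER columns report word and character error rates, apparently as percentages; the "exact" variants presumably apply less text normalization, though the card does not specify. Below is a minimal sketch of how such scores can be computed with the Hugging Face `evaluate` library; the transcripts are placeholder strings, not data from the validation sets.

```python
# Minimal WER/CER computation sketch using the `evaluate` library (placeholder transcripts).
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["dette er en test"]    # placeholder reference transcript
predictions = ["dette er en tekst"]  # placeholder model output

print("WER (%):", 100 * wer_metric.compute(predictions=predictions, references=references))
print("CER (%):", 100 * cer_metric.compute(predictions=predictions, references=references))
```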

### Framework versions

- Transformers 4.31.0.dev0
- Datasets 2.13.0
- Tokenizers 0.13.3