metadata

license: cc-by-nc-4.0
base_model: nguyenvulebinh/wav2vec2-base-vietnamese-250h
tags:
  - generated_from_trainer
datasets:
  - common_voice_11_0
metrics:
  - wer
model-index:
  - name: model_weight_with_token_110
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_11_0
          type: common_voice_11_0
          config: vi
          split: None
          args: vi
        metrics:
          - name: Wer
            type: wer
            value: 0.17328485312410297

model_weight_with_token_110

This model is a fine-tuned version of nguyenvulebinh/wav2vec2-base-vietnamese-250h on the Vietnamese (vi) configuration of the common_voice_11_0 dataset. At the final logged checkpoint (step 14000), it achieves the following results on the evaluation set:

Loss: 0.0688
Wer: 0.1733
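
Wer here is the word error rate: the word-level edit distance between the model's transcripts and the reference transcripts, divided by the number of reference words. The sketch below illustrates the metric itself and is not the evaluation code used for this model. The function names and the two sample sentences are made up for illustration, and the sketch applies no text normalization.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word errors divided by total reference words."""
    errors = sum(
        edit_distance(ref.split(), hyp.split())
        for ref, hyp in zip(references, hypotheses)
    )
    total_words = sum(len(ref.split()) for ref in references)
    return errors / total_words


# Illustrative sentences only (not taken from the evaluation set).
references = ["xin chào các bạn", "hôm nay trời đẹp"]
hypotheses = ["xin chào bạn", "hôm nay trời đẹp"]
print(wer(references, hypotheses))  # one deleted word out of eight -> 0.125
```

On this scale, a Wer of 0.1733 corresponds to roughly 17 word errors per 100 reference words.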

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 40
mixed_precision_training: Native AMP
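
To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup, using the learning_rate and lr_scheduler_warmup_steps values listed above. The total of 14360 steps is an estimate (about 359 steps per epoch, inferred from the step and epoch columns of the training results in this card, times 40 epochs) and is not stated anywhere in the card. The function name linear_lr is hypothetical.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=14360):
    """Learning rate at a given optimizer step under linear warmup and decay.

    total_steps is an assumed value (about 359 steps per epoch * 40 epochs).
    """
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 500, 1000, 7000, 14000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 0.0001 at step 1000 and has decayed close to zero by the last logged evaluation at step 14000.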

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.5366	1.3928	500	0.1234	0.2107
0.4976	2.7855	1000	0.1343	0.2133
0.4734	4.1783	1500	0.1109	0.2037
0.4449	5.5710	2000	0.1111	0.2061
0.4194	6.9638	2500	0.1096	0.2024
0.3941	8.3565	3000	0.1231	0.1969
0.3767	9.7493	3500	0.1059	0.2002
0.3853	11.1421	4000	0.0998	0.1930
0.3584	12.5348	4500	0.0892	0.1905
0.3291	13.9276	5000	0.0926	0.1899
0.3279	15.3203	5500	0.0879	0.1878
0.3014	16.7131	6000	0.0831	0.1851
0.2886	18.1058	6500	0.0814	0.1857
0.2949	19.4986	7000	0.0880	0.1854
0.2661	20.8914	7500	0.0782	0.1829
0.2676	22.2841	8000	0.0789	0.1806
0.2663	23.6769	8500	0.0787	0.1805
0.2461	25.0696	9000	0.0788	0.1793
0.2484	26.4624	9500	0.0755	0.1804
0.2452	27.8552	10000	0.0715	0.1773
0.2261	29.2479	10500	0.0705	0.1764
0.2311	30.6407	11000	0.0757	0.1770
0.2195	32.0334	11500	0.0714	0.1763
0.2208	33.4262	12000	0.0697	0.1752
0.2029	34.8189	12500	0.0673	0.1744
0.2228	36.2117	13000	0.0691	0.1739
0.2056	37.6045	13500	0.0678	0.1738
0.2017	38.9972	14000	0.0688	0.1733
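
The log above can be inspected programmatically, for example to compare checkpoints by validation loss and by Wer. The sketch below parses a subset of the rows (the first evaluation and the last six, copied from the table) and is only an illustration. The names Checkpoint, parse_rows and LOG_SUBSET are hypothetical.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Subset of the table above: first evaluation plus the last six.
LOG_SUBSET = """
0.5366 1.3928 500 0.1234 0.2107
0.2195 32.0334 11500 0.0714 0.1763
0.2208 33.4262 12000 0.0697 0.1752
0.2029 34.8189 12500 0.0673 0.1744
0.2228 36.2117 13000 0.0691 0.1739
0.2056 37.6045 13500 0.0678 0.1738
0.2017 38.9972 14000 0.0688 0.1733
"""


def parse_rows(text):
    """Parse whitespace-separated log rows into Checkpoint records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(Checkpoint(float(train_loss), float(epoch), int(step),
                               float(val_loss), float(wer)))
    return rows


checkpoints = parse_rows(LOG_SUBSET)
best_by_wer = min(checkpoints, key=lambda c: c.wer)
best_by_loss = min(checkpoints, key=lambda c: c.val_loss)

# Relative Wer reduction between the first and the last logged evaluation.
first, last = checkpoints[0], checkpoints[-1]
relative_reduction = (first.wer - last.wer) / first.wer

print("lowest Wer at step", best_by_wer.step)
print("lowest validation loss at step", best_by_loss.step)
print("relative Wer reduction:", round(relative_reduction, 4))
```

In the full table the lowest Wer (0.1733) is at step 14000, while the lowest validation loss (0.0673) is at step 12500, so the two criteria pick different checkpoints.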

Framework versions

Transformers 4.40.2
PyTorch 2.2.1+cu121
Datasets 2.19.1
Tokenizers 0.19.1
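
When reproducing results, it can help to compare the local environment against the versions listed above. This is a small sketch using only the standard library. The EXPECTED mapping and the compare_versions helper are illustrative names, and "torch" is used as the installed distribution name for PyTorch.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by pip distribution name.
EXPECTED = {
    "transformers": "4.40.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only flags differences from the training environment.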