language:
  - zh
license: apache-2.0
tags:
  - whisper-event
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_11_0
model-index:
  - name: Whisper Small zh-HK - Alvin
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_11_0 zh-HK
          type: mozilla-foundation/common_voice_11_0
          config: zh-HK
          split: test
          args: zh-HK
        metrics:
          - name: Cer
            type: cer
            value: 11.76

Whisper Small zh-HK - Alvin

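This model is a fine-tuned version of openai/whisper-small for Cantonese (zh-HK) speech recognition. It is evaluated on the Common Voice 11.0 zh-HK test split; the training sets are listed under "Training and evaluation data".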

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

For training, three datasets were used:

  • Common Voice 11.0 Cantonese (zh-HK) train set
  • CantoMap: Winterstein, Grégoire, Tang, Carmen and Lai, Regine (2020) "CantoMap: a Hong Kong Cantonese MapTask Corpus", in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille: European Language Resources Association, pp. 2899-2906.
  • Cantonese-ASR: Yu, Tiezheng, Frieske, Rita, Xu, Peng, Cahyawijaya, Samuel, Yiu, Cheuk Tung, Lovenia, Holy, Dai, Wenliang, Barezi, Elham, Chen, Qifeng, Ma, Xiaojuan, Shi, Bertram, Fung, Pascale (2022) "Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset". Link: https://arxiv.org/pdf/2201.02419.pdf

Training procedure

Training Hyperparameters

  • learning_rate: 1e-5
  • train_batch_size: 16 (on 2 GPUs)
  • eval_batch_size: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16x2x2=64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
  • mixed_precision_training: Native AMP
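The hyperparameters above can be sanity-checked with a little arithmetic. The sketch below recomputes the effective batch size and evaluates a linear learning-rate schedule with warmup. It assumes the schedule has the usual shape of linear warmup followed by linear decay to zero (the form implemented by `get_linear_schedule_with_warmup` in Transformers); the function name `linear_lr` is made up for this illustration and is not taken from the training script.

```python
# Values taken from the hyperparameter list above.
PER_DEVICE_BATCH = 16
NUM_GPUS = 2
GRAD_ACCUM_STEPS = 2
PEAK_LR = 1e-5
WARMUP_STEPS = 500
TOTAL_STEPS = 5000

# Effective batch size per optimizer step: 16 x 2 x 2 = 64.
effective_batch = PER_DEVICE_BATCH * NUM_GPUS * GRAD_ACCUM_STEPS


def linear_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to peak_lr over warmup_steps,
    then linear decay from peak_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    for step in (0, 250, 500, 1000, 2750, 5000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 500 and has already decayed to about 8.9e-6 by the first evaluation at step 1000.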

Training Results

| Training Loss | Epoch | Step | Validation Loss | CER |
|---|---|---|---|---|
| 0.1106 | 0.66 | 1000 | 0.3294 | 14.638 |
| 0.0546 | 1.33 | 2000 | 0.2887 | 12.119 |
| 0.0293 | 2.01 | 3000 | 0.2727 | 11.646 |
| 0.0214 | 2.66 | 4000 | 0.2741 | 11.760 |
| xx | xx | 5000 | xx | xx |
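The Cer column is the character error rate, which suits Cantonese because the text is not split into words by spaces. As a rough illustration, CER can be computed as the character-level edit distance between the reference and the predicted transcript, divided by the reference length. The sketch below is a minimal pure-Python version that assumes the value is reported as a percentage; it is not the evaluation code used for this model, and the example sentences are invented.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: strings of characters)."""
    prev = list(range(len(hyp) + 1))  # distances from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate in percent (assumed reporting convention)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One inserted character against a 5-character reference.
    print(cer("今日天氣好", "今日天氣很好"))
```

Because insertions count as errors, CER can exceed 100 when the hypothesis is much longer than the reference.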

Framework versions