Whisper-small-ru-v2
This model is a fine-tuned version of openai/whisper-small on the Russian part of the Common Voice 15 dataset. It achieves the following results on the evaluation set:
- Loss: 0.1329
- Wer: 12.6750
- Cer: 3.7305
- Learning Rate: 0.0000
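For readers unfamiliar with the metrics above, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) can be computed as percentages from an edit distance. It is not the evaluation code used for this model; the function names and the sample sentences are illustrative assumptions, and no text normalization is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for a single utterance."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent for a single utterance (spaces count)."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example pair: one wrong word out of three.
reference = "привет как дела"
hypothesis = "привет как дила"
print(f"WER: {wer(reference, hypothesis):.2f}%")
print(f"CER: {cer(reference, hypothesis):.2f}%")
```

Corpus-level scores are usually obtained by summing errors and reference lengths over all utterances rather than averaging per-utterance values, so this single-utterance sketch only illustrates the definition.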
Model description
Same as openai/whisper-small.
Intended uses & limitations
Same as openai/whisper-small.
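As an illustration of the intended use, here is a minimal sketch of transcribing a Russian audio file with the 🤗 Transformers `pipeline` API. It assumes that `transformers`, a PyTorch backend, and ffmpeg (for decoding audio files) are installed; the helper name `transcribe` and the file name used below are hypothetical.

```python
MODEL_ID = "artyomboyko/whisper-small-ru-v2"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcription of one audio file (a sketch, not tuned for speed)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Forcing language and task is an assumption; Whisper can also auto-detect.
    result = asr(
        audio_path,
        generate_kwargs={"language": "russian", "task": "transcribe"},
    )
    return result["text"]
```

A call such as `transcribe("sample.wav")` would download the checkpoint on first use and return the recognized text as a string.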
Training and evaluation data
Fine-tuned on the Russian part of the Common Voice 15 dataset.
Training procedure
Training follows the article "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers".
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-08
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- training_steps: 15000
- mixed_precision_training: Native AMP
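To make the hyperparameters above concrete, the sketch below reproduces the arithmetic behind the total batch size and a linear schedule with warmup. It assumes a single training device and the usual shape of such a schedule (linear ramp from zero to the peak rate, then linear decay to zero); the function name is illustrative and not taken from the training script.

```python
import math

TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2

# Total train batch size, assuming one device: 16 * 2 = 32.
effective_batch = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(step, base_lr=5e-08, warmup_steps=250, total_steps=15000):
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 125, 250, 7500, 15000):
    lr = linear_schedule_lr(step)
    # The rate never exceeds 5e-08, so four decimal places always show 0.0000.
    print(step, f"{lr:.3e}", f"{lr:.4f}")
```

Because the peak rate is 5e-08, any value formatted to four decimal places prints as 0.0000, which matches the way the learning rate appears in the reported results.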
Training results
Training Loss | Epoch | Step | Validation Loss | Wer | Cer | Learning Rate |
---|---|---|---|---|---|---|
0.0661 | 0.09 | 500 | 0.1358 | 12.9097 | 3.8217 | 0.0000 |
0.0616 | 0.17 | 1000 | 0.1357 | 12.9620 | 3.8949 | 0.0000 |
0.0601 | 0.26 | 1500 | 0.1357 | 12.8795 | 3.8225 | 0.0000 |
0.0666 | 0.35 | 2000 | 0.1353 | 12.9481 | 3.8871 | 0.0000 |
0.0669 | 0.43 | 2500 | 0.1352 | 12.8284 | 3.8283 | 0.0000 |
0.0665 | 0.52 | 3000 | 0.1351 | 12.8203 | 3.7833 | 0.0000 |
0.0649 | 0.61 | 3500 | 0.1349 | 12.8098 | 3.7824 | 0.0000 |
0.0607 | 0.69 | 4000 | 0.1347 | 12.8110 | 3.8105 | 0.0000 |
0.0636 | 0.78 | 4500 | 0.1345 | 12.7994 | 3.7893 | 0.0000 |
0.063 | 0.87 | 5000 | 0.1342 | 12.8319 | 3.8084 | 0.0000 |
0.0589 | 0.95 | 5500 | 0.1341 | 12.8807 | 3.8551 | 0.0000 |
0.0734 | 1.04 | 6000 | 0.1341 | 12.7691 | 3.7604 | 0.0000 |
0.0577 | 1.13 | 6500 | 0.1340 | 12.7645 | 3.7602 | 0.0000 |
0.052 | 1.21 | 7000 | 0.1340 | 12.7610 | 3.7655 | 0.0000 |
0.0626 | 1.3 | 7500 | 0.1339 | 12.7657 | 3.7593 | 0.0000 |
0.0617 | 1.39 | 8000 | 0.1338 | 12.7912 | 3.8268 | 0.0000 |
0.063 | 1.47 | 8500 | 0.1337 | 12.7343 | 3.7573 | 0.0000 |
0.0668 | 1.56 | 9000 | 0.1336 | 12.7308 | 3.7198 | 0.0000 |
0.0634 | 1.65 | 9500 | 0.1335 | 12.7215 | 3.7400 | 0.0000 |
0.0604 | 1.73 | 10000 | 0.1333 | 12.7192 | 3.7515 | 0.0000 |
0.0707 | 1.82 | 10500 | 0.1333 | 12.7052 | 3.7568 | 0.0000 |
0.0639 | 1.91 | 11000 | 0.1332 | 12.6983 | 3.7617 | 0.0000 |
0.0617 | 1.99 | 11500 | 0.1331 | 12.6936 | 3.7402 | 0.0000 |
0.0601 | 2.08 | 12000 | 0.1330 | 12.6901 | 3.7586 | 0.0000 |
0.0632 | 2.17 | 12500 | 0.1330 | 12.6785 | 3.7279 | 0.0000 |
0.0626 | 2.25 | 13000 | 0.1330 | 12.6808 | 3.7333 | 0.0000 |
0.066 | 2.34 | 13500 | 0.1329 | 12.6704 | 3.7512 | 0.0000 |
0.0674 | 2.42 | 14000 | 0.1329 | 12.6599 | 3.7384 | 0.0000 |
0.0637 | 2.51 | 14500 | 0.1329 | 12.6797 | 3.7428 | 0.0000 |
0.0641 | 2.6 | 15000 | 0.1329 | 12.6750 | 3.7305 | 0.0000 |
Framework versions
- Transformers 4.36.0.dev0
- Pytorch 2.1.1+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
Evaluation results
- Test WER on Common Voice 15 (self-reported): 12.675
- Test CER on Common Voice 15 (self-reported): 3.731