---
library_name: transformers
license: mit
datasets:
- AlienKevin/mixed_cantonese_and_english_speech
- mozilla-foundation/common_voice_17_0
metrics:
- cer
base_model:
- openai/whisper-small
---

CER: 15.4% <br>
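
For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate (CER) is usually computed: the character-level edit distance between hypothesis and reference, divided by the reference length. This card does not say which implementation produced the figure above, so the function names `levenshtein` and `cer` are illustrative assumptions, not the evaluation code used here.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """CER for a single utterance: edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of four.
    print(cer("你好世界", "你好世间"))
```

Corpus-level scores are typically obtained by summing edit distances over all utterances and dividing by the total number of reference characters, rather than by averaging per-utterance values.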

transformers version: 4.46.3

Train Args: <br>
per_device_train_batch_size=16, <br>
gradient_accumulation_steps=1,  <br>
learning_rate=1e-5, <br>
gradient_checkpointing=True, <br>
per_device_eval_batch_size=64, <br>
generation_max_length=225, <br>
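
As a rough sketch, the arguments above could be passed to `Seq2SeqTrainingArguments` as follows. That this class was used is an assumption (suggested by `generation_max_length`), `output_dir` is a hypothetical placeholder, and any argument not listed above is left at its library default. The effective batch size calculation assumes all four GPUs listed under Hardware were used for data-parallel training.

```python
# Values copied from the Train Args list in this card.
TRAIN_ARGS = {
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "learning_rate": 1e-5,
    "gradient_checkpointing": True,
    "per_device_eval_batch_size": 64,
    "generation_max_length": 225,
}

# Assumption: all four V100s took part in data-parallel training.
NUM_GPUS = 4


def effective_batch_size(args: dict, num_devices: int) -> int:
    """Samples consumed per optimizer step across all devices."""
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir: str):
    """Build the arguments object; output_dir is a hypothetical path."""
    # Imported lazily so the rest of this snippet runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAIN_ARGS)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_ARGS, NUM_GPUS))
```

Under that assumption, each optimizer step sees 16 × 1 × 4 = 64 samples.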

Hardware: <br>
4 × NVIDIA Tesla V100 (16 GB each) <br>

FAQ:
1. If you run into a tokenizer issue during inference, upgrade transformers to version 4.46.3 or later:

```bash
pip install --upgrade "transformers>=4.46.3"
```
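
To check from Python whether the installed version already meets the requirement in the FAQ, something like the following sketch may help. The helper names `parse_version` and `transformers_is_recent_enough` are illustrative; the parser only compares the first three numeric components and ignores suffixes such as `.dev0`.

```python
from importlib import metadata

# Minimum version named in the FAQ above.
MIN_VERSION = (4, 46, 3)


def parse_version(text: str) -> tuple:
    """Turn '4.46.3' or '4.47.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.46'
    return tuple(parts)


def transformers_is_recent_enough() -> bool:
    """True if transformers is installed and at least MIN_VERSION."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= MIN_VERSION


if __name__ == "__main__":
    print(transformers_is_recent_enough())
```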