ales committed on
Commit be307cc · 1 Parent(s): b530a92

Update README.md

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -1,3 +1,55 @@
  ---
  license: gpl-3.0
+
+ language:
+ - be
+
+ tags:
+ - audio
+ - speech
+ - automatic-speech-recognition
+
+ datasets:
+ - mozilla-foundation/common_voice_8_0
+
+ metrics:
+ - wer
+
+ model-index:
+ - name: wav2vec2
+   results:
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Common Voice 8
+       type: mozilla-foundation/common_voice_8_0
+       args: be
+     metrics:
+     - name: Dev WER
+       type: wer
+       value: 17.61
+     - name: Test WER
+       type: wer
+       value: 18.7
+     - name: Dev WER (with LM)
+       type: wer
+       value: 11.5
+     - name: Test WER (with LM)
+       type: wer
+       value: 12.4
  ---
+
+ # Automatic Speech Recognition for the Belarusian language
+
+ A fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base), trained on the `mozilla-foundation/common_voice_8_0 be` dataset.
+
+ The `Train`, `Dev`, and `Test` splits were used exactly as they are provided in the dataset. No additional data from the `Validated` split was used:
+ only one recording (voicing) of each sentence was used, which is how Common Voice splits the data.
+ To build a better model, one could add further voicings from the `Validated` split for sentences already present in the `Train`, `Dev`, and `Test` splits,
+ i.e. enlarge those splits.
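The enlargement idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for this model: the function name `enlarge_splits` and the `(clip_path, sentence)` row layout are hypothetical, and each sentence is assumed to belong to exactly one split.

```python
def enlarge_splits(splits, validated):
    """Add extra validated clips to the split that already holds their sentence.

    splits:    dict of split name -> list of (clip_path, sentence) pairs (assumed layout)
    validated: list of (clip_path, sentence) pairs from the `Validated` split
    """
    sentence_to_split = {}  # sentence -> name of the split that contains it
    used_clips = set()      # clips that are already part of some split
    for name, rows in splits.items():
        for clip, sentence in rows:
            sentence_to_split[sentence] = name
            used_clips.add(clip)

    # Copy the original rows so the input is left untouched.
    enlarged = {name: list(rows) for name, rows in splits.items()}
    for clip, sentence in validated:
        if clip in used_clips:
            continue  # this recording is already in a split
        name = sentence_to_split.get(sentence)
        if name is not None:
            # Same sentence, new voicing: it goes to the same split,
            # so no sentence ends up in more than one split.
            enlarged[name].append((clip, sentence))
            used_clips.add(clip)
    return enlarged
```

Validated clips whose sentence is in none of the three splits are skipped, which keeps the sketch within the scope of the suggestion above.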
+
+ The language model was built using [KenLM](https://kheafield.com/code/kenlm/estimation/).
+ A 5-gram language model was built on sentences from the `Train` + (`Other` - `Dev` - `Test`) splits of the `mozilla-foundation/common_voice_8_0 be` dataset.
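One possible reading of the split expression above is a set operation on sentences: take everything in `Train`, then add those `Other` sentences that appear in neither `Dev` nor `Test`. The sketch below follows that reading. The function name `build_lm_corpus`, the deduplication, and the sorted output are assumptions made for illustration, not details confirmed by the source.

```python
def build_lm_corpus(train, other, dev, test):
    """Return the sentences for LM training: Train + (Other - Dev - Test).

    Each argument is an iterable of sentence strings (assumed input format).
    """
    held_out = set(dev) | set(test)          # sentences the LM must not see
    corpus = set(train) | (set(other) - held_out)
    return sorted(corpus)                    # sorted only to make the output deterministic


def write_corpus(sentences, path):
    """Write one sentence per line, the plain-text input format KenLM reads."""
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            f.write(sentence + "\n")

# A 5-gram model could then be estimated with KenLM's lmplz tool, e.g.:
#   lmplz -o 5 < corpus.txt > lm.arpa
# (the file names here are placeholders)
```

Excluding `Dev` and `Test` sentences keeps the reported "with LM" scores from being inflated by a language model that has already seen the evaluation text.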
+
+ The source code is available [here](https://github.com/yks72p/stt_be).
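For readers unfamiliar with the WER figures in the metadata, word error rate is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation code used for this model, and the function name `wer` is chosen here for illustration.

```python
def wer(reference, hypothesis):
    """Word error rate for one reference/hypothesis pair, as a fraction."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

The values in the metadata (for example 18.7) appear to be percentages, so a fraction returned by this function would be multiplied by 100 to compare with them.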