ales/wav2vec2-cv-be

ales committed on Apr 13, 2022

Commit be307cc · 1 Parent(s): b530a92

Update README.md

Files changed (1)

README.md +52 -0

@@ -1,3 +1,55 @@
 ---
 license: gpl-3.0
+language:
+- be
+tags:
+- audio
+- speech
+- automatic-speech-recognition
+datasets:
+- mozilla-foundation/common_voice_8_0
+metrics:
+- wer
+model-index:
+- name: wav2vec2
+  results:
+  - task:
+      name: Automatic Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Common Voice 8
+      type: mozilla-foundation/common_voice_8_0
+      args: be
+    metrics:
+    - name: Dev WER
+      type: wer
+      value: 17.61
+    - name: Test WER
+      type: wer
+      value: 18.7
+    - name: Dev WER (with LM)
+      type: wer
+      value: 11.5
+    - name: Test WER (with LM)
+      type: wer
+      value: 12.4
 ---
+# Automatic Speech Recognition for the Belarusian language
+Fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base) on the `mozilla-foundation/common_voice_8_0 be` dataset.
+The `Train`, `Dev`, and `Test` splits were used as they are present in the dataset. No additional data from the `Validated` split was used:
+only 1 voicing of each sentence was used, the way the data was split by CommonVoice.
+To build a better model, one can use additional voicings from the `Validated` split for sentences already present in the `Train`, `Dev`, and `Test` splits,
+i.e. enlarge the mentioned splits.
+The language model was built using [KenLM](https://kheafield.com/code/kenlm/estimation/).
+The 5-gram language model was built on sentences from the `Train` + (`Other` - `Dev` - `Test`) splits of the `mozilla-foundation/common_voice_8_0 be` dataset.
+Source code is available [here](https://github.com/yks72p/stt_be).
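
The corpus formula `Train` + (`Other` - `Dev` - `Test`) in the README can be read as set arithmetic over sentences: take everything in `Train`, add the `Other` sentences, but drop any sentence that also appears in `Dev` or `Test` so the language model never sees evaluation text. The sketch below illustrates that reading only. The function name `build_lm_corpus`, the exact-string matching, and the removal of duplicate sentences are assumptions made for illustration, not details confirmed by the README or the linked repository.

```python
from typing import Iterable, List


def build_lm_corpus(
    train: Iterable[str],
    other: Iterable[str],
    dev: Iterable[str],
    test: Iterable[str],
) -> List[str]:
    """Return sentences for LM training: Train + (Other - Dev - Test).

    Hypothetical helper: sentences are compared as exact strings and
    duplicates are dropped (both are assumptions of this sketch).
    """
    # Sentences that must not leak into the language model.
    held_out = set(dev) | set(test)

    # Train is kept as-is; Other is filtered against Dev and Test.
    candidates = list(train) + [s for s in other if s not in held_out]

    # Drop repeated sentences while preserving first-seen order.
    seen = set()
    corpus = []
    for sentence in candidates:
        if sentence not in seen:
            seen.add(sentence)
            corpus.append(sentence)
    return corpus


if __name__ == "__main__":
    # Toy data standing in for the Common Voice splits.
    train = ["добры дзень", "як справы"]
    other = ["як справы", "дзякуй вялікі", "да пабачэння", "добрай раніцы"]
    dev = ["дзякуй вялікі"]
    test = ["да пабачэння"]

    for line in build_lm_corpus(train, other, dev, test):
        print(line)
```

Written out one sentence per line, a corpus like this could then be passed to KenLM's `lmplz` estimator with `-o 5` to request a 5-gram model; any text normalization applied before that step is not described in the README.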