vumichien/wav2vec2-xls-r-1b-japanese

@@ -1,38 +1,76 @@
 ---
 language:
 - ja
-license: apache-2.0
 tags:
 - automatic-speech-recognition
-- vumichien/common_voice_large_jsut_jsss_css10
-- generated_from_trainer
 model-index:
-- name: wav2vec2-xls-r-1b-ja-dumy8
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# wav2vec2-xls-r-1b-ja-dumy8
-This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the VUMICHIEN/COMMON_VOICE_LARGE_JSUT_JSSS_CSS10 - JA dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.2104
-- Wer: 0.1941
-- Cer: 0.0991
 ## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure

 ---
+license: apache-2.0
 language:
 - ja
 tags:
 - automatic-speech-recognition
+- robust-speech-event
+- common-voice
+- ja
 model-index:
+- name: wav2vec2-xls-r-1b
+  results:
+  - task:
+      name: Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Common Voice 7.0
+      type: mozilla-foundation/common_voice_7_0
+      args: ja
+    metrics:
+       - name: Test WER (with LM)
+         type: wer
+         value: 7.98
+       - name: Test CER (with LM)
+         type: cer
+         value: 3.42
+  - task:
+      name: Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Common Voice 8.0
+      type: mozilla-foundation/common_voice_8_0
+      args: ja
+    metrics:
+       - name: Test WER (with LM)
+         type: wer
+         value: 7.88
+       - name: Test CER (with LM)
+         type: cer
+         value: 3.35
+  - task:
+      name: Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Robust Speech Event - Dev Data
+      type: speech-recognition-community-v2/dev_data
+      args: ja
+    metrics:
+       - name: Test WER (with LM)
+         type: wer
+         value: 28.07
+       - name: Test CER (with LM)
+         type: cer
+         value: 16.27
 ---
 ## Model description
+This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on VUMICHIEN/COMMON_VOICE_LARGE_JSUT_JSSS_CSS10, my collection of public Japanese speech datasets for research.
+### Benchmark WER results (%):
+| | [COMMON VOICE 7.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0) | [COMMON VOICE 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) |
+|---|---|---|
+|without LM| 10.96 | 10.91 |
+|with 4-gram LM| 7.98 | 7.88 |
+### Benchmark CER results (%):
+| | [COMMON VOICE 7.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0) | [COMMON VOICE 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) |
+|---|---|---|
+|without LM| 4.28 | 4.22 |
+|with 4-gram LM| 3.42 | 3.35 |
+## Evaluation
+Please use the eval.py file to run the evaluation:
+```bash
+python eval.py --model_id vumichien/wav2vec2-xls-r-1b-japanese --dataset mozilla-foundation/common_voice_7_0 --config ja --split test --log_outputs
+```
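The WER and CER figures in the benchmark tables are edit-distance ratios: the number of substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The following is a minimal illustrative sketch of that calculation, not the code used by `eval.py`. The function names are hypothetical, and the `wer` helper assumes the text has already been segmented into space-separated words, since Japanese is not written with spaces.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalised by reference length, as a percentage."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference, hypothesis):
    # Characters are the tokens; spaces are dropped (an assumption of this sketch).
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))


def wer(reference, hypothesis):
    # Assumes both strings are already segmented into space-separated words.
    return error_rate(reference.split(), hypothesis.split())


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(wer("今日 は 晴れ です", "今日 は 雨 です"))  # 25.0
    # One wrong character out of seven reference characters.
    print(round(cer("今日は晴れです", "今日は晴れてす"), 2))
```

Because the rate is normalised by the reference length only, a hypothesis with many insertions can score above 100.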
 ## Training procedure