vumichien/wav2vec2-large-xlsr-japanese

@@ -23,7 +23,7 @@ model-index:
     metrics:
        - name: Test WER
          type: wer
-         value: 30.837004
 ---
 # Wav2Vec2-Large-XLSR-53-Japanese
 Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Japanese using [Common Voice](https://huggingface.co/datasets/common_voice) and [JSUT](https://sites.google.com/site/shinnosuketakamichi/publication/jsut), the Japanese speech corpus of Saruwatari-lab, University of Tokyo.
@@ -41,7 +41,7 @@ import re
 # config
 wakati = MeCab.Tagger("-Owakati")
-chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\、\\\\\\\\\\\\\\\\。\\\\\\\\\\\\\\\\．\\\\\\\\\\\\\\\\「\\\\\\\\\\\\\\\\」\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\？\\\\\\\\\\\\\\\\・]'
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test[:2%]")
@@ -81,7 +81,7 @@ import re
 #config
 wakati = MeCab.Tagger("-Owakati")
-chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\、\\\\\\\\\\\\\\\\。\\\\\\\\\\\\\\\\．\\\\\\\\\\\\\\\\「\\\\\\\\\\\\\\\\」\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\？\\\\\\\\\\\\\\\\・]'
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test")
@@ -111,7 +111,7 @@ def evaluate(batch):
 result = test_dataset.map(evaluate, batched=True, batch_size=8)
 print("WER: {:.2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
 ```
-**Test Result**: 30.837%
 ## Training
 The Common Voice `train` and `validation` splits and the `basic5000` subset of the JSUT Japanese speech corpus were used for training.
 The script used for training can be found [here](https://colab.research.google.com/drive/1ZTxoYzgOotUjcyoBf0m8gZj5Kcmu2yGU).

     metrics:
        - name: Test WER
          type: wer
+         value: 31.07
 ---
 # Wav2Vec2-Large-XLSR-53-Japanese
 Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Japanese using [Common Voice](https://huggingface.co/datasets/common_voice) and [JSUT](https://sites.google.com/site/shinnosuketakamichi/publication/jsut), the Japanese speech corpus of Saruwatari-lab, University of Tokyo.
 # config
 wakati = MeCab.Tagger("-Owakati")
+chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\、\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\。\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\．\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\「\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\」\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\？\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\・]'
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test[:2%]")
 #config
 wakati = MeCab.Tagger("-Owakati")
+chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\、\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\。\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\．\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\「\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\」\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\？\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\・]'
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test")
 result = test_dataset.map(evaluate, batched=True, batch_size=8)
 print("WER: {:.2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
 ```
+**Test Result**: 31.07%
 ## Training
 The Common Voice `train` and `validation` splits and the `basic5000` subset of the JSUT Japanese speech corpus were used for training.
 The script used for training can be found [here](https://colab.research.google.com/drive/1ZTxoYzgOotUjcyoBf0m8gZj5Kcmu2yGU).
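
The evaluation code touched by this diff strips punctuation with `chars_to_ignore_regex` before computing WER. The sketch below illustrates that idea using only the standard library. It is not the card's actual pipeline: `CHARS_TO_IGNORE_REGEX` is an assumed simplification of the pattern (the long backslash runs are dropped, so unlike the original it does not also remove literal backslashes), and `normalize` and `word_error_rate` are hypothetical helpers standing in for the card's preprocessing and its `wer.compute` call.

```python
import re

# Assumed simplification of the card's pattern: inside a character class
# these punctuation marks need no escaping, so the backslash runs are omitted.
CHARS_TO_IGNORE_REGEX = r"[,、。．「」…？・]"


def normalize(sentence: str) -> str:
    """Remove the ignored punctuation and trim surrounding whitespace."""
    return re.sub(CHARS_TO_IGNORE_REGEX, "", sentence).strip()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Both inputs are expected to be whitespace-segmented already.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] is the edit distance between the processed prefix of ref
    # and the first j words of hyp.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

Japanese text has no spaces between words, so a word-level rate only makes sense after segmentation. The card creates a MeCab tagger in `-Owakati` mode, which produces space-separated output, so segmented strings would be the natural input to `word_error_rate` here.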