Update README.md
patrickvonplaten committed · Commit 8768b23 · 1 Parent(s): ded564d

README.md CHANGED
@@ -23,7 +23,7 @@ model-index:
 metrics:
 - name: Test WER
 type: wer
-value:
+value: 31.07
 ---
 # Wav2Vec2-Large-XLSR-53-Japanese
 Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Japanese using the [Common Voice](https://huggingface.co/datasets/common_voice) and Japanese speech corpus of Saruwatari-lab, University of Tokyo [JSUT](https://sites.google.com/site/shinnosuketakamichi/publication/jsut).
@@ -41,7 +41,7 @@ import re
 
 # config
 wakati = MeCab.Tagger("-Owakati")
-chars_to_ignore_regex = '[
+chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\οΌ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\οΌ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\・]'
 
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test[:2%]")
@@ -81,7 +81,7 @@ import re
 
 #config
 wakati = MeCab.Tagger("-Owakati")
-chars_to_ignore_regex = '[
+chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\οΌ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\γ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\…\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\οΌ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\・]'
 
 # load data, processor and model
 test_dataset = load_dataset("common_voice", "ja", split="test")
@@ -111,7 +111,7 @@ def evaluate(batch):
 result = test_dataset.map(evaluate, batched=True, batch_size=8)
 print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
 ```
-**Test Result**: 
+**Test Result**: 31.07%
 ## Training
 The Common Voice `train`, `validation` datasets and Japanese speech corpus `basic5000` datasets were used for training.
 The script used for training can be found [here](https://colab.research.google.com/drive/1ZTxoYzgOotUjcyoBf0m8gZj5Kcmu2yGU)
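The value this commit fills in (`Test WER`, 31.07) is a word error rate reported as a percentage. The hunks do not show how `chars_to_ignore_regex`, the MeCab tokenizer, and `wer.compute` are combined, so the following is only a minimal sketch under stated assumptions: the punctuation set is illustrative rather than the commit's exact character class, whitespace-separated input stands in for MeCab `-Owakati` output, and `normalize` and `word_error_rate` are hypothetical helpers that compute a corpus-level WER (total word edits divided by total reference words).

```python
import re

# Assumption: an illustrative punctuation set, not the commit's exact class.
CHARS_TO_IGNORE = re.compile(r"[,、。…・]")


def normalize(text: str) -> str:
    """Remove ignored punctuation and collapse runs of whitespace."""
    return " ".join(CHARS_TO_IGNORE.sub("", text).split())


def word_error_rate(references, predictions):
    """Corpus-level WER: total word edit distance / total reference words."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, predictions):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Dynamic-programming Levenshtein distance over word tokens.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, 1):
            cur = [i]
            for j, hyp_word in enumerate(hyp_words, 1):
                cur.append(min(
                    prev[j] + 1,                           # deletion
                    cur[j - 1] + 1,                        # insertion
                    prev[j - 1] + (ref_word != hyp_word),  # substitution
                ))
            prev = cur
        errors += prev[-1]
        total += len(ref_words)
    return errors / total


# Pre-tokenized strings stand in for MeCab output (assumption).
references = [normalize("今日 は 、 晴れ です 。")]
predictions = [normalize("今日 は 雨 です")]
score = word_error_rate(references, predictions)
print("WER: {:.2f}".format(100 * score))
```

The sketch formats with `{:.2f}`, which prints two decimal places. The `{:2f}` spec in the context line of the last hunk instead sets a minimum field width of 2 and keeps the default six decimals, so a score of 31.07 would print as `31.070000`.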