---
language: ja
datasets:
- common_voice
metrics:
- cer
model-index:
- name: wav2vec2-xls-r-300m finetuned on Japanese Hiragana with no word boundaries
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice Japanese
      type: common_voice
      args: ja
    metrics:
    - name: Test CER
      type: cer
      value: 9.34
---

# Wav2Vec2-XLS-R-300M-Japanese-Hiragana

A [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) model fine-tuned on Japanese Hiragana characters using JSUT, JVS, Common Voice, and an in-house dataset.

The output sentences do not contain word boundaries. Audio inputs should be sampled at 16 kHz.
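As a rough illustration of the 16 kHz requirement, the following is a minimal inference sketch using the `transformers` CTC classes. The repository ID in `MODEL_ID` is a hypothetical placeholder (this card does not state the model's Hub path), and the helper names `ensure_16khz` and `transcribe` are made up for this example. It assumes the checkpoint ships a standard `Wav2Vec2Processor` and that greedy (argmax) decoding is acceptable.

```python
TARGET_SR = 16_000  # the card states that inputs should be sampled at 16 kHz

# Hypothetical placeholder: replace with the actual Hub ID of this model.
MODEL_ID = "your-namespace/wav2vec2-xls-r-300m-japanese-hiragana"


def ensure_16khz(sampling_rate):
    """Raise ValueError if the audio is not sampled at 16 kHz."""
    if sampling_rate != TARGET_SR:
        raise ValueError(
            f"expected {TARGET_SR} Hz audio, got {sampling_rate} Hz; "
            "resample before calling the model"
        )


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Transcribe a mono float waveform into an unsegmented Hiragana string.

    Heavy dependencies are imported lazily so that the sampling-rate check
    above can be used without them installed.
    """
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    ensure_16khz(sampling_rate)

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(
        waveform, sampling_rate=TARGET_SR, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: most likely token per frame, then collapse.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Audio recorded at another rate (for example 44.1 kHz or 48 kHz) would need to be resampled with an audio library of your choice before being passed to `transcribe`.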

## Test Results

**CER:** 9.34% (Common Voice Japanese test set)
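For readers unfamiliar with the metric, character error rate (CER) is the character-level edit distance between the reference and the hypothesis, divided by the number of reference characters. The sketch below is a dependency-free illustration, not the evaluation script used for this card. Two choices in it are assumptions: whitespace is stripped from both strings (because the model emits no word boundaries), and the score is pooled over the whole corpus rather than averaged per utterance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        # Assumption: drop all whitespace, since outputs have no word boundaries.
        ref = "".join(ref.split())
        hyp = "".join(hyp.split())
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# One substituted character out of five reference characters.
print(cer(["こんにちは"], ["こんにちわ"]))  # 0.2
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figure reported above.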

## Training

Trained on JSUT, a subset of JVS, the train and validation splits of Common Voice Japanese, and an in-house Japanese dataset. Evaluated on the test split of Common Voice Japanese.