Is there confusion between CER and WER in the test metrics?
Hello!
The results are very impressive, so I am a little surprised: your WER is comparable to the CER of the best wav2vec2 reference model, which was trained 6 months ago on CV 7.0. Is the number reported on your test set the WER or the CER? I don't have time to check it myself.
Have a nice day!
See for yourself with these 2 links:
https://paperswithcode.com/sota/speech-recognition-on-common-voice-7-0-german-1
https://paperswithcode.com/sota/speech-recognition-on-common-voice-7-0-german?metric=Test%20WER
The links you provided are for German, though this is the French model card. I assume you are asking about French: https://paperswithcode.com/sota/automatic-speech-recognition-on-mcv-7-0
Yes, the results calculated here are WER, not CER. We normally do not publish CER scores for languages where WER can be computed.
There are a few reasons why the WER is this low:
- This is a Conformer Transducer. Transducer models are much more accurate than CTC models in general. Conformer CTC is also more accurate than Wav2Vec CTC in nearly all cases.
- These models are jointly trained. Note that they are trained on both MCV and MLS French, so it is expected that their overall score on MCV alone is superior to that of a model trained on just MCV.
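
To make the WER/CER distinction concrete, here is a minimal sketch of how the two metrics differ. It is not the evaluation script used for this model card: the function names (`edit_distance`, `wer`, `cer`) and the example sentences are illustrative assumptions, no text normalization is applied, spaces count as characters in the CER, and the reference is assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcripts: a single wrong letter in one word.
reference = "le chat dort sur le tapis"
hypothesis = "le chat dors sur le tapis"

print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

In this example one wrong letter costs a whole word out of 6 for the WER (1/6), but only one character out of 25 for the CER (1/25). The CER of a system is therefore typically much lower than its WER, which is why a WER that matches another model's CER stands out.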