---
license: mit
language:
- en
tags:
- audio
- text-to-speech
- matcha-tts
---
# Matcha-TTS CommonVoice EN001

Audio source: https://commonvoice.mozilla.org/en/datasets

I call this first corpus of audio clips, 42da7f26(head-audio-id)_290(files), EN001.

Trained with IPA using this fork: https://github.com/akjava/Matcha-TTS-Japanese

## Files
### checkpoints
Matcha-TTS checkpoint. The epoch count looks large, but the model was trained on only 290 audio clips.

### ONNX
The ONNX models were simplified with onnxsim:
```
from onnxsim import simplify
import onnx

# Load the exported model, simplify it, and save the result.
model = onnx.load("en001_6399_T2.onnx")
model_simp, check = simplify(model)
# `check` is False if the simplified model could not be validated.
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_simp, "en001_6399_T2_simplify.onnx")
```
- T2 means the vocoder is hifigan_T2_v1
- Unif means the vocoder is hifigan_univ_v1

Running the ONNX models requires some additional code; I'll add sample code later.

### Audio
The audio was cut with VAD tools and denoised with resemble-enhance.
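
Until sample code is available, a reasonable first step for the ONNX files is to check which vocoder a file was exported with and which tensors it expects. The following is a minimal sketch, not the author's sample code. `vocoder_for` and `describe_io` are hypothetical helper names, the lookup table only restates the T2/Unif notes from the ONNX section, and `describe_io` assumes `onnxruntime` is installed and the model file is available locally.

```python
from pathlib import Path

# Tag-to-vocoder table, restating the T2/Unif notes from the ONNX section.
VOCODER_BY_TAG = {
    "T2": "hifigan_T2_v1",
    "Unif": "hifigan_univ_v1",
}


def vocoder_for(filename: str) -> str:
    """Return the vocoder implied by an ONNX filename (hypothetical helper).

    Assumes the vocoder tag is an underscore-separated token in the name,
    as in "en001_6399_T2.onnx" or "en001_6399_T2_simplify.onnx".
    """
    for token in Path(filename).stem.split("_"):
        if token in VOCODER_BY_TAG:
            return VOCODER_BY_TAG[token]
    raise ValueError(f"no known vocoder tag in {filename!r}")


def describe_io(model_path: str) -> dict:
    """List input/output tensor names, types, and shapes of an ONNX model.

    Assumes onnxruntime is installed; it is imported lazily so that
    vocoder_for() stays usable without it.
    """
    import onnxruntime as ort

    session = ort.InferenceSession(
        model_path, providers=["CPUExecutionProvider"]
    )
    return {
        "inputs": [(i.name, i.type, i.shape) for i in session.get_inputs()],
        "outputs": [(o.name, o.type, o.shape) for o in session.get_outputs()],
    }
```

For example, `describe_io("en001_6399_T2_simplify.onnx")` would report the tensor names, dtypes, and shapes that any inference code has to supply, without guessing them in advance.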