Spaces: TTS-AGI/TTS-Arena

New Open Source TTS Model (OuteTTS)

#66

by ecyht2 - opened Nov 11, 2024

Discussion

ecyht2

Nov 11, 2024

A new open-source TTS model was released under the CC-BY 4.0 license.

OuteAI/OuteTTS-0.1-350M

A GGUF version is also available.

Here is the demo.

Pendrokar

Nov 11, 2024

the model may frequently alter, insert, or omit wrong words, leading to variations in output quality.

IMHO, this makes it an invalid TTS.

ecyht2

Nov 13, 2024

the model may frequently alter, insert, or omit wrong words, leading to variations in output quality.

Yeah, I agree, but I just want to document open-source TTS models.

ecyht2

Dec 2, 2024

I found a working Space: ameerazam08/OuteTTS-0.2-500M-Demo. It needs ZeroGPU, otherwise it is very slow.

ecyht2

Dec 2, 2024

I also noticed that the voice sample you provide has to be from an English speaker. Otherwise the output sounds weird, especially when pronouncing numbers. For example, with a Chinese voice (from the provided voices), the model says "10" or "ten" as the same word, but in Chinese.

Pendrokar

3 days ago

the model may frequently alter, insert, or omit wrong words, leading to variations in output quality.

IMHO, this makes it an invalid TTS.

Still, I went ahead and added the v0.2 500M and v0.3 1B TTS Spaces to the forked Arena Space.
https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena

It is probably not faring very well against the others because of the poor audio quality of the default voice and the early cutoff of the audio sample. People judge those first, and only then do they care about the delivery. The same happened with WhisperSpeech, whose line delivery was great overall.
