Overview

This is an Indonesian fine-tune of F5-TTS, intended to add Indonesian speech synthesis to the model.

Dataset

Length: 43.35 hours
Audio samples: 43999

Dataset sources:
• data-indsp-news-lvcsr
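
As a quick sanity check on the figures above, the average clip length can be derived from the total duration and the sample count. This is a minimal sketch; the variable names are illustrative and only the two numbers come from this card.

```python
# Figures taken from the Dataset section of this card.
total_hours = 43.35
num_samples = 43999

# Convert hours to seconds, then divide by the number of clips.
total_seconds = total_hours * 3600
mean_clip_seconds = total_seconds / num_samples

print(f"Total duration: {total_seconds:.0f} s")        # 156060 s
print(f"Mean clip length: {mean_clip_seconds:.2f} s")  # 3.55 s
```

The clips are therefore short on average, roughly three and a half seconds each.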

Results

The model has difficulty accurately reproducing the reference voice and emotion in zero-shot generation. It also hallucinates on long input texts.
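
One common way to work around hallucination on long inputs is to split the text into short, sentence-aligned chunks and synthesize each chunk separately. The sketch below shows only the text-splitting step using the Python standard library. The helper `split_into_chunks` and the `max_chars` limit are hypothetical and not part of F5-TTS, and whether this actually helps with this checkpoint has not been verified.

```python
import re


def split_into_chunks(text: str, max_chars: int = 120) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper, not part of F5-TTS. A single sentence longer
    than max_chars is kept intact as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", text.strip())
        if s.strip()
    ]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = (
    "Halo. Model faintun ini adalah sebuah percobaan. "
    "Masih terdapat beberapa kekurangan jadi tolong dimaklumkan."
)
chunks = split_into_chunks(text, max_chars=60)
for chunk in chunks:
    print(chunk)
```

Each resulting chunk would then be passed to the model as a separate input, and the generated audio segments concatenated in order.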

Reference text: "Tidak ada yang menakutiku, bahkan kematian sekalipun."
Reference audio: Zilong.ogg
Input text: "Halo. Model faintun ini adalah sebuah percobaan. Masih terdapat beberapa kekurangan jadi tolong dimaklumkan."
Generated audio: Zilong_generated.ogg

License

The pre-trained models are licensed under the CC-BY-NC license because the training data, Emilia, is an in-the-wild dataset. Sorry for any inconvenience this may cause.



Model tree for Eempostor/F5-TTS-IND-FINETUNE

Base model: SWivid/F5-TTS
Finetuned: this model