Eempostor/F5-TTS-IND-FINETUNE


	---
	license: cc-by-nc-4.0
	language:
	- id
	base_model:
	- SWivid/F5-TTS
	pipeline_tag: text-to-speech
	---

	## Overview
	This Indonesian finetune of [F5-TTS](https://github.com/SWivid/F5-TTS) adds Indonesian speech capabilities to the base model.

	## Dataset
	Length: 43.35 hours \
	Audio samples: 43999

	Dataset sources: \
	• [data-indsp-news-lvcsr](https://github.com/s-sakti/data_indsp_news_lvcsr)
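As a rough sanity check on the dataset figures above, the average clip length can be derived from the total duration and the sample count. This is a minimal sketch: the variable names are illustrative, and only the two numbers come from this model card.

```python
# Figures taken from the Dataset section of this model card.
total_hours = 43.35
num_samples = 43999

# Convert the total duration to seconds, then divide by the clip count.
total_seconds = total_hours * 3600
avg_clip_seconds = total_seconds / num_samples

print(f"Total audio: {total_seconds:.0f} s")
print(f"Average clip length: {avg_clip_seconds:.2f} s")
```

The result is roughly 3.5 seconds per clip, which suggests the training data consists mostly of short utterances.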

	## Results
	The model has difficulty accurately matching the reference speaker's voice and emotion in zero-shot cloning. It also hallucinates on long input texts.

	Reference text: "Tidak ada yang menakutiku, bahkan kematian sekalipun." \
	Reference audio: [Zilong.ogg](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong.ogg?download=true) \
	Input text: "Halo. Model faintun ini adalah sebuah percobaan. Masih terdapat beberapa kekurangan jadi tolong dimaklumkan." \
	Generated audio: [Zilong_generated.wav](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong_generated.wav?download=true)
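Because the model hallucinates on long texts, one possible workaround is to split the input at sentence boundaries and synthesize each piece separately with the same reference audio. The sketch below covers only the splitting step. The helper `split_into_chunks` and the `max_chars` limit are hypothetical, not part of F5-TTS, and this approach has not been evaluated with this model.

```python
import re


def split_into_chunks(text: str, max_chars: int = 120) -> list[str]:
    """Group whole sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# The input text from the example above; the 60-character limit is arbitrary.
input_text = (
    "Halo. Model faintun ini adalah sebuah percobaan. "
    "Masih terdapat beberapa kekurangan jadi tolong dimaklumkan."
)
for chunk in split_into_chunks(input_text, max_chars=60):
    print(chunk)
```

Each chunk would then be passed to the synthesis step on its own and the resulting audio concatenated. That step is omitted here because the inference call depends on the installed F5-TTS version.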

	## License
	The pre-trained F5-TTS models are licensed under CC-BY-NC because they were trained on Emilia, an in-the-wild dataset. Sorry for any inconvenience this may cause.

	---