Eempostor/F5-TTS-IND-FINETUNE


Eempostor committed on Nov 25, 2024

Commit 674e4bd · verified

1 Parent(s): cf4b862

Create README.md

Files changed (1)

README.md +31 -0

README.md ADDED

	@@ -0,0 +1,31 @@

+---
+license: cc-by-nc-4.0
+language:
+- id
+base_model:
+- SWivid/F5-TTS
+pipeline_tag: text-to-speech
+---
+## Overview
+This Indonesian fine-tune of [F5-TTS](https://github.com/SWivid/F5-TTS) adds Indonesian speech capability to the model.
+## Dataset
+Length: 43.35 hours \
+Audio samples: 43999 \
+Dataset sources: \
+• [data-indsp-news-lvcsr](https://github.com/s-sakti/data_indsp_news_lvcsr)
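
As a rough sanity check on the figures above, the total length and sample count imply an average clip of about 3.5 seconds. The sketch below only redoes that arithmetic; the function name is illustrative and not part of the repository.

```python
# Figures taken from the Dataset section of this README.
TOTAL_HOURS = 43.35
NUM_SAMPLES = 43999


def average_clip_seconds(total_hours: float, num_samples: int) -> float:
    """Return the mean clip duration in seconds."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return total_hours * 3600 / num_samples


if __name__ == "__main__":
    print(f"{average_clip_seconds(TOTAL_HOURS, NUM_SAMPLES):.2f} s per clip")
```
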
+## Results
+In zero-shot use, the model has difficulty accurately matching the reference voice and its emotion. It also hallucinates on long texts. \
+Reference text: "Tidak ada yang menakutiku, bahkan kematian sekalipun." (English: "Nothing frightens me, not even death.") \
+Reference audio: [Zilong.ogg](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong.ogg?download=true) \
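+Input text: "Halo. Model faintun ini adalah sebuah percobaan. Masih terdapat beberapa kekurangan jadi tolong dimaklumkan." (English: "Hello. This fine-tuned model is an experiment. It still has some shortcomings, so please bear with it.") \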
+Generated audio: [Zilong_generated.wav](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong_generated.wav?download=true)
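
Because the model is reported to hallucinate on long texts, one possible mitigation is to split the input into short, sentence-aligned chunks and synthesize each chunk separately. This is only a sketch of that idea, not something this README describes: `split_into_chunks` and its `max_chars` limit are hypothetical, and the synthesis call itself is left out because the repository does not document one.

```python
import re


def split_into_chunks(text: str, max_chars: int = 120) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = (
        "Halo. Model faintun ini adalah sebuah percobaan. "
        "Masih terdapat beberapa kekurangan jadi tolong dimaklumkan."
    )
    for chunk in split_into_chunks(sample, max_chars=80):
        print(chunk)
```

Each chunk would then be passed to the model with the same reference audio and reference text, and the resulting audio segments concatenated. The right `max_chars` value is an assumption to tune by listening.
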
+## License
+The pre-trained models are licensed under CC-BY-NC because their training data, Emilia, is an in-the-wild dataset. Sorry for any inconvenience this may cause.
+---