Eempostor committed
Commit 674e4bd · verified · 1 Parent(s): cf4b862

Create README.md

Files changed (1)
  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - id
+ base_model:
+ - SWivid/F5-TTS
+ pipeline_tag: text-to-speech
+ ---
+
+ ## Overview
+ This Indonesian fine-tune of [F5-TTS](https://github.com/SWivid/F5-TTS) adds Indonesian speech capabilities to the model.
+
+ ## Dataset
+ Length: 43.35 hours \
+ Audio samples: 43999
+
+ Dataset sources: \
+ • [data-indsp-news-lvcsr](https://github.com/s-sakti/data_indsp_news_lvcsr)
+
+ ## Results
+ The model has difficulty accurately matching the voice and emotion of the reference audio in zero-shot generation. It also hallucinates on long texts.
+
+ Reference text: "Tidak ada yang menakutiku, bahkan kematian sekalipun." ("Nothing frightens me, not even death.") \
+ Reference audio: [Zilong.ogg](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong.ogg?download=true) \
+ Input text: "Halo. Model faintun ini adalah sebuah percobaan. Masih terdapat beberapa kekurangan jadi tolong dimaklumkan." ("Hello. This fine-tuned model is an experiment. It still has some shortcomings, so please bear with it.") \
+ Generated audio: [Zilong_generated.wav](https://huggingface.co/Eempostor/F5-TTS-IND-FINETUNE/resolve/main/Zilong_generated.wav?download=true)
+
+ ## License
+ The pre-trained models are licensed under CC-BY-NC because the training data, Emilia, is an in-the-wild dataset. Sorry for any inconvenience this may cause.
+
+ ---
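
The Results section of the README above pairs a reference clip and its transcript with an input text. A minimal sketch of how that sample might be reproduced with the upstream F5-TTS command-line tool follows. It only assembles the argument list for `f5-tts_infer-cli` and prints it. The checkpoint path is a hypothetical placeholder, since the README does not name the checkpoint file, and the sketch assumes the CLI's default model configuration matches this fine-tune, which the README does not confirm.

```python
import shlex


def build_infer_command(ref_audio, ref_text, gen_text, ckpt_file, vocab_file=None):
    """Assemble an argument list for one zero-shot run of f5-tts_infer-cli.

    Nothing is executed here; the caller decides whether to run the command.
    """
    cmd = [
        "f5-tts_infer-cli",
        "--ckpt_file", ckpt_file,   # fine-tuned checkpoint (path is an assumption)
        "--ref_audio", ref_audio,   # clip whose voice should be imitated
        "--ref_text", ref_text,     # transcript of the reference clip
        "--gen_text", gen_text,     # text to synthesize
    ]
    if vocab_file is not None:
        # Only needed if the fine-tune ships its own vocabulary file.
        cmd += ["--vocab_file", vocab_file]
    return cmd


cmd = build_infer_command(
    ref_audio="Zilong.ogg",
    ref_text="Tidak ada yang menakutiku, bahkan kematian sekalipun.",
    gen_text=(
        "Halo. Model faintun ini adalah sebuah percobaan. "
        "Masih terdapat beberapa kekurangan jadi tolong dimaklumkan."
    ),
    ckpt_file="path/to/finetuned_checkpoint.pt",  # hypothetical placeholder
)

# Print a shell-quoted form that can be inspected before running it.
print(shlex.join(cmd))
```

To actually run the generation, the list can be passed to `subprocess.run(cmd, check=True)` once F5-TTS is installed and the placeholder path points at a real checkpoint. Since the README notes that the model hallucinates on long texts, keeping `gen_text` short is probably the safer starting point.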