Abhinay45 committed on
Commit
23e632e
·
verified ·
1 Parent(s): fa2ad24

Update README.md

Files changed (1)
  1. README.md +86 -3
README.md CHANGED
@@ -1,3 +1,86 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: other
3
+ license_name: coqui-public-model-license
4
+ license_link: https://coqui.ai/cpml
5
+ library_name: coqui
6
+ pipeline_tag: text-to-speech
7
+ widget:
8
+ - text: "Once when I was six years old I saw a magnificent picture"
9
+ ---
10
+
11
+ # XTTS v2 Fine-Tuned on Hindi Datasets
12
+
13
+ **Model Name**: XTTS v2 Fine-Tuned on Hindi Datasets
14
+
15
+ **Model Description**: A fine-tuned version of XTTS v2 (Cross-lingual Text-to-Speech), the model developed by Coqui-AI. It was fine-tuned on Hindi speech datasets to generate more natural and accurate Hindi speech. It supports voice cloning and multilingual speech generation.
16
+
17
+ ### Features
18
+ - **Languages**: Supports 17 languages including Hindi (hi).
19
+ - **Voice Cloning**: Clone voices with just a 6-second audio clip.
20
+ - **Emotion and Style Transfer**: Emotion and speaking style carry over from the reference audio through cloning.
21
+ - **Cross-Language Voice Cloning**: Supports voice cloning across different languages.
22
+ - **Sampling Rate**: 24kHz sampling rate for high-quality audio.
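The 6-second reference clip mentioned in the feature list can be checked before it is passed to the model. The following is a minimal sketch using only the Python standard library. It assumes the reference audio is an uncompressed PCM WAV file. The helper names and the `MIN_REFERENCE_SECONDS` constant are hypothetical and are not part of the 🐸TTS API, and whether the library itself rejects shorter clips is not stated in this README.

```python
import wave

# Assumption: taken from the "6-second audio clip" claim in the feature list.
MIN_REFERENCE_SECONDS = 6.0


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, minimum: float = MIN_REFERENCE_SECONDS) -> bool:
    """Return True if the clip lasts at least `minimum` seconds."""
    return clip_duration_seconds(path) >= minimum
```

A clip that fails this check may still work. The threshold here only mirrors the figure quoted above.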
23
+
24
+ ### Updates over XTTS-v1
25
+ - **New Languages**: Added support for Hungarian and Korean.
26
+ - **Architectural Improvements**: Enhanced speaker conditioning and interpolation.
27
+ - **Stability Improvements**: Better overall stability and performance.
28
+ - **Audio Quality**: Improved prosody and audio quality.
29
+
30
+ ### Languages
31
+ The XTTS-v2 model supports the following 17 languages:
32
+ - **English (en)**
33
+ - **Spanish (es)**
34
+ - **French (fr)**
35
+ - **German (de)**
36
+ - **Italian (it)**
37
+ - **Portuguese (pt)**
38
+ - **Polish (pl)**
39
+ - **Turkish (tr)**
40
+ - **Russian (ru)**
41
+ - **Dutch (nl)**
42
+ - **Czech (cs)**
43
+ - **Arabic (ar)**
44
+ - **Chinese (zh-cn)**
45
+ - **Japanese (ja)**
46
+ - **Hungarian (hu)**
47
+ - **Korean (ko)**
48
+ - **Hindi (hi)**
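The codes in parentheses above are the values passed as the `language` argument at inference time. As a small sketch, a caller could validate a code against this list before running synthesis. The set below is copied from the list above. The `check_language` helper is hypothetical and not part of the 🐸TTS API.

```python
# Language codes copied from the list above.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def check_language(code: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return normalized
```

Failing early with a clear message is usually easier to debug than an error raised deep inside model loading or inference.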
49
+
50
+ ### Training Data
51
+ The model was fine-tuned on the following Hindi datasets:
52
+ - **Mozilla CommonVoice 18**: A diverse dataset of Hindi speech.
53
+ - **IndicTTS Hindi Dataset**: Hindi speech data for text-to-speech training.
54
+
55
+ ### Code
56
+ The [code-base](https://github.com/coqui-ai/TTS) supports both inference and [fine-tuning](https://tts.readthedocs.io/en/latest/models/xtts.html#training).
57
+
58
+ ### Demo Spaces
59
+ - [XTTS Space](https://huggingface.co/spaces/coqui/xtts): Explore the model's performance on supported languages and try it with your own reference or microphone input.
60
+ - [XTTS Voice Chat with Mistral or Zephyr](https://huggingface.co/spaces/coqui/voice-chat-with-mistral): Experience streaming voice chat with Mistral 7B Instruct or Zephyr 7B Beta.
61
+
62
+ ### License
63
+ This model is licensed under the [Coqui Public Model License](https://coqui.ai/cpml). Read more about the [origin story of CPML here](https://coqui.ai/blog/tts/cpml).
64
+
65
+ ### Contact
66
+ Join our 🐸 Community on [Discord](https://discord.gg/fBC58unbKE) and follow us on [Twitter](https://twitter.com/coqui_ai). For inquiries, you can also email us at [email protected].
67
+
68
+ ### Colab Notebook
69
+ The [Colab notebook](https://colab.research.google.com/drive/1VwNltFIcqhB7Ydt4NVaPnYegl-qHoUSO#scrollTo=KKj-kq7iCG3d) used to fine-tune XTTS v2 on the Hindi datasets is available for viewing and for replicating the process.
70
+
71
+ ### Usage
72
+
73
+ #### Using 🐸TTS API
74
+ ```python
75
+ from TTS.api import TTS
76
+
77
+ # Load the model
78
+ tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)
79
+
80
+ # Generate speech by cloning a voice using default settings
81
+ tts.tts_to_file(
82
+     text="नमस्ते, यह हिंदी में एक परीक्षण वाक्य है।",  # Hindi input, matching language="hi"
83
+     file_path="output.wav",
84
+     speaker_wav="/path/to/target/speaker.wav",  # reference clip of the voice to clone
85
+     language="hi",
86
+ )