	---
	license: other
	license_name: coqui-public-model-license
	license_link: https://coqui.ai/cpml
	library_name: coqui
	pipeline_tag: text-to-speech
	widget:
	- text: "Once when I was six years old I saw a magnificent picture"
	---

	# XTTS v2 Fine-Tuned on Hindi Datasets

	Model Name: XTTS v2 Fine-Tuned on Hindi Datasets

	Model Description: A version of the XTTS v2 (Cross-lingual Text-to-Speech) model from Coqui AI, fine-tuned on Hindi speech datasets to generate more natural and accurate Hindi speech. The model supports voice cloning and multilingual speech generation.

	### Colab Notebook
	The [Colab notebook](https://colab.research.google.com/drive/1VwNltFIcqhB7Ydt4NVaPnYegl-qHoUSO#scrollTo=KKj-kq7iCG3d) used to fine-tune XTTS v2 on the Hindi datasets is available for viewing and for reproducing the process.

	### Features
	- Languages: Supports 17 languages including Hindi (hi).
	- Voice Cloning: Clone voices with just a 6-second audio clip.
	- Emotion and Style Transfer: Achieve emotion and style transfer by cloning.
	- Cross-Language Voice Cloning: Supports voice cloning across different languages.
	- Sampling Rate: 24kHz sampling rate for high-quality audio.
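
The 6-second reference clip mentioned in the list above can be checked before cloning. The following is a minimal sketch using only the Python standard library. It assumes the reference audio is an uncompressed PCM WAV file; the helper names and the choice to treat 6 seconds as a minimum are illustrative assumptions, not part of 🐸TTS.

```python
import wave

# Assumption: treat the 6-second figure from the feature list as a minimum length.
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_REFERENCE_SECONDS) -> bool:
    """Return True if the clip at `path` lasts at least `min_seconds`."""
    return wav_duration_seconds(path) >= min_seconds
```

Calling `is_long_enough("/path/to/target/speaker.wav")` before synthesis gives an early warning when a reference clip is shorter than expected.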

	### Updates over XTTS-v1
	- New Languages: Added support for Hungarian and Korean.
	- Architectural Improvements: Enhanced speaker conditioning and interpolation.
	- Stability Improvements: Better overall stability and performance.
	- Audio Quality: Improved prosody and audio quality.

	### Languages
	The XTTS-v2 model supports the following 17 languages:
	- English (en)
	- Spanish (es)
	- French (fr)
	- German (de)
	- Italian (it)
	- Portuguese (pt)
	- Polish (pl)
	- Turkish (tr)
	- Russian (ru)
	- Dutch (nl)
	- Czech (cs)
	- Arabic (ar)
	- Chinese (zh-cn)
	- Japanese (ja)
	- Hungarian (hu)
	- Korean (ko)
	- Hindi (hi)
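
Passing an unsupported language code is an easy mistake to make, so it can help to validate the code before synthesis. The sketch below mirrors the list above in a plain dictionary. The names `SUPPORTED_LANGUAGES` and `normalize_language` are hypothetical helpers for illustration and are not part of 🐸TTS.

```python
# Language codes copied from the list above (code -> name).
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "pl": "Polish", "tr": "Turkish",
    "ru": "Russian", "nl": "Dutch", "cs": "Czech", "ar": "Arabic",
    "zh-cn": "Chinese", "ja": "Japanese", "hu": "Hungarian",
    "ko": "Korean", "hi": "Hindi",
}


def normalize_language(code: str) -> str:
    """Return a lowercase, trimmed language code, or raise ValueError if unsupported."""
    key = code.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}; expected one of: {supported}")
    return key
```

For example, `normalize_language(" HI ")` returns `"hi"`, which can then be passed as the `language` argument.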

	### Training Data
	The model was fine-tuned on the following Hindi datasets:
	- Mozilla CommonVoice 18: A diverse dataset of Hindi speech.
	- IndicTTS Hindi Dataset: Hindi speech data for text-to-speech training.

	### Code
	The [code-base](https://github.com/coqui-ai/TTS) supports both inference and [fine-tuning](https://tts.readthedocs.io/en/latest/models/xtts.html#training).
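
To run inference with a fine-tuned checkpoint rather than the stock model, the XTTS classes in the code-base can be loaded directly. This is a sketch under stated assumptions: it assumes the checkpoint directory contains `config.json` alongside the model and vocabulary files, the function name `load_finetuned_xtts` is hypothetical, and the file layout of this particular repository has not been verified here.

```python
import os


def load_finetuned_xtts(checkpoint_dir: str, use_gpu: bool = False):
    """Load an XTTS model and its config from a local checkpoint directory."""
    config_path = os.path.join(checkpoint_dir, "config.json")
    if not os.path.isfile(config_path):
        raise FileNotFoundError(f"No config.json found in {checkpoint_dir!r}")

    # Imported lazily so the path check above works without 🐸TTS installed.
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    config = XttsConfig()
    config.load_json(config_path)
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=checkpoint_dir, eval=True)
    if use_gpu:
        model.cuda()
    return model, config
```

With the model loaded, `model.synthesize(text, config, speaker_wav="/path/to/target/speaker.wav", language="hi")` should return a dictionary whose `"wav"` entry holds the generated audio samples.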

	### Demo Spaces
	- [XTTS Space](https://huggingface.co/spaces/coqui/xtts): Explore the model's performance on supported languages and try it with your own reference or microphone input.
	- [XTTS Voice Chat with Mistral or Zephyr](https://huggingface.co/spaces/coqui/voice-chat-with-mistral): Experience streaming voice chat with Mistral 7B Instruct or Zephyr 7B Beta.

	### License
	This model is licensed under the [Coqui Public Model License](https://coqui.ai/cpml). Read more about the [origin story of CPML here](https://coqui.ai/blog/tts/cpml).

	### Contact
	Join our 🐸 Community on [Discord](https://discord.gg/fBC58unbKE) and follow us on [Twitter](https://twitter.com/coqui_ai). For inquiries, you can also email us at [email protected].

	### Usage

	#### Using 🐸TTS API
	```python
	from TTS.api import TTS

	# Load the model (set gpu=False to run on CPU)
	tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)

	# Generate speech by cloning a voice using default settings.
	# The text should be written in the language given by `language`,
	# so replace this English sample with Hindi text when language="hi".
	tts.tts_to_file(
	    text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
	    file_path="output.wav",
	    speaker_wav="/path/to/target/speaker.wav",
	    language="hi",
	)
	```