Commit ad2b94a by AbhishekTiwariAKT · verified · 1 Parent(s): cee2aea

Upload /README.md with huggingface_hub

  1. README.md +73 -0
README.md ADDED
@@ -0,0 +1,73 @@
+ # F5-TTS Model Inference Guide
+
+ This guide walks through loading and running the **F5-TTS** model for text-to-speech synthesis from a reference audio clip and its transcript.
+
+ ---
+
+ ### Did You Know?
+ *Text-to-speech models like F5-TTS can mimic a speaker's voice characteristics from just a few seconds of reference audio, which makes personalized, AI-generated audio content practical.*
+
+ ---
+
+ ## Steps to Run the F5-TTS Model
+
+ ### 1. Clone the Repository
+ Start by cloning the F5-TTS repository to your local environment:
+
+ ```bash
+ git clone https://github.com/SWivid/F5-TTS.git
+ cd F5-TTS
+ ```
+
+ ### 2. Install PyTorch and TorchAudio with CUDA Support
+
+ To enable GPU support, install PyTorch and TorchAudio builds that match the CUDA version supported by your NVIDIA driver. The commands below install version 2.3.0 built for CUDA 11.8:
+
+ ```bash
+ pip install torch==2.3.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
+ pip install torchaudio==2.3.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
+ ```
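
After installing, it can help to confirm that the GPU build is actually the one being picked up. The following is a minimal sketch, not part of the F5-TTS repository; the helper name `describe_torch_backend` is hypothetical, and it only relies on `torch.__version__`, `torch.cuda.is_available()`, and `torch.cuda.get_device_name()`.

```python
def describe_torch_backend() -> str:
    """Return a one-line summary of the installed PyTorch build and GPU visibility.

    Hypothetical helper for illustration; not provided by F5-TTS.
    """
    try:
        # Imported lazily so the check also works before PyTorch is installed.
        import torch
    except ImportError:
        return "PyTorch is not installed"

    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to this process.
        device_name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__}: CUDA available on {device_name}"
    return f"PyTorch {torch.__version__}: CUDA not available"


if __name__ == "__main__":
    print(describe_torch_backend())
```

If the summary reports that CUDA is not available, a likely cause is a mismatch between the installed wheel and the driver, or a CPU-only wheel installed earlier.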
+
+ ### 3. Install Required Python Packages
+ Install the dependencies listed in `requirements.txt`:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 4. System Setup: APT Update and FFmpeg
+ Before running inference, make sure FFmpeg is installed, since it is needed for audio processing. On Debian or Ubuntu, update the APT package index and install it (prefix the commands with `sudo` if you are not running as root):
+
+ ```bash
+ apt update
+ apt install -y ffmpeg
+ ```
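
To check that FFmpeg is reachable from the environment that will run inference, a short standard-library sketch such as the one below can be used. The function name `ffmpeg_version` is hypothetical; it looks the executable up on `PATH` and returns the first line of `ffmpeg -version`, or `None` if the executable is not found.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(executable: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<executable> -version`, or None if unavailable.

    Hypothetical helper for illustration; not provided by F5-TTS.
    """
    path = shutil.which(executable)
    if path is None:
        # The executable is not on PATH.
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    print(version if version else "ffmpeg not found on PATH")
```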
+
+ ### 5. Run Inference with the F5-TTS Model
+ With the environment ready, run the inference script, adjusting the paths to match your setup:
+
+ ```bash
+ # --model:     name of the model to use for inference
+ # --ckpt_file: path to the checkpoint file containing the saved model weights
+ # --ref_audio: reference audio clip whose voice and speaking style the model tries to mimic
+ # --ref_text:  transcript of the reference audio
+ # --gen_text:  text to synthesize in the style derived from the reference audio and text
+ python inference-cli.py \
+   --model "F5-TTS" \
+   --ckpt_file "path/to/model.pt" \
+   --ref_audio "wavs/sample_audio.wav" \
+   --ref_text "levantara a mão contra ele e o oficial então arrancara da espada e atravessara o de lado a lado estava direito ah" \
+   --gen_text "O Brasil, oficialmente República Federativa do Brasil, é o maior país da América do Sul e da América Latina."
+ ```
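
When the same command has to be run for many texts, it may be more convenient to assemble it from Python, which passes each argument separately and so avoids shell quoting problems with long or accented strings. This is a sketch under assumptions: it uses only the script name and flags shown in this guide, and the helper names `build_inference_command` and `missing_inputs` are hypothetical. It must be run from inside the cloned `F5-TTS` directory.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_inference_command(
    ckpt_file: str,
    ref_audio: str,
    ref_text: str,
    gen_text: str,
    model: str = "F5-TTS",
    script: str = "inference-cli.py",
) -> List[str]:
    """Build the argument list for the inference script (flags as shown in this guide)."""
    return [
        sys.executable, script,
        "--model", model,
        "--ckpt_file", str(ckpt_file),
        "--ref_audio", str(ref_audio),
        "--ref_text", ref_text,
        "--gen_text", gen_text,
    ]


def missing_inputs(*paths: str) -> List[str]:
    """Return the given paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    ckpt, audio = "path/to/model.pt", "wavs/sample_audio.wav"  # placeholder paths
    command = build_inference_command(
        ckpt_file=ckpt,
        ref_audio=audio,
        ref_text="transcript of the reference audio",
        gen_text="text to synthesize",
    )
    missing = missing_inputs(ckpt, audio)
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(command, check=True)
```

Checking the input paths first gives a clearer message than letting the script fail partway through loading.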