---
license: cc-by-nc-sa-4.0
language:
- en
pipeline_tag: text-to-audio
tags:
- audiocraft
- audiogen
- styletts2
- shift-tts
- sound
- audio-generation
- text-to-speech
- mimic3
---
SHIFT TTS - StyleTTS2 with Synthetic Speakers (made by another TTS)
[![Beta Text 2 Speech Tool](assets/shift_banner.png?raw=true)](https://shift-europe.eu/)
##
# SHIFT TTS / AudioGen
Beta version of [SHIFT](https://shift-europe.eu/) TTS tool with [AudioGen soundscapes](https://huggingface.co/dkounadis/artificial-styletts2/discussions/3)
- [Analysis of emotion of SHIFT TTS](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)
- [Listen also to foreign languages](https://huggingface.co/dkounadis/artificial-styletts2/discussions/4) synthesized via [MMS TTS](https://huggingface.co/facebook/mms-tts)
## Listen to Voices
<a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1#67854dcbd3e6beb1a78f7f20">Native English</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1#6783e3b00e7d90facec060c6">Non-native English: Accents</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/blob/main/Utils/all_langs.csv">Foreign languages</a>
##
[TTS Demo](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/demo.py)
## Flask API
<details>
<summary>
Build virtualenv & run api.py
</summary>
The [TTS Demo](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/demo.py) above is a standalone script that loads the SHIFT TTS & AudioGen models and synthesizes a text. We also provide a Flask `api.py` that allows faster inference by loading the TTS & AudioGen models only once.
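The benefit of loading a model once per server process, rather than once per script run, can be illustrated with a minimal sketch. Everything here is hypothetical: `StubModel`, `get_model` and `handle_request` are placeholder names and are not taken from `api.py`, which may be structured differently.

```python
import functools


class StubModel:
    """Placeholder for an expensive-to-load model (not the real SHIFT TTS)."""

    load_count = 0  # counts how many times the "model" was loaded

    def __init__(self):
        StubModel.load_count += 1

    def synthesize(self, text):
        # A real model would return audio samples; this returns a tag string.
        return f"<audio for {text!r}>"


@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first use and reuse it for every later call."""
    return StubModel()


def handle_request(text):
    """Stand-in for a request handler in a long-running server process."""
    return get_model().synthesize(text)
```

Calling `handle_request` repeatedly constructs `StubModel` only on the first call, which is the reason a persistent server avoids the per-run loading cost of a standalone script.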
Clone
```
git clone https://huggingface.co/dkounadis/artificial-styletts2
```
Install
```
virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd artificial-styletts2/
pip install -r requirements.txt
```
Flask `tmux-session`
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python api.py
```
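The environment variables in the command above can also be set from Python, for example in a wrapper script. The sketch below is illustrative only: `configure_gpu_env` is a hypothetical helper that is not part of this repository. It assumes it is called before any CUDA-using library (such as PyTorch) is imported, since `CUDA_VISIBLE_DEVICES` has to be set before CUDA is initialized.

```python
import os


def configure_gpu_env(gpu_index=0, hf_home="./hf_home"):
    """Hypothetical helper mirroring the shell command above."""
    # Order GPUs by PCI bus ID so indices are consistent with `nvidia-smi`.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the chosen GPU to this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Directory where Hugging Face libraries cache downloaded models.
    os.environ["HF_HOME"] = hf_home
    keys = ("CUDA_DEVICE_ORDER", "CUDA_VISIBLE_DEVICES", "HF_HOME")
    return {key: os.environ[key] for key in keys}


if __name__ == "__main__":
    # Call this before importing torch or audiocraft.
    print(configure_gpu_env(gpu_index=0))
```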
The following examples need `api.py` to be running. [Set this IP](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/tts.py#L85) to the IP shown when starting `api.py`.
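Before running a client script, it can help to confirm that the server is reachable at the configured address. The following is a minimal sketch using only the standard library. `api_is_reachable` is a hypothetical helper, and the host and port are assumptions (port 5000 is Flask's default, but `api.py` may use a different one), so substitute the values printed when `api.py` starts.

```python
import socket


def api_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: replace with the IP and port printed by api.py.
    HOST, PORT = "127.0.0.1", 5000
    print(f"api.py reachable at {HOST}:{PORT}:", api_is_reachable(HOST, PORT))
```

This only checks that something is listening at that address; it does not verify that the listener is `api.py`.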
</details>
## Landscape 2 Soundscapes
The following requires `api.py` to be already running in a tmux session.
```shell
# TTS & soundscape - overlay to .mp4
python landscape2soundscape.py
```
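Overlaying generated audio onto an `.mp4` is commonly done with `ffmpeg`. The sketch below shows one way to do that step and is an assumption, not a description of how `landscape2soundscape.py` works internally. The helper names and file paths are hypothetical, and `ffmpeg` is assumed to be installed and on `PATH`.

```python
import shutil
import subprocess


def build_overlay_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that muxes an audio file onto a video."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", video_path,    # input 0: the video
        "-i", audio_path,    # input 1: the generated speech / soundscape
        "-map", "0:v:0",     # take the video stream from input 0
        "-map", "1:a:0",     # take the audio stream from input 1
        "-c:v", "copy",      # keep the video stream without re-encoding
        "-shortest",         # stop at the end of the shorter stream
        output_path,
    ]


def overlay_audio(video_path, audio_path, output_path):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    command = build_overlay_command(video_path, audio_path, output_path)
    subprocess.run(command, check=True)
```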
For SHIFT demo / Collaboration with [SMB](https://www.smb.museum/home/)
- YouTube Videos
[![01](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____01_Schick_AII840_001.jpg)](https://youtu.be/SSi3gUO4GtY)
[![02](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____02_Constable_AI555_001.jpg)](https://youtu.be/2YjxAPkdXIc)
[![03](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____03_Schinkel_WS200-002.jpg)](https://youtu.be/BhMh02knkco)
[![05](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____05_Blechen_FV40_001.jpg)](https://youtu.be/a3qk9S87v60)
[![06](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____06_Menzel_AI900_001.jpg)](https://youtu.be/3M0y9OYzDfU)
[![07](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____07_Courbet_AI967_001.jpg)](https://youtu.be/OBY666_By1k)
[![08](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____08_Monet_AI1013_001.jpg)](https://youtu.be/gnGCYLcdLsA)
[![10](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____10_Boecklin_967648_NG2-80_001_rsz.jpg)](https://www.youtube.com/watch?v=Y8QyYUgLaCg)
[![11](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____11_Liebermann_NG4-94_001.jpg)](https://youtu.be/XDDzxDSrhb0)
[![12](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____12_Slevogt_AII1022_001.jpg)](https://youtu.be/I3YYKiUzHpA)
# SoundScape Live (iterative) Demo - Paplay
Special Flask API for playing sounds live
```shell
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python live_api.py
```
Client - Describe any sound with words and it will be played back to you.
```shell
python live_demo.py # will ask text input & play soundscape
```
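For reference, playing a WAV file through PulseAudio's `paplay` from Python can be sketched as below. This is a hedged illustration rather than the code of `live_demo.py`. Both helpers are hypothetical, the sine tone stands in for generated audio, and the 16 kHz sample rate is an assumed default.

```python
import math
import shutil
import struct
import subprocess
import wave


def write_sine_wav(path, seconds=1.0, freq=440.0, sample_rate=16000):
    """Write a mono 16-bit PCM test tone (stand-in for generated audio)."""
    n_frames = int(seconds * sample_rate)
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / sample_rate)),
        )
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return n_frames


def play_with_paplay(path):
    """Play a WAV file via paplay; return False if paplay is unavailable."""
    if shutil.which("paplay") is None:
        return False
    subprocess.run(["paplay", path], check=True)
    return True
```

A typical use would be `write_sine_wav("tone.wav")` followed by `play_with_paplay("tone.wav")` on a machine with PulseAudio installed.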
# SoundScape (basic) Demo
```shell
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python demo.py
```
##
# Audiobook
Create an audiobook from a `.docx` file. Listen to examples on YouTube: [male voice](https://www.youtube.com/watch?v=5-cpf7u18JE) / [female voice](https://www.youtube.com/watch?v=pzrLYCaWD2A)
```shell
# audiobook will be saved in ./tts_audiobooks
python audiobook.py
```
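To illustrate the first step of such a pipeline, the sketch below extracts paragraph text from a `.docx` file using only the standard library. A `.docx` file is a ZIP archive whose main body is stored in `word/document.xml`. This is not necessarily how `audiobook.py` reads its input; `docx_paragraphs` is a hypothetical helper, and it ignores headers, footnotes and other parts of the document.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Return the non-empty paragraphs of a .docx file as plain strings."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{W_NS}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(node.text or "" for node in paragraph.iter(f"{W_NS}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Each returned paragraph could then be passed to a TTS call one at a time, which keeps individual synthesis inputs short.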