---
license: cc-by-nc-sa-4.0
language:
- en
pipeline_tag: text-to-audio
tags:
- audiocraft
- audiogen
- styletts2
- shift
- sound
- audio-generation
- text-to-speech
- mimic3
---

[![SHIFT Text 2 Speech Tool](https://github.com/audeering/shift/blob/main/assets/shift_banner.png?raw=true)](https://shift-europe.eu/)


# Affective TTS / Soundscape

An extension of the [SHIFT TTS tool](https://github.com/audeering/shift) with [AudioGen soundscapes](https://huggingface.co/dkounadis/artificial-styletts2/discussions/3)
  - [Analysis of emotions of TTS](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)
  - [Listen to foreign languages as well](https://huggingface.co/dkounadis/artificial-styletts2/discussions/4)

## Available Voices

<a href="https://audeering.github.io/shift/">Native English voices!</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1#6783e3b00e7d90facec060c6">Non-native English accents!</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/blob/main/Utils/all_langs.csv">Foreign languages</a>

The [TTS demo](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/demo.py) saves its output as `demo.wav`.

## API

<details>
<summary>
Build virtualenv & start API
</summary>

Clone

```
git clone https://huggingface.co/dkounadis/artificial-styletts2
```
Install

```
virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd artificial-styletts2/
pip install -r requirements.txt
```

Start the Flask API (e.g. in a `tmux` session)

```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python api.py
```

# Inference via API

The following examples require `api.py` to be running. [Set this IP](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/tts.py#L85) to the IP shown when starting `api.py`.
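
Before running the examples, it can help to confirm that the API is reachable. The snippet below is a minimal sketch and not part of this repository: the helper name `api_is_reachable` is hypothetical, and the default port `5000` is only Flask's usual default, so use the IP and port that `api.py` actually prints on startup.

```python
import socket


def api_is_reachable(host: str, port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The port default (5000) is an assumption; pass the port shown by api.py.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `api_is_reachable("127.0.0.1")` checks a server running on the same machine.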


</details>


## Landscape 2 Soundscape

The following requires `api.py` to be already running in a tmux session.

```
# TTS & soundscape - overlay to .mp4
python landscape2soundscape.py
```
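
To illustrate what "overlay" means here, the sketch below mixes speech samples over a looped, attenuated background. It is an assumption-laden illustration, not the code used by `landscape2soundscape.py`: the function name `overlay` and the `background_gain` parameter are hypothetical, and both inputs are assumed to be float samples in [-1.0, 1.0] at the same sample rate.

```python
def overlay(speech, background, background_gain=0.3):
    """Mix speech over a looped, attenuated background (illustrative only).

    Both inputs are sequences of floats in [-1.0, 1.0] at the same sample rate.
    The result has the same length as `speech`.
    """
    if not background:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        # Loop the background if it is shorter than the speech.
        bg = background[i % len(background)] * background_gain
        # Clip the sum so it stays inside the valid sample range.
        mixed.append(max(-1.0, min(1.0, sample + bg)))
    return mixed
```

A lower `background_gain` keeps the soundscape quieter than the voice.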

Made for the SHIFT demo in collaboration with [SMB](https://www.smb.museum/home/). YouTube videos:


[![01](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____01_Schick_AII840_001.jpg)](https://youtu.be/SSi3gUO4GtY)

[![02](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____02_Constable_AI555_001.jpg)](https://youtu.be/2YjxAPkdXIc)

[![03](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____03_Schinkel_WS200-002.jpg)](https://youtu.be/BhMh02knkco)



[![05](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____05_Blechen_FV40_001.jpg)](https://youtu.be/a3qk9S87v60)

[![06](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____06_Menzel_AI900_001.jpg)](https://youtu.be/3M0y9OYzDfU)

[![07](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____07_Courbet_AI967_001.jpg)](https://youtu.be/OBY666_By1k)

[![08](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____08_Monet_AI1013_001.jpg)](https://youtu.be/gnGCYLcdLsA)

[![10](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____10_Boecklin_967648_NG2-80_001_rsz.jpg)](https://www.youtube.com/watch?v=Y8QyYUgLaCg)

[![11](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____11_Liebermann_NG4-94_001.jpg)](https://youtu.be/XDDzxDSrhb0)

[![12](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____12_Slevogt_AII1022_001.jpg)](https://youtu.be/I3YYKiUzHpA)




# SoundScape Live (iterative) Demo - Paplay

Special Flask API for playing sounds live

```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python live_api.py
```

Client: describe any sound in words and it is played back to you.

```
python live_demo.py  # asks for text input & plays the soundscape
```
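
Playback through `paplay` (the PulseAudio command-line player) can be sketched as follows. This is a hedged illustration and not taken from `live_demo.py`: the helper names `paplay_command` and `play` are hypothetical, and the sketch assumes the sound has already been written to a `.wav` file.

```python
import shutil
import subprocess


def paplay_command(wav_path: str) -> list[str]:
    """Build the argument list for playing a file with paplay."""
    return ["paplay", wav_path]


def play(wav_path: str) -> bool:
    """Play wav_path with paplay; return False if paplay is not installed."""
    if shutil.which("paplay") is None:
        return False
    # check=True raises CalledProcessError if paplay exits with an error.
    subprocess.run(paplay_command(wav_path), check=True)
    return True
```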

# SoundScape (basic) Demo

```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python demo.py
```


# Audiobook

Creates an audiobook from a `.docx` file. Listen to sample results on YouTube: [male voice](https://www.youtube.com/watch?v=5-cpf7u18JE) / [female voice](https://www.youtube.com/watch?v=pzrLYCaWD2A)

```
# generated audiobook will be saved in ./tts_audiobooks
python audiobook.py
```
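
As background on the input format: a `.docx` file is a ZIP archive whose main text lives in `word/document.xml`. The sketch below extracts the paragraphs with the standard library only. It is an illustration under that assumption, not the method used by `audiobook.py`, and the helper name `docx_paragraphs` is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return the non-empty paragraphs of a .docx as a list of strings.

    `source` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(t.text or "" for t in para.iter(W_NS + "t")).strip()
        if text:
            paragraphs.append(text)
    return paragraphs
```

Each returned paragraph could then be sent to the TTS one at a time.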