Create README.md
README.md
ADDED
@@ -0,0 +1,41 @@
---
title: ai-audio-books
emoji: 📚👨‍💻🎧
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false
---

### Action items
- [ ] move speaker split to new pipeline
- [ ] env template
- [ ] move from AI/ML API to LangChain
- [ ] bugfix w/ 11labs API
- [ ] async synthesis
- [ ] map characters to voices
- [ ] emotion enrichment: add intonation markers, auto-set TTS params
- [x] generate good enough sound effects for background
- [ ] mix effects with narration
- [x] allow file upload (.txt)
- optimizations
  - [ ] combine sequential phrases of the same character into a single phrase
  - [ ] support large texts. use batching. problem: how to ensure the same characters across batches?
    can detect characters in the first prompt, then split the text in each batch into character phrases
  - [ ] probably split large phrases into smaller ones

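Two of the optimization items above (merging consecutive phrases of one character, and batching large texts) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Phrase` class, its field names, both function names, and the paragraph-based batching rule are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Phrase:
    # Hypothetical structure: one chunk of text attributed to one character.
    character: str
    text: str


def merge_consecutive(phrases: list[Phrase]) -> list[Phrase]:
    """Join adjacent phrases spoken by the same character into one phrase."""
    merged = []
    # groupby only groups *adjacent* items, which is exactly what is needed here.
    for character, group in groupby(phrases, key=lambda p: p.character):
        merged.append(Phrase(character, " ".join(p.text for p in group)))
    return merged


def split_into_batches(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole paragraphs into batches of at most max_chars.

    A single paragraph longer than max_chars becomes its own oversized batch.
    """
    batches, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            batches.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

To keep character names stable across batches, one option in line with the note above is to detect the character list once up front and pass it along with every batch prompt; that step depends on the LLM prompt and is not shown here.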
### Backlog
- [ ] prepare text for TTS
  - [x] prepare prompt to split text into character phrases
  - [ ] split large text into batches, process each batch separately, concat batches
  - [ ] try to identify unknown characters
- [ ] select voices for TTS
  - [ ] map characters to available voices
  - [ ] use LLM to recognize characters for a given text and provide descriptions
    detailed enough to select an appropriate voice
- [ ] preprocess text phrases for TTS: add intonation markers, auto-set TTS params
- [ ] run TTS to create narration
- [ ] add effects. mix them with created narration
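The voice-selection and narration steps in the backlog could look something like the sketch below. It is a rough illustration, not the project's implementation: `assign_voices`, `fake_tts`, and `narrate` are hypothetical names, the round-robin voice assignment is a stand-in for LLM-based matching, and `fake_tts` is a placeholder where a real TTS request would go.

```python
import asyncio


def assign_voices(characters: list[str], voices: list[str]) -> dict[str, str]:
    """Give each character a voice, cycling through voices if they run out."""
    if not voices:
        raise ValueError("at least one voice is required")
    return {name: voices[i % len(voices)] for i, name in enumerate(characters)}


async def fake_tts(text: str, voice: str) -> bytes:
    """Placeholder for a real TTS request; returns dummy bytes."""
    await asyncio.sleep(0)  # stands in for network latency
    return f"[{voice}] {text}".encode()


async def narrate(
    phrases: list[tuple[str, str]],
    voice_map: dict[str, str],
    max_concurrency: int = 4,
) -> list[bytes]:
    """Synthesize (character, text) pairs concurrently, keeping input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(character: str, text: str) -> bytes:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await fake_tts(text, voice_map[character])

    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(one(c, t) for c, t in phrases))
```

Usage would be along the lines of `asyncio.run(narrate(phrases, assign_voices(names, voices)))`; the resulting clips could then be concatenated and mixed with effects in a later step.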