--- title: ai-audio-books emoji: 📕👨‍💻🎧 colorFrom: blue colorTo: gray sdk: gradio sdk_version: 4.44.1 app_file: app.py pinned: false --- ### Action items - [ ] move speaker split to new pipeline - [ ] env template - [ ] move from AI/ML api to langchain - [ ] bugfix w/ 11labs api - [ ] async synthesis - [ ] map characters to voices - [] emotion enrichment: add intonation markers, auto-set TTS params - [x] generate good enough sound effects for background - [ ] mix effects with narrration - [x] allow files uplaod (.txt) - optimizations - [ ] combine sequential phrases of same character in single phrase - [ ] support large texts. use batching. problem: how to ensure same characters? can detect characters in first prompt, then split text in each batch into character phrases - [ ] probably split large phrases into smaller ones ### Backlog - [ ] prepare text for TTS - [x] prepare prompt to split text into character phrases - [ ] split large text in batches, process each batch separatelly, concat batches - [ ] try to identify unknown characters - [ ] select voices for TTS - [ ] map characters to available voices - [ ] use LLM to recognize characters for a given text and provide descriptions detailed enough to select appropriate voice - [ ] preprocess text phrases for TTS: add intonation markers, auto-set TTS params - [ ] run TTS to create narration - [ ] add effects. mix them with created narration