heiscold's Collections
TTS, VC
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
Paper
•
2402.07383
•
Published
•
13
Matcha-TTS: A fast TTS architecture with conditional flow matching
Paper
•
2309.03199
•
Published
•
11
Natural language guidance of high-fidelity text-to-speech with synthetic annotations
Paper
•
2402.01912
•
Published
•
11
Fast Timing-Conditioned Latent Audio Diffusion
Paper
•
2402.04825
•
Published
•
7
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper
•
2404.14700
•
Published
•
29
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Paper
•
2406.02430
•
Published
•
31
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
Paper
•
2406.02897
•
Published
•
13
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Paper
•
2406.05370
•
Published
•
15
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS
Paper
•
2406.18009
•
Published
•
20
Towards Robust Speech Representation Learning for Thousands of Languages
Paper
•
2407.00837
•
Published
•
10
Autoregressive Speech Synthesis without Vector Quantization
Paper
•
2407.08551
•
Published
•
14
Paper
•
2407.14358
•
Published
•
24
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Paper
•
2407.14329
•
Published
•
4
Paper
•
2407.15595
•
Published
•
13
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis
Paper
•
2407.09732
•
Published
•
8
Qwen2-Audio Technical Report
Paper
•
2407.10759
•
Published
•
55
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Paper
•
2308.06873
•
Published
•
25
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
Paper
•
2408.04708
•
Published
•
6
Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos
Paper
•
2408.10998
•
Published
•
8
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
Paper
•
2408.08019
•
Published
•
10
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Paper
•
2408.07547
•
Published
•
7
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
Paper
•
2408.14608
•
Published
•
7
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Paper
•
2407.02687
•
Published
•
22
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
Paper
•
2408.16725
•
Published
•
52