Official Hugging Face Diffusers Implementation of QA-MDT

QA-MDT: Quality-Aware Diffusion for Text-to-Music 🎶

QA-MDT brings a new approach to text-to-music generation: it uses quality-aware training to tackle low-fidelity audio and weak labeling in music datasets.

Built on a masked diffusion transformer (MDT), QA-MDT delivers state-of-the-art results on MusicCaps and Song-Describer, improving both audio quality and musicality.

It is based on the QA-MDT paper from the University of Science and Technology of China, authored by @changli et al.

Usage:

!git lfs install
!git clone https://huggingface.co/jadechoghari/openmusic qa_mdt

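The clone command names the local folder qa_mdt instead of openmusic, which is the package name the import below expects. The leading ! runs each command from a notebook cell; drop it in a regular shell.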

pip install -r qa_mdt/requirements.txt
pip install xformers==0.0.26.post1
pip install -q pytorch_lightning==2.1.3 torchlibrosa==0.0.9 librosa==0.9.2 ftfy==6.1.1 braceexpand
pip install torch==2.3.0+cu121 torchvision==0.18.0+cu121 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121

Then run the pipeline from Python:

from qa_mdt.pipeline import MOSDiffusionPipeline

pipe = MOSDiffusionPipeline()
pipe("A modern synthesizer creating futuristic soundscapes.")
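
Because the install step pins exact versions, a quick way to debug an import or load error is to compare the pins against what is actually installed. The following is a minimal sketch, not part of the QA-MDT code: `PINNED`, `installed_version`, and `check_pins` are hypothetical names introduced here, the version strings are copied from the install commands above, and the distribution name `pytorch-lightning` is an assumption about how the package is registered.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above (hypothetical helper table).
# Local build tags such as "+cu121" are ignored when comparing.
PINNED = {
    "xformers": "0.0.26.post1",
    "torchlibrosa": "0.0.9",
    "librosa": "0.9.2",
    "pytorch-lightning": "2.1.3",
    "ftfy": "6.1.1",
    "torch": "2.3.0",
    "torchvision": "0.18.0",
    "torchaudio": "2.3.0",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Strip a local build tag ("2.3.0+cu121" -> "2.3.0") before comparing.
        base = found.split("+", 1)[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package matches. Any printed line points at a package worth reinstalling with the version given above.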

Enjoy the music!! 🎶
