Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation. A node for ComfyUI.
[2025-03-21]: Code refactored for much faster generation. 4 minutes 45 seconds of music is generated in less than 20 seconds, and 1 minute 35 seconds of music in less than 7 seconds. Added more tunable parameters for more creative freedom. Model unloading is optional.
[2025-03-16]: Released version v2.0.0. Supports full-length music generation; 4 minutes of music takes only 62 seconds.
Download the model cfm_full_model.pt and place it in the ComfyUI\models\TTS\DiffRhythm folder, then download config.json and put it in the same folder.
[2025-03-13]: Released version v1.0.0.
cd ComfyUI/custom_nodes
git clone https://github.com/billwuhao/ComfyUI_DiffRhythm.git
cd ComfyUI_DiffRhythm
pip install -r requirements.txt
# For the portable build, use the embedded Python (python_embeded):
./python_embeded/python.exe -m pip install -r requirements.txt
Models will be automatically downloaded to the ComfyUI\models\TTS\DiffRhythm folder.
The structure is as follows:
Manual Download Addresses:
https://huggingface.co/ASLP-lab/DiffRhythm-base/blob/main/cfm_model.pt
https://huggingface.co/ASLP-lab/DiffRhythm-vae/blob/main/vae_model.pt
https://huggingface.co/OpenMuQ/MuQ-MuLan-large/tree/main
https://huggingface.co/OpenMuQ/MuQ-large-msd-iter/tree/main
https://huggingface.co/FacebookAI/xlm-roberta-base/tree/main
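For scripted manual downloads, each address above can be split into a repository ID and, for single-file links, a filename. The sketch below does only that parsing. `parse_hf_url` is a hypothetical helper name, not part of this node, and where each file belongs inside the model folder is not covered here, so the target directory is left to you.

```python
from urllib.parse import urlparse

# The manual download addresses listed above.
URLS = [
    "https://huggingface.co/ASLP-lab/DiffRhythm-base/blob/main/cfm_model.pt",
    "https://huggingface.co/ASLP-lab/DiffRhythm-vae/blob/main/vae_model.pt",
    "https://huggingface.co/OpenMuQ/MuQ-MuLan-large/tree/main",
    "https://huggingface.co/OpenMuQ/MuQ-large-msd-iter/tree/main",
    "https://huggingface.co/FacebookAI/xlm-roberta-base/tree/main",
]


def parse_hf_url(url):
    """Return (repo_id, filename) for a Hugging Face URL.

    filename is None for "tree" links, which point at a whole repository.
    (Hypothetical helper for illustration only.)
    """
    # Path layout: <org>/<repo>/<blob|tree>/<revision>/<path...>
    parts = urlparse(url).path.strip("/").split("/")
    repo_id = "/".join(parts[:2])
    if len(parts) > 4 and parts[2] == "blob":
        return repo_id, "/".join(parts[4:])
    return repo_id, None


if __name__ == "__main__":
    for url in URLS:
        print(parse_hf_url(url))
```

Assuming the `huggingface_hub` package is installed, a `(repo_id, filename)` pair can be passed to `hf_hub_download(repo_id=..., filename=..., local_dir=...)`, and a repository-only entry to `snapshot_download(repo_id=..., local_dir=...)`.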
On Windows, download and install the latest version of espeak-ng. Then add the environment variable PHONEMIZER_ESPEAK_LIBRARY to your system. Its value should be the path to the libespeak-ng.dll file in your espeak-ng installation, for example: C:\Program Files\eSpeak NG\libespeak-ng.dll
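To confirm the variable is visible to Python and points at a real file, a small check like the following may help. It is a minimal sketch: `check_espeak_library` is a hypothetical helper, not part of this node, and it only inspects the environment; it does not load the library.

```python
import os

ENV_VAR = "PHONEMIZER_ESPEAK_LIBRARY"


def check_espeak_library(env=None):
    """Return (ok, message) describing whether ENV_VAR points at an existing file.

    Hypothetical helper for illustration; pass a dict as `env` to test it
    without touching the real environment.
    """
    env = os.environ if env is None else env
    path = env.get(ENV_VAR)
    if not path:
        return False, f"{ENV_VAR} is not set"
    if not os.path.isfile(path):
        return False, f"{ENV_VAR} points to a missing file: {path}"
    return True, f"{ENV_VAR} -> {path}"


if __name__ == "__main__":
    ok, message = check_espeak_library()
    print("OK" if ok else "Problem", "-", message)
```

Run it from a newly opened terminal, since a terminal that was already open when the variable was added may not see the new value.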
On Linux, install the espeak-ng package. Execute the following command to install:
apt-get -qq -y install espeak-ng
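After installing, you may want to confirm that the espeak-ng executable is on your PATH. The snippet below is a minimal sketch using only the Python standard library. `find_executable` is a hypothetical helper name, and finding the executable does not by itself guarantee that the node can load the espeak-ng shared library.

```python
import shutil


def find_executable(names=("espeak-ng",)):
    """Return the full path of the first name found on PATH, or None.

    Hypothetical helper for illustration only.
    """
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_executable()
    print(found if found else "espeak-ng was not found on PATH")
```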
macOS should also be supported, but this has not been tested.
Enjoy the music! 🎶
Thanks to the DiffRhythm team for their excellent work. It is currently the strongest open-source music/song generation model.