ComfyUI Extension: ComfyUI_DiffRhythm_MW
Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation. A node for ComfyUI.
Custom Nodes (6)
DiffRhythm Node for ComfyUI
Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation.
📣 Update
[2025-04-26]: The MuQ model download is now a manual step.
[2025-03-21]: Code refactored for much faster generation: 4 minutes 45 seconds of music is generated in less than 20 seconds, and 1 minute 35 seconds of music in less than 7 seconds. Added more tunable parameters for more creative freedom. Added optional model unloading.
[2025-03-16]: Released version v2.0.0. Supports full-length music generation; 4 minutes of music takes only 62 seconds.
Download the model and place it in the ComfyUI\models\TTS\DiffRhythm folder:
- DiffRhythm-full: rename the model to cfm_full_model.pt, and also download config.json and place it in the same folder.
[2025-03-13]: Released version v1.0.0.
- All parameters are optional; you can generate random music without providing any parameters.
Installation
cd ComfyUI/custom_nodes
git clone https://github.com/billwuhao/ComfyUI_DiffRhythm.git
cd ComfyUI_DiffRhythm
pip install -r requirements.txt
# If ComfyUI uses the bundled python_embeded interpreter, install with it instead:
./python_embeded/python.exe -m pip install -r requirements.txt
Model Download
The models need to be manually downloaded to the ComfyUI\models\TTS\DiffRhythm folder.
The structure is as follows:
.
│  cfm_full_model.pt
│  cfm_model.pt
│  config.json
│  vae_model.pt
│
├─MuQ-large-msd-iter
│      config.json
│      model.safetensors
│
├─MuQ-MuLan-large
│      config.json
│      pytorch_model.bin
│
└─xlm-roberta-base
        config.json
        model.safetensors
        sentencepiece.bpe.model
        tokenizer.json
        tokenizer_config.json
https://huggingface.co/ASLP-lab/DiffRhythm-full/tree/main (cfm_full_model.pt and config.json)
https://huggingface.co/ASLP-lab/DiffRhythm-base/blob/main/cfm_model.pt
https://huggingface.co/ASLP-lab/DiffRhythm-vae/blob/main/vae_model.pt
https://huggingface.co/OpenMuQ/MuQ-MuLan-large/tree/main
https://huggingface.co/OpenMuQ/MuQ-large-msd-iter/tree/main (.safetensors version: https://huggingface.co/OpenMuQ/MuQ-large-msd-iter/blob/refs%2Fpr%2F1/model.safetensors)
https://huggingface.co/FacebookAI/xlm-roberta-base/tree/main
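Because the files come from several repositories, it is easy to miss one. The following is a minimal sketch of a layout check, not part of this extension: the file list is copied from the tree above, while the names `EXPECTED_FILES` and `find_missing_files` and the default folder path are illustrative assumptions. It treats every file in the tree as required, which may be stricter than the node itself.

```python
from pathlib import Path

# Expected files, relative to the DiffRhythm model folder.
# Copied from the directory tree in this README.
EXPECTED_FILES = [
    "cfm_full_model.pt",
    "cfm_model.pt",
    "config.json",
    "vae_model.pt",
    "MuQ-large-msd-iter/config.json",
    "MuQ-large-msd-iter/model.safetensors",
    "MuQ-MuLan-large/config.json",
    "MuQ-MuLan-large/pytorch_model.bin",
    "xlm-roberta-base/config.json",
    "xlm-roberta-base/model.safetensors",
    "xlm-roberta-base/sentencepiece.bpe.model",
    "xlm-roberta-base/tokenizer.json",
    "xlm-roberta-base/tokenizer_config.json",
]


def find_missing_files(model_dir):
    """Return the expected relative paths that are not files under model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumed location; adjust to where your ComfyUI installation lives.
    missing = find_missing_files(Path("ComfyUI") / "models" / "TTS" / "DiffRhythm")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected model files are present.")
```

Run it from the directory that contains your ComfyUI folder, or pass a different path to `find_missing_files`.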
Environment Configuration
- Configure the following on Windows systems:
Download and install the latest version of espeak-ng.
Add the environment variable PHONEMIZER_ESPEAK_LIBRARY to your system. Its value should be the path to the libespeak-ng.dll file in your espeak-ng installation, for example: C:\Program Files\eSpeak NG\libespeak-ng.dll
- On Linux systems, install the espeak-ng package with the following command:
apt-get -qq -y install espeak-ng
It should support Mac, but has not been tested.
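A quick way to confirm the setup before launching ComfyUI is a small preflight check. The sketch below is an illustration only, not part of this extension: `check_espeak` is a hypothetical helper. On Windows it verifies that PHONEMIZER_ESPEAK_LIBRARY is set and points to an existing file; elsewhere it assumes the package puts an `espeak-ng` executable on PATH, which is a heuristic and does not prove the library itself can be loaded.

```python
import os
import platform
import shutil
from pathlib import Path

ENV_VAR = "PHONEMIZER_ESPEAK_LIBRARY"


def check_espeak(system=None, environ=None):
    """Return (ok, message) describing whether espeak-ng looks configured.

    system and environ default to the current platform and os.environ;
    they are parameters only so the check can be exercised in isolation.
    """
    system = system or platform.system()
    environ = os.environ if environ is None else environ

    if system == "Windows":
        lib_path = environ.get(ENV_VAR)
        if not lib_path:
            return False, f"{ENV_VAR} is not set"
        if not Path(lib_path).is_file():
            return False, f"{ENV_VAR} points to a missing file: {lib_path}"
        return True, f"{ENV_VAR} -> {lib_path}"

    # Linux (and, untested, macOS): look for the espeak-ng executable on PATH.
    exe = shutil.which("espeak-ng", path=environ.get("PATH"))
    if exe is None:
        return False, "espeak-ng was not found on PATH"
    return True, f"espeak-ng found at {exe}"


if __name__ == "__main__":
    ok, message = check_espeak()
    print(("OK: " if ok else "Problem: ") + message)
```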
Enjoy the music! 🎶
Acknowledgements
Thanks to the DiffRhythm team for their excellent work. DiffRhythm is currently the strongest open-source music/song generation model.