ComfyUI Extension: ComfyUI_SparkTTS

Authored by billwuhao

Created 5 months ago

Updated 3 months ago

46 stars

Using Spark-TTS in ComfyUI. Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

README

Spark-TTS ComfyUI Node

Use Spark-TTS in ComfyUI. Spark-TTS is an efficient LLM-based text-to-speech model that can clone voices across languages.

Updates

[2025-03-21] ⚒️: Refactored the code, made model unloading optional, and improved generation speed. Added more tunable parameters and support for cross-lingual voice cloning.

[2025-03-07] ⚒️: Released v1.0.0. The new recording node, MW Audio Recorder for Spark, records audio from a microphone and shows the recording progress in a progress bar.

Installation

cd ComfyUI/custom_nodes
git clone https://github.com/billwuhao/ComfyUI_SparkTTS.git
cd ComfyUI_SparkTTS
pip install -r requirements.txt

# If you use the portable build, install with its embedded Python (python_embeded) instead:
./python_embeded/python.exe -m pip install -r requirements.txt

Model Download

Download the following model to the ComfyUI\models\TTS folder:

Spark-TTS-0.5B

Move the Step-Audio-speakers folder from this repository to the ComfyUI\models\TTS folder.

The structure should look like this:

ComfyUI\models\TTS
├── Spark-TTS-0.5B
└── Step-Audio-speakers

Note: If you have already installed ComfyUI_StepAudioTTS, there’s no need to move it, as they share audio and configuration files.
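
The layout described above can be checked with a short script before starting ComfyUI. This is only a minimal sketch: the two folder names come from the structure shown above, while the function name missing_tts_dirs and its comfyui_root parameter are hypothetical and not part of this extension.

```python
from pathlib import Path

# Folder names taken from the layout described in this README.
# Everything else here (function name, parameter) is an illustrative assumption.
REQUIRED_DIRS = ("Spark-TTS-0.5B", "Step-Audio-speakers")


def missing_tts_dirs(comfyui_root):
    """Return the required folders that are absent under <root>/models/TTS."""
    tts_dir = Path(comfyui_root) / "models" / "TTS"
    # Keep the order of REQUIRED_DIRS so the report is stable.
    return [name for name in REQUIRED_DIRS if not (tts_dir / name).is_dir()]
```

Calling missing_tts_dirs("ComfyUI") from the directory that contains your ComfyUI folder returns an empty list when both folders are in place, and otherwise the names that still need to be downloaded or moved.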

You can then add or customize speakers in the ComfyUI\models\TTS\Step-Audio-speakers folder. Make sure the speaker names in the configuration match exactly.

Acknowledgments

Spark-TTS