ComfyUI Extension: ComfyUI_SparkTTS

Authored by billwuhao

37 stars

Using Spark-TTS in ComfyUI. Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Custom Nodes (0)

    README

    Spark-TTS ComfyUI Node

    Using Spark-TTS in ComfyUI. Spark-TTS is an efficient LLM-based text-to-speech model that can clone voices across languages.

    Updates

    [2025-03-21] ⚒️: Refactored code, optional model unloading, faster generation speed. Added more tunable parameters. Supports cross-lingual voice cloning.

    [2025-03-07] ⚒️: Released version v1.0.0. The new recording node, MW Audio Recorder for Spark, records audio from a microphone and shows the recording progress in a progress bar.

    Installation

    cd ComfyUI/custom_nodes
    git clone https://github.com/billwuhao/ComfyUI_SparkTTS.git
    cd ComfyUI_SparkTTS
    pip install -r requirements.txt
    
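    # If you use the portable build (python_embeded), install with its Python instead; adjust the path to python.exe as needed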
    ./python_embeded/python.exe -m pip install -r requirements.txt
    

    Model Download

    Download the following models to the ComfyUI\models\TTS folder.

    Spark-TTS-0.5B

    Move the Step-Audio-speakers folder from this repository to the ComfyUI\models\TTS folder.

    The structure should look like this:

    ComfyUI\models\TTS
    ├── Spark-TTS-0.5B
    ├── Step-Audio-speakers
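
The layout above can be checked before starting ComfyUI. The following is a minimal sketch, not part of this extension: the helper name `missing_tts_folders` and the `"ComfyUI"` root path are assumptions made for illustration, and only the two folder names come from the structure shown above.

```python
from pathlib import Path

# Folder names taken from the layout shown above.
REQUIRED_FOLDERS = ("Spark-TTS-0.5B", "Step-Audio-speakers")


def missing_tts_folders(comfyui_root):
    """Return the required folders that are absent from <comfyui_root>/models/TTS."""
    tts_dir = Path(comfyui_root) / "models" / "TTS"
    return [name for name in REQUIRED_FOLDERS if not (tts_dir / name).is_dir()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed path; point it at your own installation.
    missing = missing_tts_folders("ComfyUI")
    if missing:
        print("Missing under models/TTS:", ", ".join(missing))
    else:
        print("Model folders look complete.")
```

The check only confirms that the two folders exist; it does not verify the files inside them.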
    

    Note: If you have already installed ComfyUI_StepAudioTTS, there is no need to move the folder, because the two extensions share the same audio and configuration files.

    You can then add your own speakers under the ComfyUI\models\TTS\Step-Audio-speakers folder. Ensure that the speaker name configuration matches exactly.

    Acknowledgments

    Spark-TTS