ComfyUI Extension: ComfyUI-Zonos

Authored by BuffMcBigHuge

34 stars

TTS with Zyphra Zonos

Custom Nodes (0)

    README

    ComfyUI-Zonos

    ComfyUI node to make text to speech audio with your own voices.

    Currently only tested on Windows.

    Installation

    • git clone https://github.com/BuffMcBigHuge/ComfyUI-Zonos into custom_nodes, or install it from ComfyUI-Manager, then run:
    cd custom_nodes/ComfyUI-Zonos
    git submodule update --init --recursive
    pip install -r requirements.txt
    git clone https://github.com/Zyphra/Zonos.git
    

    You will need to install eSpeak NG on your machine.

    • Windows: Install eSpeak NG, then set PHONEMIZER_ESPEAK_LIBRARY=C:\Program Files\eSpeak NG\libespeak-ng.dll in your environment variables and restart your terminal.
    • Linux: Install via sudo apt install -y espeak-ng.
    • You may also need to install the CUDA Toolkit to use the GPU, and Visual Studio Build Tools to compile.
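If phonemization fails on Windows, a quick check of the environment variable can help narrow things down. The snippet below is a minimal sketch, not part of this extension: the helper name check_espeak_library is hypothetical, and it only verifies that PHONEMIZER_ESPEAK_LIBRARY is set and points to an existing file. It does not try to load the DLL.

```python
import os
from pathlib import Path


def check_espeak_library(env=None):
    """Return (ok, message) for the PHONEMIZER_ESPEAK_LIBRARY setting.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("PHONEMIZER_ESPEAK_LIBRARY")
    if not value:
        return False, "PHONEMIZER_ESPEAK_LIBRARY is not set"
    if not Path(value).is_file():
        return False, f"PHONEMIZER_ESPEAK_LIBRARY points to a missing file: {value}"
    return True, f"eSpeak NG library found at {value}"


if __name__ == "__main__":
    ok, message = check_espeak_library()
    print(message)
```

Run it from the same terminal you use to start ComfyUI, since a variable set after the terminal was opened will not be visible until you restart it.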

    How To Use

    • Drop a short (5-10s), clear .wav recording of the voice you'd like to use into ComfyUI/input.
    • Add a .txt file with the same name containing a transcript of what was said.
    • Tap "R" in ComfyUI to refresh the node list.
    • Add the ZonosGenerate node and queue a prompt (see the Example Workflow).
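Before queuing a prompt, it can be useful to confirm that each reference clip has a matching transcript and falls in the suggested length range. The following is a rough sketch under a few assumptions: the function name check_reference_voices is hypothetical, the 5-10 second bounds are treated as a soft guideline taken from the steps above, and Python's built-in wave module is used, which only reads uncompressed PCM .wav files.

```python
import wave
from pathlib import Path


def check_reference_voices(input_dir, min_seconds=5.0, max_seconds=10.0):
    """Return a list of (wav_name, duration_seconds, problems) tuples.

    Hypothetical helper: scans input_dir for .wav files, looks for a
    same-named .txt transcript, and measures each clip's duration.
    """
    results = []
    for wav_path in sorted(Path(input_dir).glob("*.wav")):
        problems = []

        # The transcript must share the clip's base name, e.g. voice.wav + voice.txt.
        txt_path = wav_path.with_suffix(".txt")
        if not txt_path.is_file():
            problems.append(f"missing transcript {txt_path.name}")
        elif not txt_path.read_text(encoding="utf-8").strip():
            problems.append(f"transcript {txt_path.name} is empty")

        # Duration = frame count / sample rate (PCM .wav only).
        duration = None
        try:
            with wave.open(str(wav_path), "rb") as wav:
                duration = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            problems.append("could not read as PCM .wav")

        if duration is not None and not (min_seconds <= duration <= max_seconds):
            problems.append(
                f"duration {duration:.1f}s outside {min_seconds:g}-{max_seconds:g}s"
            )

        results.append((wav_path.name, duration, problems))
    return results


if __name__ == "__main__":
    for name, duration, problems in check_reference_voices("ComfyUI/input"):
        status = "ok" if not problems else "; ".join(problems)
        print(f"{name}: {status}")
```

A clip reported as outside the range may still work; the check only flags files that differ from the recommendation.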

    Current Issues

    • Untested on Mac/Linux.
    • Model loading isn't handled by ComfyUI's native model management, so your mileage may vary.
    • Compiling doesn't work yet; will update when fixed.

    Special Thanks

    • Zyphra for the Zonos model.
    • niknah for the F5-TTS node.
    • sdbds for the Zonos-for-windows gradio_interface.py.