ComfyUI Extension: ComfyUI Zonos TTS Node

Authored by BahaC

Created 5 months ago

Updated 5 months ago

20 stars

A ComfyUI custom node that brings Zonos Text-to-Speech capabilities to your workflows, featuring high-quality speech synthesis and voice cloning.

Custom Nodes (1)

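To install, clone the repository into ComfyUI's custom_nodes directory and install its dependencies:

cd ComfyUI/custom_nodes/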
git clone https://github.com/BahaC/ComfyUI-ZonosTTS.git

cd ComfyUI-ZonosTTS
pip install -r requirements.txt

The node provides a simple interface for text-to-speech conversion with advanced options:

text: Input text to synthesize (String)
language: Language code selection (en-us, ja-jp)
model_name: Choice of model architecture:
- Zyphra/Zonos-v0.1-transformer: Faster, lighter model
- Zyphra/Zonos-v0.1-hybrid: Higher quality (requires additional dependencies)
audio_file: Reference audio for voice cloning (optional)
cfg_scale: Classifier-free guidance scale; higher values make the output follow the text and reference voice more closely (1.0 - 10.0)
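
As a rough illustration of how the parameters above could be declared, here is a minimal sketch using ComfyUI's standard custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`). The class name, the default values, the socket type used for `audio_file`, and the return type are all assumptions made for illustration; the extension's real declaration may differ.

```python
class ZonosTTSSketch:
    """Hypothetical node class; illustrative only, not the extension's code."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Text to synthesize.
                "text": ("STRING", {"multiline": True, "default": ""}),
                # A list as the first tuple element renders as a dropdown.
                "language": (["en-us", "ja-jp"],),
                "model_name": (
                    [
                        "Zyphra/Zonos-v0.1-transformer",
                        "Zyphra/Zonos-v0.1-hybrid",
                    ],
                ),
                # Range taken from the parameter list; the default is assumed.
                "cfg_scale": (
                    "FLOAT",
                    {"default": 2.0, "min": 1.0, "max": 10.0, "step": 0.1},
                ),
            },
            "optional": {
                # Assumed here to be a file path; the real node may use
                # a different input type for the reference audio.
                "audio_file": ("STRING", {"default": ""}),
            },
        }

    RETURN_TYPES = ("STRING",)  # assumed: path of the generated file
    FUNCTION = "generate"
    CATEGORY = "audio"

    def generate(self, text, language, model_name, cfg_scale, audio_file=""):
        # Placeholder: the actual synthesis logic lives in the extension.
        raise NotImplementedError("sketch only")
```

Declaring `audio_file` under `optional` mirrors the description above: the node can run without a reference voice, and voice cloning is only used when one is supplied.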

Models are automatically downloaded and cached in:

/workspace/ComfyUI/models/TTS/Zonos/
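
If the automatic download is not an option, the models can be fetched ahead of time. The sketch below uses `snapshot_download` from the `huggingface_hub` package to copy a model repository into the directory above. The one-subfolder-per-model layout and the helper names `local_dir_for` and `download_model` are assumptions; the node may expect a different folder structure, so check what it creates on a successful automatic download before relying on this.

```python
from pathlib import Path

# Cache directory stated above; adjust it to your own ComfyUI location.
MODELS_DIR = Path("/workspace/ComfyUI/models/TTS/Zonos")


def local_dir_for(model_name: str, base: Path = MODELS_DIR) -> Path:
    """Return the folder for one model.

    Assumed layout: one subfolder per model, named after the last
    part of the repository id (for example "Zonos-v0.1-transformer").
    """
    return base / model_name.split("/")[-1]


def download_model(model_name: str, base: Path = MODELS_DIR) -> Path:
    """Download every file of the model repository into its folder."""
    # Imported lazily so the path helper works without the package.
    # Requires: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    target = local_dir_for(model_name, base)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=model_name, local_dir=str(target))
    return target
```

Calling `download_model("Zyphra/Zonos-v0.1-transformer")` would then place the files under the `Zonos` models directory, assuming network access and enough disk space.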

The node implements smart model caching, reusing models that are already available rather than fetching them again.
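
The extension's caching logic is not described in detail here, so the following is only a sketch of one common approach: keep the most recently used model in memory and reload only when a different model is requested. The `ModelCache` class and its `loader` argument are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict


class ModelCache:
    """Keep a single loaded model in memory, keyed by model name."""

    def __init__(self, loader: Callable[[str], Any]):
        # loader is any function that turns a model name into a model object.
        self._loader = loader
        self._models: Dict[str, Any] = {}

    def get(self, model_name: str) -> Any:
        if model_name not in self._models:
            # Drop the previous model first so that only one model
            # occupies memory at a time.
            self._models.clear()
            self._models[model_name] = self._loader(model_name)
        return self._models[model_name]
```

With a cache like this, repeated runs with the same `model_name` skip the expensive load step, while switching between the transformer and hybrid models triggers a single reload.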

Basic text-to-speech workflow:

[Text Input] -> [Zonos TTS] -> [Audio Output]

Voice cloning workflow, with a reference audio file connected to the node:

[Text Input] -> [Zonos TTS] <- [Audio File]

Generated audio files are saved with unique timestamps:

output/zonos_YYYYMMDD-HHMMSS_UUID.wav
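
A file name in this pattern can be built with the standard library alone. The sketch below is an assumption about how such names might be produced, not the extension's code; in particular, the length of the UUID part is not stated above, so the 8-character suffix is a guess, and `make_output_path` is a hypothetical helper name.

```python
import uuid
from datetime import datetime
from pathlib import Path


def make_output_path(output_dir: str = "output", now: datetime = None) -> Path:
    """Build a path of the form output/zonos_YYYYMMDD-HHMMSS_UUID.wav."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # Assumed: a shortened random UUID; the real suffix length may differ.
    unique = uuid.uuid4().hex[:8]
    return Path(output_dir) / f"zonos_{stamp}_{unique}.wav"
```

The timestamp keeps files sortable by creation time, and the random suffix avoids collisions when two files are generated within the same second.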

Transformer Model
- Faster inference
- Lower resource requirements
- Good for most use cases
Hybrid Model
- Higher quality output
- Requires additional dependencies
- More resource intensive

Model Download Fails
- Check your internet connection
- Ensure you have sufficient disk space
- Try manually downloading to the models directory
Voice Cloning Issues
- Ensure reference audio is clean and contains only speech
- Use WAV format for reference audio
- Keep reference audio under 30 seconds
CUDA Out of Memory
- Try using the transformer model instead of hybrid
- Reduce batch size or audio length
- Free up GPU memory from other applications
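
For the voice cloning tips above, a reference clip that is already in WAV format can be shortened with Python's built-in `wave` module. This is a minimal sketch: `trim_wav` is a hypothetical helper, it handles uncompressed PCM WAV files only, and it does not clean up noise or convert other formats.

```python
import wave


def trim_wav(src: str, dst: str, max_seconds: float = 30.0) -> float:
    """Copy at most max_seconds of audio from src to dst.

    Returns the duration of the written file in seconds.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        max_frames = int(max_seconds * reader.getframerate())
        # Read only as many frames as fit within the time limit.
        frames = reader.readframes(min(reader.getnframes(), max_frames))

    with wave.open(dst, "wb") as writer:
        # Reuse channels, sample width, and sample rate from the source;
        # the frame count in the header is corrected when the file closes.
        writer.setparams(params)
        writer.writeframes(frames)

    with wave.open(dst, "rb") as check:
        return check.getnframes() / check.getframerate()
```

Passing a slightly smaller `max_seconds`, such as 25, leaves some headroom below the 30-second guideline.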

This project is licensed under the terms of the LICENSE file included in the repository.