ComfyUI Extension: ComfyUI-KokoroTTS

Authored by benjiyaya

Created 6 months ago

Updated 5 months ago

57 stars

A Text To Speech node using Kokoro TTS in ComfyUI.

Custom Nodes (1)

Kokoro TextToSpeech

README

Kokoro TextToSpeech Node for ComfyUI

A custom node for ComfyUI that provides Text-to-Speech capabilities using the Kokoro TTS engine.

The basic TTS

TTS with LatentSync for Lipsync

https://github.com/user-attachments/assets/55a1cd18-ec8f-4127-8ee6-62c68e493b30

Example Result.

Features

High-quality text-to-speech synthesis
Multiple voice options
Support for multilingual text
Easy integration with ComfyUI workflows

Installation

Clone this repository into your ComfyUI custom nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/benjiyaya/ComfyUI-KokoroTTS

Download required model files:
- Create a folder named Kokorotts under ComfyUI/models
- Go to https://huggingface.co/thewh1teagle/Kokoro/tree/main
- Download the model file 'kokoro-v0_19.onnx'
- Download the voices file 'voices.json'
- Place both files in the ComfyUI/models/Kokorotts directory

Install required Python packages (run from inside the ComfyUI-KokoroTTS folder):

pip install -r requirements.txt

or, if you are using the Windows portable version, go to the 'ComfyUI_windows_portable' folder and run:

python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-KokoroTTS\requirements.txt
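
To confirm that the model files ended up in the right place, a small check like the following may help. This is only a sketch: the function name `missing_model_files` is made up for this example, and it assumes the standard layout described in the steps above (a `models/Kokorotts` folder under the ComfyUI root).

```python
from pathlib import Path

# File names taken from the download steps above.
REQUIRED_FILES = ("kokoro-v0_19.onnx", "voices.json")


def missing_model_files(comfyui_root):
    """Return the required Kokoro files that are absent from models/Kokorotts.

    `comfyui_root` is the path to the ComfyUI folder (an assumption of this
    sketch; adjust it for portable or custom installs).
    """
    model_dir = Path(comfyui_root) / "models" / "Kokorotts"
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("ComfyUI")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All Kokoro model files found.")
```

Run it from the folder that contains ComfyUI, or pass a different root path to `missing_model_files`.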

Available Voices

The following voices are available:

af (American Female)
af_sarah (American Female Sarah)
af_bella (American Female Bella)
af_nicole (American Female Nicole)
af_sky (American Female Sky)
am_adam (American Male Adam)
am_michael (American Male Michael)
bf_emma (British Female Emma)
bf_isabella (British Female Isabella)
bm_george (British Male George)
bm_lewis (British Male Lewis)
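
The voice IDs in this list appear to follow a pattern: the first letter is the accent (a = American, b = British), the second is the gender (f = Female, m = Male), and an optional name follows the underscore. The sketch below expands an ID into its description. The pattern is inferred from the list above rather than documented anywhere, and `describe_voice` is a hypothetical helper, not part of the node.

```python
# Prefix letters inferred from the voice list above (an assumption).
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id):
    """Expand a voice ID such as 'bm_george' into 'British Male George'."""
    prefix, _, name = voice_id.partition("_")
    if len(prefix) != 2 or prefix[0] not in ACCENTS or prefix[1] not in GENDERS:
        raise ValueError(f"Unrecognized voice ID: {voice_id!r}")
    parts = [ACCENTS[prefix[0]], GENDERS[prefix[1]]]
    if name:
        parts.append(name.capitalize())
    return " ".join(parts)


print(describe_voice("af_sarah"))   # American Female Sarah
print(describe_voice("bm_george"))  # British Male George
print(describe_voice("af"))         # American Female
```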

Usage

In ComfyUI, locate the "Kokoro TextToSpeech" node under the "kokoro" category
Connect the node to your workflow
Input your text and select a voice
The node will output an audio waveform that can be used with other audio nodes

Input Parameters

text: The text you want to convert to speech (supports multiline text)
speaker: The voice to use for speech synthesis (default: af_sarah)

Output

audio: Audio data in the format expected by ComfyUI audio nodes
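
For readers curious how these parameters map onto a ComfyUI node declaration, here is a minimal sketch using the usual ComfyUI custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`). It is not this extension's actual source: the class name `KokoroTTSSketch` and the method name `generate` are assumptions, and the synthesis step is left out entirely.

```python
# Voice IDs copied from the "Available Voices" list.
VOICES = [
    "af", "af_sarah", "af_bella", "af_nicole", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
]


class KokoroTTSSketch:  # hypothetical class name, for illustration only
    CATEGORY = "kokoro"          # category the node is listed under
    RETURN_TYPES = ("AUDIO",)    # ComfyUI's audio socket type
    RETURN_NAMES = ("audio",)
    FUNCTION = "generate"        # name of the method ComfyUI calls

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Multiline text box for the text to speak.
                "text": ("STRING", {"multiline": True, "default": ""}),
                # Dropdown of voices, defaulting to af_sarah.
                "speaker": (VOICES, {"default": "af_sarah"}),
            }
        }

    def generate(self, text, speaker):
        # The real node would run Kokoro here and return the audio;
        # that part is intentionally omitted from this sketch.
        raise NotImplementedError("synthesis is outside the scope of this sketch")
```

In recent ComfyUI versions an AUDIO value is typically a dict holding a `waveform` tensor and a `sample_rate`, so that is presumably what the real node returns.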

Error Handling

The node includes comprehensive error handling for common issues:

Missing model or voice files
Invalid text input
TTS generation failures

Error messages will be logged with detailed information to help troubleshoot any issues.
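
As an illustration of the kind of checks listed above, the sketch below validates the text and speaker before any synthesis would run, and logs a message on failure. It is an assumption about how such checks could look, not the node's actual code; `validate_request` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("kokoro_tts_example")  # hypothetical logger name

# Voice IDs copied from the "Available Voices" list.
KNOWN_VOICES = {
    "af", "af_sarah", "af_bella", "af_nicole", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
}


def validate_request(text, speaker="af_sarah"):
    """Return the stripped text, or raise ValueError with a clear message."""
    if not isinstance(text, str) or not text.strip():
        logger.error("Invalid text input: expected non-empty text, got %r", text)
        raise ValueError("text must be a non-empty string")
    if speaker not in KNOWN_VOICES:
        logger.error("Unknown speaker %r", speaker)
        raise ValueError(f"unknown speaker: {speaker}")
    return text.strip()
```

Failing early with a specific message keeps problems such as an empty text box from surfacing later as a harder-to-read TTS generation error.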

License

kokoro-onnx: MIT
kokoro model: Apache 2.0

Credits

Kokoro TTS Engine: [Include credits for the original Kokoro TTS project]
ComfyUI: https://github.com/comfyanonymous/ComfyUI
ComfyUI-BS_Kokoro-onnx: https://github.com/Burgstall-labs/ComfyUI-BS_Kokoro-onnx