Custom ComfyUI nodes for TTS with Kokoro. Generate speech and merge speakers to create new voice styles.
<a href="https://www.buymeacoffee.com/stavsapq" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" height="40" width="174"></a>
<img src="https://img.shields.io/badge/v1.0-green.svg?style=for-the-badge&labelColor=gray&label=Kokoro&color=blue" alt=""/> <img src="https://img.shields.io/badge/0.4.2-green.svg?style=for-the-badge&labelColor=gray&label=Kokoro-onnx&color=blue" alt=""/>
Kokoro TTS nodes wrapping kokoro-onnx, which is based on hexgrad/Kokoro-82M.
Note: This picture is also a workflow; just download it and drop it into ComfyUI.
Install via ComfyUI Manager (author: stavsap).
Or clone the repo into the custom_nodes folder:
git clone https://github.com/stavsap/comfyui-kokoro.git
Then cd into comfyui-kokoro and install the requirements:
pip install -r requirements.txt
Finally, restart ComfyUI.
The ONNX model and speaker metadata are downloaded automatically on the first run.
If you are using the Windows portable version and experience dependency issues, check the following:
Currently, there are 3 nodes that can be combined into a TTS workflow.
Select supported speakers.
Combiner node: combines 2 given speakers into a new speaker.
The weight applies to speaker_a. Example: weight == 0.7 results in a strength of 70% of speaker_a and 30% of speaker_b.
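The weighting described above can be sketched in a few lines of Python. This is only an illustration, not the node's actual code: it assumes each speaker is a plain vector of floats and that the two are blended linearly, and the function name combine_speakers is hypothetical.

```python
from typing import List, Sequence


def combine_speakers(
    speaker_a: Sequence[float],
    speaker_b: Sequence[float],
    weight: float,
) -> List[float]:
    """Blend two speaker vectors (hypothetical helper, assumed linear blend).

    `weight` is the share of speaker_a; speaker_b receives `1 - weight`.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if len(speaker_a) != len(speaker_b):
        raise ValueError("speaker vectors must have the same length")
    # Element-wise weighted sum: weight * a + (1 - weight) * b
    return [weight * a + (1.0 - weight) * b for a, b in zip(speaker_a, speaker_b)]


# weight == 0.7 gives 70% of speaker_a and 30% of speaker_b
mixed = combine_speakers([1.0, 0.0], [0.0, 1.0], 0.7)
```

With the toy vectors above, the first component of the result carries speaker_a's 70% share and the second carries speaker_b's 30% share.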
All supported voices can be found here.
TTS: Text To Speech, generate voice from text.
Lip Sync: sync the lips in videos.