ComfyUI Extension: ComfyUI-Orpheus

Authored by numz

Created 4 months ago

Updated 4 months ago

8 stars

TTS with emotional speech capabilities in 8 languages, with 24 speakers.

README

ComfyUI-Orpheus Node

Two custom nodes for ComfyUI that enable text-to-speech generation using the GGUF Orpheus model, with emotional speech capabilities.

Features

High-quality text-to-speech synthesis
Multiple voice options (24 different voices, depending on the language used) in English, French, Spanish, Italian, Chinese, Korean, German, and Hindi
Emotional speech capabilities
Seamless integration with ComfyUI workflow

Available Voices

English Voices:

supported tags : chuckle, cough, gasp, groan, laugh, sigh, sniffle, yawn
- tara - Female voice
- leah - Female voice
- jess - Female voice
- leo - Male voice
- dan - Male voice
- mia - Female voice
- zac - Male voice
- zoe - Female voice

French Voices:

supported tags : chuckle, cough, gasp, groan, laugh, sigh, sniffle, whimper, yawn
- pierre - Male voice
- amelie - Female voice
- marie - Female voice (doesn't work well)

German Voices:

supported tags : chuckle, cough, gasp, groan, laugh, sigh, sniffle, yawn
- jana - Female voice
- thomas - Male voice
- max - Male voice

Korean Voices:

supported tags : 한숨, 헐, 헛기침, 훌쩍, 하품, 낄낄, 신음, 작은 웃음, 기침, 으르렁
- 유나 - gender not specified
- 준서 - gender not specified

Chinese Voices:

supported tags : 嬉笑, 轻笑, 呻吟, 大笑, 咳嗽, 抽鼻子, 咳
- 长乐 - gender not specified
- 白芷 - gender not specified

Hindi Voices:

supported tags : unknown
- ऋतिका - gender not specified

Spanish Voices:

supported tags : groan, chuckle, gasp, resoplido, laugh, yawn, cough
- javi - Male voice
- sergio - Male voice
- maria - Female voice

Italian Voices:

supported tags : sigh, laugh, cough, sniffle, groan, yawn, gemito, gasp
- pietro - Male voice
- giulia - Female voice
- carlo - Male voice
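
Because the supported tags differ by language, it can help to check a prompt before sending it to the node. The sketch below copies the tag lists above into a lookup table and reports any tag the chosen language does not list. The names SUPPORTED_TAGS and unsupported_tags are hypothetical and not part of the extension, and the sketch assumes every language uses the same <tag> syntax that the English examples show. Hindi is omitted because its tags are unknown.

```python
import re

# Tag lists copied from the per-language sections above.
# Hindi is left out because its supported tags are not documented.
SUPPORTED_TAGS = {
    "english": {"chuckle", "cough", "gasp", "groan", "laugh", "sigh", "sniffle", "yawn"},
    "french": {"chuckle", "cough", "gasp", "groan", "laugh", "sigh", "sniffle", "whimper", "yawn"},
    "german": {"chuckle", "cough", "gasp", "groan", "laugh", "sigh", "sniffle", "yawn"},
    "korean": {"한숨", "헐", "헛기침", "훌쩍", "하품", "낄낄", "신음", "작은 웃음", "기침", "으르렁"},
    "chinese": {"嬉笑", "轻笑", "呻吟", "大笑", "咳嗽", "抽鼻子", "咳"},
    "spanish": {"groan", "chuckle", "gasp", "resoplido", "laugh", "yawn", "cough"},
    "italian": {"sigh", "laugh", "cough", "sniffle", "groan", "yawn", "gemito", "gasp"},
}

# Assumption: tags are written as <tag> in every language.
TAG_PATTERN = re.compile(r"<([^<>]+)>")


def unsupported_tags(prompt, language):
    """Return the tags in `prompt` that are not listed for `language`."""
    allowed = SUPPORTED_TAGS[language]
    found = TAG_PATTERN.findall(prompt)
    return [tag for tag in found if tag.strip().lower() not in allowed]


if __name__ == "__main__":
    # "whimper" is listed for French only, so it is flagged for English.
    print(unsupported_tags("Hello <laugh> there <whimper>", "english"))
```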

Requirements

Latest ComfyUI version with Python 3.12.9 (it may work with older versions, but I haven't tested them)

Installation

Clone this repository into your ComfyUI custom nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/your-repo/ComfyUI-Orpheus.git

Install the required dependencies:

Activate your venv, then run:

pip install -r ComfyUI-Orpheus/requirements.txt

Or, if you use python_embeded:

python_embeded\python.exe -m pip install -r ComfyUI-Orpheus/requirements.txt

Download the required GGUF model from FreddyAboulton's Hugging Face page and place it in your ComfyUI models directory under models/unet/.
<img src="doc/models.png" width="100%">
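
A quick way to confirm the model landed in the right place is to list the .gguf files under models/unet/. The sketch below is only an illustration: find_gguf_models and the comfyui_root argument are hypothetical names, not part of the extension, and the node itself may discover models differently.

```python
from pathlib import Path


def find_gguf_models(comfyui_root):
    """Return the sorted names of .gguf files under <comfyui_root>/models/unet."""
    unet_dir = Path(comfyui_root) / "models" / "unet"
    if not unet_dir.is_dir():
        # Missing directory: nothing has been installed there yet.
        return []
    return sorted(
        path.name
        for path in unet_dir.iterdir()
        if path.is_file() and path.suffix.lower() == ".gguf"
    )


if __name__ == "__main__":
    # Assumes the script is run from the folder that contains ComfyUI.
    print(find_gguf_models("ComfyUI"))
```

An empty list means either the directory does not exist or no .gguf file was found in it.
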
GPU Support

On Windows, the default installation of llama-cpp-python does not include NVIDIA GPU support. To enable it, locate the folder that contains nvcc.exe and run:

set CMAKE_ARGS="-DGGML_CUDA=on"
set CUDA_CXX="YOUR_CUDA_DIR\v12.x.x\bin\nvcc.exe"
python_embeded\python.exe -m pip install llama-cpp-python[server] --upgrade --force-reinstall --no-cache-dir

Be patient, the build can take a while...
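
Once the reinstall finishes, a small check can hint at whether the new build was compiled with GPU offload. This is a sketch, not part of the extension: gpu_offload_available is a hypothetical helper, and it assumes the installed llama-cpp-python exposes llama_supports_gpu_offload in its low-level bindings. It returns None when the package or that function cannot be found.

```python
import importlib


def gpu_offload_available():
    """Return True/False if llama_cpp reports GPU offload support, else None."""
    try:
        llama_cpp = importlib.import_module("llama_cpp")
    except (ImportError, OSError, RuntimeError):
        # Package missing, or its native library failed to load.
        return None
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        # This version of the bindings does not expose the check.
        return None
    return bool(check())


if __name__ == "__main__":
    print("GPU offload:", gpu_offload_available())
```

Run it with the same interpreter ComfyUI uses (for example python_embeded\python.exe), otherwise it inspects a different installation.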

Usage

In ComfyUI, locate the "Orpheus ⛓️" node in the node menu.
Configure the node parameters:
- model_name: Select your GGUF model
- voice: Choose from available voices
- prompt: Enter the text you want to convert to speech. You can add emotive tags: <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>
Connect the node outputs:
- audio: Contains the generated audio waveform and sample rate
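
If you consume the audio output in your own node, the clip length follows from the waveform and sample rate. The sketch below assumes the output follows ComfyUI's usual AUDIO convention: a dict with a "waveform" tensor shaped [batch, channels, samples] and an integer "sample_rate". That layout is an assumption here, and audio_duration_seconds is a hypothetical helper.

```python
def audio_duration_seconds(audio):
    """Return the clip length in seconds for an AUDIO-style dict.

    Assumes audio["waveform"] has a .shape whose last axis is the
    sample count, and audio["sample_rate"] is samples per second.
    """
    sample_rate = audio["sample_rate"]
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    num_samples = audio["waveform"].shape[-1]
    return num_samples / sample_rate
```

The helper only reads .shape, so it works with any array-like object that exposes one.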

Limitations

Maximum text length is determined by MAX_TOKENS
Processing speed depends on GPU capabilities
Requires CUDA support for optimal performance
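
One way to work around the length limit is to split long text at sentence boundaries and generate each piece separately. The sketch below is an illustration under assumptions: split_into_chunks is a hypothetical helper, and max_chars is a rough character budget standing in for MAX_TOKENS, whose actual value and token-to-character ratio are not stated here.

```python
import re


def split_into_chunks(text, max_chars=400):
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk can then be fed to the prompt input in turn. Emotive tags such as <laugh> stay attached to their sentence because the split only happens at whitespace after sentence punctuation.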

Credits

Original Orpheus implementation
Freddy Aboulton