TTS with emotional speech capabilities in 8 languages with 24 speakers.
Two custom nodes for ComfyUI that enable text-to-speech generation using the GGUF Orpheus model with emotional speech capabilities.
<img src="doc/demo.png" width="100%">

English Voices:
Supported tags: chuckle, cough, gasp, groan, laugh, sigh, sniffle, yawn
- tara - Female voice
- leah - Female voice
- jess - Female voice
- leo - Male voice
- dan - Male voice
- mia - Female voice
- zac - Male voice
- zoe - Female voice

French Voices:
Supported tags: chuckle, cough, gasp, groan, laugh, sigh, sniffle, whimper, yawn
- pierre - Male voice
- amelie - Female voice
- marie - Female voice (does not work well)

German Voices:
Supported tags: chuckle, cough, gasp, groan, laugh, sigh, sniffle, yawn
- jana - Female voice
- thomas - Male voice
- max - Male voice

Korean Voices:
Supported tags: 한숨, 헐, 헛기침, 훌쩍, 하품, 낄낄, 신음, 작은 웃음, 기침, 으르렁
- 유나 - ?
- 준서 - ?

Chinese Voices:
Supported tags: 嬉笑, 轻笑, 呻吟, 大笑, 咳嗽, 抽鼻子, 咳
- 长乐 - ?
- 白芷 - ?

Hindi Voices:
Supported tags: unknown
- ऋतिका - ?

Spanish Voices:
Supported tags: groan, chuckle, gasp, resoplido, laugh, yawn, cough
- javi - Male voice
- sergio - Male voice
- maria - Female voice

Italian Voices:
Supported tags: sigh, laugh, cough, sniffle, groan, yawn, gemito, gasp
- pietro - Male voice
- giulia - Female voice
- carlo - Male voice

cd ComfyUI/custom_nodes
git clone https://github.com/your-repo/ComfyUI-Orpheus.git
Activate your virtual environment, then run:
pip install -r ComfyUI-Orpheus/requirements.txt
Or, if you use the bundled python_embeded interpreter:
python_embeded\python.exe -m pip install -r ComfyUI-Orpheus/requirements.txt
Download the required GGUF model from FreddyAboulton's Hugging Face page and place it in your ComfyUI models directory, under models/unet/.
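
To confirm the model file landed where the node will look for it, a small check like the one below can help. This is only a sketch: the function name `list_gguf_models` and the `ComfyUI` root path are illustrative assumptions, not part of this repository.

```python
from pathlib import Path


def list_gguf_models(comfyui_root):
    """Return the sorted names of .gguf files under <comfyui_root>/models/unet."""
    model_dir = Path(comfyui_root) / "models" / "unet"
    if not model_dir.is_dir():
        # The folder does not exist yet, so there is nothing to list.
        return []
    return sorted(p.name for p in model_dir.glob("*.gguf"))


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust it to your setup.
    print(list_gguf_models("ComfyUI"))
```

An empty list means no GGUF file was found in models/unet/, in which case the model_name selector in the node would likely have nothing to offer.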
GPU Support
On Windows, the default installation of llama-cpp-python does not include NVIDIA GPU support. To enable it, locate the folder that contains nvcc.exe and run:
set CMAKE_ARGS="-DGGML_CUDA=on"
set CUDACXX="YOUR_CUDA_DIR\v12.x.x\bin\nvcc.exe"
python_embeded\python.exe -m pip install llama-cpp-python[server] --upgrade --force-reinstall --no-cache-dir
Be patient, the build can take a while...
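
Once the rebuild finishes, it may be worth checking whether the installed llama-cpp-python build reports GPU offload support. The sketch below assumes the package exposes `llama_supports_gpu_offload` (recent releases do, older ones may not); the wrapper name `gpu_offload_status` is hypothetical.

```python
def gpu_offload_status():
    """Return True or False if llama-cpp-python reports GPU offload support,
    or None if the package (or the check itself) is unavailable."""
    try:
        import llama_cpp
    except (ImportError, OSError, RuntimeError):
        # Package missing, or its native library failed to load.
        return None
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        # Assumption: older builds may not expose this function.
        return None
    return bool(check())


if __name__ == "__main__":
    print(gpu_offload_status())
```

Run it with the same interpreter that ComfyUI uses (for example python_embeded\python.exe), since a different environment may hold a different build.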
In ComfyUI, locate the "Orpheus ⛓️" node in the node menu.
Configure the node parameters:
- model_name: Select your GGUF model
- voice: Choose from available voices
- prompt: Enter the text you want to convert to speech. You can add emotive tags: <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>

Connect the node outputs:
- audio: Contains the generated audio waveform and sample rate
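
Because the supported tags differ between languages, it can be useful to check a prompt before sending it to the node. The following is a minimal sketch, not part of the nodes themselves: `SUPPORTED_TAGS` and `unsupported_tags` are hypothetical names, and only the English and French tag lists from this README are copied in.

```python
import re

# Tag lists copied from the voice sections of this README (English and French only).
SUPPORTED_TAGS = {
    "english": {"chuckle", "cough", "gasp", "groan", "laugh", "sigh", "sniffle", "yawn"},
    "french": {"chuckle", "cough", "gasp", "groan", "laugh", "sigh", "sniffle",
               "whimper", "yawn"},
}

# Matches anything written as <tag> inside the prompt text.
TAG_PATTERN = re.compile(r"<([^<>]+)>")


def unsupported_tags(prompt, language):
    """Return the tags used in the prompt that are not listed for the language,
    in order of appearance."""
    allowed = SUPPORTED_TAGS[language]
    return [tag for tag in TAG_PATTERN.findall(prompt) if tag.strip() not in allowed]


prompt = "Well, that was unexpected <gasp> but I will manage <whimper>."
print(unsupported_tags(prompt, "english"))  # ['whimper']
print(unsupported_tags(prompt, "french"))   # []
```

Here `<whimper>` is flagged for an English voice because it only appears in the French tag list above.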