ComfyUI Extension: ComfyUI-Text2Speech

Authored by GeekatplayStudio

Created about a month ago

Updated about a month ago

A ComfyUI custom node for text-to-speech integration with a local TTS server.

GeekatPlay Studio

High-quality text-to-speech integration for ComfyUI workflows using Microsoft Edge TTS.

- High-Quality Neural Voices: Powered by Microsoft Edge TTS with natural-sounding voices
- 13 Voice Options: Multiple English variants (US, GB, AU, CA, IN) with male and female voices
- Flexible Input: Direct text input or load from a text file
- Customizable Output: Choose an output directory or use the ComfyUI default
- Adjustable Parameters: Control speech rate (50-400) and volume (0.0-1.0)
- Server Health Check: Built-in status node to verify server connectivity
- Fallback Support: Automatic fallback to pyttsx3 if Edge TTS is unavailable

Clone or copy this folder into your ComfyUI custom_nodes directory.
Install dependencies:
```
pip install -r requirements.txt
```
Start the TTS server:
```
python tts_server.py
```
The server will run on http://127.0.0.1:5002
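
Before wiring the node into a workflow, it can help to confirm that something is listening on that address. The sketch below only tests whether a TCP connection to 127.0.0.1:5002 succeeds; it does not call any endpoint of the server, and `is_port_open` is a hypothetical helper name, not part of this extension.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 5002, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("TTS server reachable:", is_port_open())
```

A `False` result usually means `tts_server.py` is not running or is bound to a different port.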

HttpTTSToAudio (Category: geekatplay/TTS)

Converts text to a speech audio file.
Required inputs:
- text: Text to convert to speech (multiline)
- language: Language code (default: "en")
- server_url: TTS server endpoint (default: http://127.0.0.1:5002/tts)
Optional inputs:
- text_file_path: Load text from file (file picker)
- output_directory: Save location (directory picker)
- voice: Voice selection dropdown (13 Edge TTS voices)
- rate: Speech rate 50-400 (default: 180)
- volume: Volume 0.0-1.0 (default: 1.0)
- timeout_seconds: Request timeout 60-3600 seconds (default: 300)
- auto_timeout: Auto-adjust timeout based on text length (default: True)
Output: audio_path (STRING) - Path to generated WAV file
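
When these values are set from a script rather than the node UI, it may be useful to keep them inside the documented ranges first. The following is a minimal sketch under that assumption: `TTSParams` and `clamp` are hypothetical names, and the README does not say whether the node itself clamps or rejects out-of-range values.

```python
from dataclasses import dataclass


def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class TTSParams:
    # Defaults and ranges are the ones listed for HttpTTSToAudio above.
    rate: int = 180             # 50-400
    volume: float = 1.0         # 0.0-1.0
    timeout_seconds: int = 300  # 60-3600

    def normalized(self) -> "TTSParams":
        """Return a copy with every field pulled into its documented range."""
        return TTSParams(
            rate=int(clamp(self.rate, 50, 400)),
            volume=float(clamp(self.volume, 0.0, 1.0)),
            timeout_seconds=int(clamp(self.timeout_seconds, 60, 3600)),
        )


if __name__ == "__main__":
    print(TTSParams(rate=500, volume=1.5, timeout_seconds=10).normalized())
```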

TTSServerStatus (Category: geekatplay/TTS)

Status node that verifies connectivity to the TTS server.

Primary Engine: Microsoft Edge TTS (requires internet for high-quality neural voices)
Fallback Engine: pyttsx3 (offline, uses system voices)
Output Format: WAV audio files
Server: Flask on localhost:5002
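
The primary/fallback arrangement can be pictured as trying each engine in order and stopping at the first one that succeeds. This is a generic sketch of that pattern, not the code in `tts_server.py`: `synthesize_with_fallback` is a hypothetical name, and the two engines are stand-in functions rather than real Edge TTS or pyttsx3 calls.

```python
from typing import Callable, Sequence, Tuple

# An engine is a (name, function) pair; the function takes (text, out_path).
Engine = Tuple[str, Callable[[str, str], None]]


def synthesize_with_fallback(text: str, out_path: str, engines: Sequence[Engine]) -> str:
    """Try each engine in order and return the name of the first that succeeds."""
    errors = []
    for name, synth in engines:
        try:
            synth(text, out_path)
            return name
        except Exception as exc:  # any failure moves on to the next engine
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))


if __name__ == "__main__":
    def edge_stub(text, out_path):
        raise ConnectionError("no internet")  # simulates Edge TTS being unavailable

    def offline_stub(text, out_path):
        pass  # a real engine would write the WAV file here

    engines = [("edge-tts", edge_stub), ("pyttsx3", offline_stub)]
    print(synthesize_with_fallback("Hello", "hello.wav", engines))
```

Listing the online engine first preserves voice quality when a connection is available, while the offline engine keeps the workflow usable without one.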

Server not running:

No audio generated:

Voice not working:

Timeout on long files:

- Enable auto_timeout (default: True) to automatically scale the timeout based on text length.
- Manually increase timeout_seconds (up to 3600) for very large scripts.
- For extremely long texts (>10k words), consider splitting them into smaller chunks.
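
One way to split a long script is to group whole sentences until a word limit is reached, so each request stays short. The sketch below is an illustration only: `split_into_chunks` is a hypothetical helper, and the default of 2000 words per chunk is an arbitrary assumption, not a limit documented by this extension.

```python
import re


def split_into_chunks(text: str, max_words: int = 2000) -> list[str]:
    """Split text into chunks of at most `max_words` words, breaking at sentence ends."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    # Split after ., ! or ? followed by whitespace; punctuation stays with its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the limit is cut on word boundaries.
        while len(words) > max_words:
            if current:
                chunks.append(" ".join(current))
                current, count = [], 0
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        # Start a new chunk if this sentence would push the current one over the limit.
        if count + len(words) > max_words and current:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.extend(words)
        count += len(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be sent through HttpTTSToAudio separately and the resulting WAV files joined afterwards.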

GeekatPlay Studio

Tutorials, workflows, and more custom nodes available on the YouTube channels!