ComfyUI Extension: TTS WebUI API nodes for ComfyUI
Nodes: TTS WebUI Kokoro, TTS WebUI Chatterbox, TTS WebUI StyleTTS2, TTS WebUI Kitten TTS, TTS WebUI F5-TTS, TTS WebUI Preset.
Add your API key to the environment variable 'TTS_WEBUI_OPENAI_API_KEY'. Alternatively, you can write your API key to the file 'tts_api_key.txt'. You can also use and/or override the above by entering your API key in the 'api_key' field of each node.
ComfyUI: TTS WebUI (OpenAI-style) nodes
This package provides ComfyUI custom nodes that call a TTS WebUI OpenAI-style HTTP endpoint (POST /v1/audio/speech) and return in-memory AUDIO usable in ComfyUI workflows.

What it does
- Sends text + options to a TTS WebUI endpoint and requests WAV output.
- Converts the returned WAV bytes into a ComfyUI AUDIO object (torch.Tensor waveform + sample rate).
- Optionally saves the WAV to the ComfyUI output/audio folder and returns metadata JSON.
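As a rough illustration of the conversion step, the sketch below decodes 16-bit PCM WAV bytes using only the standard library. The function name `wav_bytes_to_channels` is hypothetical, and it returns nested Python lists rather than the torch.Tensor the nodes produce, so treat it as a picture of the data layout, not the package's actual helper.

```python
import io
import sys
import wave
from array import array


def wav_bytes_to_channels(wav_bytes: bytes, channels_first: bool = True):
    """Decode 16-bit PCM WAV bytes into float samples in [-1.0, 1.0).

    Returns (samples, sample_rate). `samples` is a nested list shaped
    (C, N) when channels_first is True, otherwise (N, C).
    Hypothetical helper: the real package converts to a torch.Tensor.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    pcm = array("h")
    pcm.frombytes(raw)  # interleaved samples, e.g. L R L R ...
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    # One inner list per frame: shape (N, C).
    frames = [
        [pcm[i + c] / 32768.0 for c in range(n_channels)]
        for i in range(0, len(pcm), n_channels)
    ]
    if channels_first:
        # Transpose to one inner list per channel: shape (C, N).
        return [list(channel) for channel in zip(*frames)], sample_rate
    return frames, sample_rate
```

The `channels_first` flag here mirrors the node input of the same name: (C, N) when True, (N, C) otherwise.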
Nodes included
- TTS WebUI via API: class TTSWebUI_OpenAI_TTS (generic, "jack of all trades")
- TTS Kokoro: class TTS_KokoroNode
- TTS Chatterbox: class TTS_ChatterboxNode
- TTS StyleTTS2: class TTS_StyleTTS2Node
- TTS Kitten TTS: class TTS_KittenTTSNode
- TTS F5-TTS: class TTS_F5TTSNode
- TTS Preset: class TTS_GlobalPresetNode
All nodes share a minimal helper (NodeTTSBase) for HTTP and response handling. Each node builds its own JSON payload explicitly to avoid hidden cross-file magic. Common required inputs text and api_base are provided via a small required_inputs() helper on the base class to reduce duplication.
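To make the pattern concrete, here is a minimal sketch of a base class exposing required_inputs() and a model node that builds its own payload. The class names, the `voice` and `speed` defaults, and the numeric ranges are assumptions for illustration; only the ComfyUI INPUT_TYPES dictionary shape and the `text` / `api_base` inputs come from the description above.

```python
class NodeTTSBaseSketch:
    """Hypothetical stand-in for NodeTTSBase (names and defaults assumed)."""

    @staticmethod
    def required_inputs(default_api_base: str = "http://127.0.0.1:7778") -> dict:
        # Inputs shared by every node, declared once.
        return {
            "text": ("STRING", {"multiline": True, "default": ""}),
            "api_base": ("STRING", {"default": default_api_base}),
        }


class KokoroNodeSketch(NodeTTSBaseSketch):
    """Hypothetical model-specific node; not the package's TTS_KokoroNode."""

    @classmethod
    def INPUT_TYPES(cls) -> dict:
        return {
            "required": {
                **cls.required_inputs(),
                "voice": ("STRING", {"default": "random"}),
            },
            "optional": {
                # Range values are assumptions, not taken from the package.
                "speed": ("FLOAT", {"default": 1.0, "min": 0.1, "max": 3.0}),
                "api_key": ("STRING", {"default": ""}),
            },
        }

    def build_payload(self, text: str, voice: str, speed: float = 1.0) -> dict:
        # The payload is spelled out explicitly in the node itself.
        return {
            "model": "kokoro",
            "input": text,
            "voice": voice,
            "speed": speed,
            "response_format": "wav",
        }
```

Keeping the payload construction inside each node means a reader can see exactly what is sent for a given model without following imports across files.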
Common required inputs
- text (STRING, multiline) — text to synthesize
- api_base (STRING) — base URL of your TTS WebUI (default: http://127.0.0.1:7778)
Common optional inputs
- api_key (STRING, password) — API key for the TTS server (if required)
- timeout_sec (INT) — request timeout in seconds
- channels_first (BOOLEAN) — AUDIO layout: (C, N) if True else (N, C)
- also_save_wav (BOOLEAN) — if True, saves a WAV under output/audio and returns the path
- save_prefix (STRING) — filename prefix for saved WAVs
- return_metadata (BOOLEAN) — if True, returns a metadata JSON string
Model-specific nodes add their own inputs (e.g., voice, speed, alpha/beta) and build the corresponding payload under the hood.
Advanced: raw payload override (generic node)
TTSWebUI_OpenAI_TTS has an advanced_request_json input to fully override the default payload. Example body sent to /v1/audio/speech:
    {
      "model": "kokoro",
      "input": "Hello from ComfyUI",
      "voice": "random",
      "speed": 1.0,
      "response_format": "wav",
      "params": {"temperature": 0.7}
    }
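One plausible way the override and the request could fit together is sketched below. The function names are hypothetical, and the sketch uses the standard library's urllib to stay dependency-free, whereas the package itself lists requests as a dependency; the actual node logic may differ.

```python
import json
import urllib.request


def resolve_payload(default_payload: dict, advanced_request_json: str = "") -> dict:
    """Return the override body if one is given, else the default payload."""
    if advanced_request_json and advanced_request_json.strip():
        override = json.loads(advanced_request_json)
        if not isinstance(override, dict):
            raise ValueError("advanced_request_json must be a JSON object")
        return override  # full override: the default payload is not merged in
    return default_payload


def build_speech_request(api_base: str, payload: dict, api_key: str = ""):
    """Build (but do not send) a POST request to /v1/audio/speech."""
    url = api_base.rstrip("/") + "/v1/audio/speech"
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(req, timeout=timeout_sec) as resp:
#       wav_bytes = resp.read()
```

Separating request construction from sending keeps the sketch testable without a running TTS WebUI server.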
Installation / Requirements
- Place this folder inside your ComfyUI custom_nodes directory.
- Ensure the environment has these Python packages installed: requests, numpy, torch.
  - If your ComfyUI environment already has torch installed, no extra action is required.
API key detection
If your TTS server requires an API key, you can either:
- pass it into the api_key input, or
- set the environment variable TTS_WEBUI_OPENAI_API_KEY to your API key (preferred).
For backward compatibility the module will also check SAI_API_KEY. As a local-file fallback the helper also reads tts_api_key.txt or sai_platform_key.txt placed next to the custom node. When a key is found, requests include the header Authorization: Bearer <key>.
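The lookup described above can be sketched as follows. The function names and the exact precedence (node input first, then environment variables, then key files) are assumptions inferred from this section, not the package's confirmed implementation.

```python
import os
from pathlib import Path

# Names taken from this README; the search order is an assumption.
ENV_VARS = ("TTS_WEBUI_OPENAI_API_KEY", "SAI_API_KEY")
KEY_FILES = ("tts_api_key.txt", "sai_platform_key.txt")


def resolve_api_key(api_key: str = "", node_dir: str = ".", environ=None):
    """Return the first non-empty key found, or None if there is none."""
    environ = os.environ if environ is None else environ

    # 1. An explicit value in the node's api_key input wins.
    if api_key and api_key.strip():
        return api_key.strip()

    # 2. Environment variables, preferred name first.
    for name in ENV_VARS:
        value = environ.get(name, "").strip()
        if value:
            return value

    # 3. Local-file fallback next to the custom node.
    for filename in KEY_FILES:
        path = Path(node_dir) / filename
        if path.is_file():
            value = path.read_text(encoding="utf-8").strip()
            if value:
                return value
    return None


def auth_headers(key) -> dict:
    """Build the Authorization header only when a key is available."""
    return {"Authorization": f"Bearer {key}"} if key else {}
```

Passing `environ` explicitly is only a convenience for testing; in normal use the function would read `os.environ`.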
Layout
- nodes/: one node per file (generic + per-model)
- nodes/_base.py: NodeTTSBase providing HTTP and finalize helpers and required_inputs()
- utils/: small helpers for audio I/O, paths, filenames
- __init__.py: package exports for ComfyUI
Notes
This repository originally included a Stability API template. The TTS nodes are independent of that code; any legacy modules (tts_base.py, tts_models.py, nodes/_shared.py) are now deprecated and will raise a clear ImportError if imported.
License: see LICENSE