Description:
The OpenAI FM TTS node is a custom node for ComfyUI that integrates the OpenAI FM Text-to-Speech service into your audio workflows. It converts text to speech with a variety of voices and emotional styles directly within ComfyUI. Use it to add expressive voiceovers to your projects, create audio content, or experiment with different vocal performances.
Features:
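- Voice selection: voices are loaded from data/voices.json, allowing you to choose from a variety of available voices.
- Vibes: emotional styles are loaded from data/vibes.json, enabling you to generate speech with different emotional tones to match the context of your project.
- Audio output: the node produces an AUDIO signal that is directly compatible with ComfyUI's audio processing pipeline.
- Saved files: generated audio is also written to the output directory for easy access and later use.
Installation: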
Clone the repository: Open your ComfyUI custom_nodes directory and clone this repository:
cd ComfyUI/custom_nodes
git clone https://github.com/fairy-root/ComfyUI-OpenAI-FM.git
Install dependencies: Navigate to the cloned directory and install the required Python libraries using pip. Choose the appropriate command based on your ComfyUI environment:
a. Standard ComfyUI Python: If using the standard ComfyUI Python environment:
cd ComfyUI/custom_nodes/ComfyUI-OpenAI-FM
pip install -r requirements.txt
b. Embedded ComfyUI Python: If using embedded Python (common in portable ComfyUI installs):
cd ComfyUI/custom_nodes/ComfyUI-OpenAI-FM
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt
Ensure that torch and torchaudio are installed; both are listed in requirements.txt.
Restart ComfyUI: After installation, restart ComfyUI to ensure the node is loaded. You can then find the OpenAI FM TTS node in the audio category within ComfyUI.
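If the node does not appear after a restart, one possible cause is that a dependency was installed into a different Python than the one ComfyUI runs on. The following is a minimal sketch for checking this, using only the standard library; the helper name missing_modules is hypothetical and not part of this repository. Run it with the same interpreter that launches ComfyUI (for portable installs, the python.exe inside python_embeded).

```python
import importlib.util

# Packages named in this README's dependency list.
REQUIRED = ("requests", "torch", "torchaudio")


def missing_modules(names=REQUIRED):
    """Return the module names the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

If anything is reported missing, rerun the pip command from the install step with that same interpreter.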
Usage:
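- Add the node: open the add-node menu in ComfyUI, navigate to audio, and select OpenAI FM TTS.
- Seed: a value of 0 will result in a randomly generated seed for each audio generation, providing variation.
- Voice: choose one of the voices listed in the data/voices.json file.
- Vibe: the options are loaded from data/vibes.json and allow you to modify the emotional tone of the generated voice.
- Output: the node outputs an AUDIO signal. Connect this to other audio nodes for further processing or to a Save Audio node to save the generated speech to a file.
Configuration Files: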
- data/voices.json: This JSON file contains a list of available voices that are loaded into the "voice" dropdown menu when ComfyUI starts. You can customize or extend the voice options by modifying this file.
- data/vibes.json: This file configures the "vibe" dropdown menu and defines the emotional styles that can be applied to the voices. Each vibe setting can adjust various aspects of the voice to achieve different emotional tones.
Dependencies:
- requests (for making API calls)
- torch and torchaudio (for audio processing and tensor operations)
(All dependencies are listed in requirements.txt for easy installation.)
Expected Output:
- ComfyUI audio output: the node provides an AUDIO output signal, which can be directly connected to other ComfyUI audio processing nodes or a Save Audio node. The audio is a tensor in ComfyUI format, [batch, channels, samples], with a sample rate of 44100 Hz.
- Saved audio files: generated audio is also saved to ComfyUI/custom_nodes/ComfyUI-OpenAI-FM/output/. If this directory is not writable, the script will attempt to save to ComfyUI/output/ or the script directory itself. Filenames are prefixed with openaifm_ and include a timestamp for easy identification. The audio files are saved in WAV format.
This project is intended for educational and personal use only. It is not affiliated with, endorsed by, or officially supported by OpenAI. Use of the OpenAI FM API is subject to their terms of service. Reverse engineering was employed to understand the API for the purpose of creating this tool. Please ensure your usage complies with all applicable terms and legal standards.
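The save-location fallback and file naming described under Expected Output could be sketched roughly as follows. This is not the node's actual code: the function names, the timestamp format, and the use of os.access to test writability are assumptions made for illustration. Only the openaifm_ prefix, the WAV extension, and the order of fallback directories come from this README.

```python
import os
import time


def pick_output_dir(candidates):
    """Return the first candidate that exists (or can be created) and is writable."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
        except OSError:
            continue  # cannot create this directory, try the next candidate
        if os.access(path, os.W_OK):
            return path
    raise OSError("no writable output directory found")


def make_filename(prefix="openaifm_", ext=".wav", now=None):
    """Build a timestamped filename; the timestamp layout is an assumption."""
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    return f"{prefix}{stamp}{ext}"
```

A caller would pass the three locations in the order listed above (the node's own output folder, then ComfyUI/output/, then the script directory) and join the chosen directory with the result of make_filename().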
Your support is appreciated:
TGCVbSSJbwL5nyXqMuKY839LJ5q5ygn2uS
13GS1ixn2uQAmFQkte6qA5p1MQtMXre6MT
0xdbc7a7dafbb333773a5866ccf7a74da15ee654cc
Ldb6SDxUMEdYQQfRhSA3zi4dCUtfUdsPou
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please open an issue or submit a pull request for any improvements or features.