ComfyUI Extension: ComfyUI-ToolBox
A collection of utility nodes for ComfyUI, including audio/video processing, file uploads, and AI image generation.
Utility Nodes
Create Image (OpenAI)
The Create Image node allows you to generate images using OpenAI's API, supporting DALL-E and GPT-Image models.
Features:
- Integration with OpenAI's DALL-E and GPT-Image image generation APIs
- Customizable image size, quality, style, and background
- Support for URL or base64 JSON response formats
- Automatic handling of API rate limits and error retries
- Support for various output formats and compression settings
- Transparent background support (with GPT-Image-1 model)
Node Parameters:
Required Parameters:
- api_key: OpenAI API key
- prompt: Text description for image generation (max length varies by model: 32000 chars for GPT-Image-1, 4000 chars for DALL-E-3, 1000 chars for DALL-E-2)
Optional Parameters:
- model: Model to use, one of "dall-e-2", "dall-e-3", or "gpt-image-1" (default: "gpt-image-1")
- size: Image dimensions, with various size options that are validated against the selected model (default: "1024x1024")
- quality: Image quality, options vary by model (default: "auto")
  - GPT-Image-1: "auto", "high", "medium", "low"
  - DALL-E-3: "auto", "standard", "hd"
  - DALL-E-2: "auto", "standard"
- style: Image style, only for DALL-E-3 model, one of "vivid" or "natural" (default: "vivid")
- background: Background setting, only for GPT-Image-1 model, one of "auto", "transparent", or "opaque" (default: "auto")
- moderation: Content moderation level, only for GPT-Image-1 model, one of "auto" or "low" (default: "auto")
- response_format: Return format for DALL-E models, one of "url" or "b64_json" (default: "b64_json")
- output_format: Output format for GPT-Image-1 model, one of "png", "jpeg", or "webp" (default: "png")
- output_compression: Image compression level, only for GPT-Image-1 model with webp or jpeg formats, range 0-100 (default: 100)
- n: Number of images to generate, range 1-10 (DALL-E-3 only supports n=1)
- user: User identifier for tracking and monitoring
Output:
- image: The generated image, returned as an image object in ComfyUI
- b64_json: The base64-encoded image data that can be passed to the Save Image node
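The model-specific limits listed above lend themselves to a small pre-flight check. The following is a minimal sketch, not the node's actual code: the function name `validate_create_params` and its return convention are assumptions made for illustration, while the limits themselves are taken from the parameter list.

```python
# Limits copied from the parameter list above; the function itself is hypothetical.
MAX_PROMPT_CHARS = {"gpt-image-1": 32000, "dall-e-3": 4000, "dall-e-2": 1000}
QUALITY_OPTIONS = {
    "gpt-image-1": {"auto", "high", "medium", "low"},
    "dall-e-3": {"auto", "standard", "hd"},
    "dall-e-2": {"auto", "standard"},
}


def validate_create_params(model, prompt, quality="auto", n=1):
    """Return a list of problems; an empty list means the inputs pass."""
    if model not in MAX_PROMPT_CHARS:
        return [f"unknown model: {model}"]
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS[model]:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS[model]} chars for {model}")
    if quality not in QUALITY_OPTIONS[model]:
        problems.append(f"quality {quality!r} is not valid for {model}")
    if not 1 <= n <= 10:
        problems.append("n must be between 1 and 10")
    elif model == "dall-e-3" and n != 1:
        problems.append("dall-e-3 only supports n=1")
    return problems
```

Running a check like this before sending the request would surface configuration mistakes without spending an API call.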
Save Image (OpenAI)
The Save Image node is designed specifically to handle and save OpenAI's image data.
Features:
- Decodes base64 image data and saves it to a file
- Automatically loads the saved image and outputs it for further processing
- Properly handles image format and EXIF orientation data
- Extracts alpha channel as mask (if available)
- Provides detailed debug logs
- Fully compatible with ComfyUI's Preview Image node
Node Parameters:
- b64_json: Base64-encoded image data from OpenAI
- filename_prefix: Prefix for the saved file name (default: "openai")
- output_dir: Custom output directory (optional, uses ComfyUI's default output directory if empty)
Output:
- filename: Path to the saved image file
- image: The image as a ComfyUI image object (ready for preview or further processing)
- mask: Alpha channel mask (if present in the image, otherwise returns an empty mask)
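The decode-and-save step can be illustrated with the standard library alone. This is a rough sketch under stated assumptions: `save_b64_image` is a hypothetical helper, the extension is guessed from the file's leading bytes, and the image loading, EXIF handling, and mask extraction that the node performs are left out.

```python
import base64
import os

# Leading "magic" bytes used to guess the file extension (an assumption of
# this sketch; the node may determine the format differently).
_SIGNATURES = [(b"\x89PNG\r\n\x1a\n", "png"), (b"\xff\xd8\xff", "jpg")]


def save_b64_image(b64_json, filename_prefix="openai", output_dir="."):
    """Decode base64 image data, write it under output_dir, return the path."""
    data = base64.b64decode(b64_json)
    ext = "bin"  # fallback when the format is not recognized
    for magic, candidate in _SIGNATURES:
        if data.startswith(magic):
            ext = candidate
            break
    # WebP is a RIFF container: "RIFF", four size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        ext = "webp"
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, f"{filename_prefix}.{ext}")
    with open(path, "wb") as handle:
        handle.write(data)
    return path
```

Note that this sketch overwrites an existing file with the same prefix; it does not try to reproduce the node's own naming scheme.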
Edit Image (OpenAI)
The Edit Image node allows you to edit or extend existing images using OpenAI's API, supporting both DALL-E-2 and GPT-Image-1 models.
Features:
- Takes existing images and modifies them based on a text prompt
- Dynamic input interface with adjustable number of image inputs
- Support for up to 4 separate input images when using GPT-Image-1 model
- Optional mask support to specify areas to be edited
- Supports both DALL-E-2 and GPT-Image-1 models
- Handles image format conversion and size requirements automatically
- Automatic retries for API rate limits and errors
- Returns multiple images when requested (n parameter)
Node Parameters:
Required Parameters:
- api_key: OpenAI API key
- inputcount: Number of image inputs to display
- image_1: Primary input image to be edited (as ComfyUI IMAGE type)
- prompt: Text description of the desired modifications (max length: 32000 chars for GPT-Image-1, 1000 chars for DALL-E-2)
Optional Parameters:
- mask: Optional mask to specify which areas to edit (transparent areas in the mask indicate regions to modify)
- model: Model to use, one of "gpt-image-1" or "dall-e-2" (default: "gpt-image-1")
- n: Number of images to generate, range 1-10
- size: Image dimensions with model-specific options:
  - GPT-Image-1: "auto", "1024x1024", "1536x1024" (landscape), "1024x1536" (portrait)
  - DALL-E-2: "256x256", "512x512", "1024x1024"
- quality: Image quality, options vary by model (default: "auto")
- response_format: Return format, one of "url" or "b64_json" (default: "b64_json")
- user: User identifier for tracking and monitoring
Usage:
- Set the desired number of image inputs using the inputcount parameter
- Click "Update inputs" to update the node interface
- Connect your images to the displayed image inputs
- When using DALL-E-2 model, only the first image will be used regardless of inputcount value
- For GPT-Image-1 model, up to 4 images can be used; if inputcount is set higher, only the first 4 will be used
Output:
- image: The edited image(s), returned as an image object in ComfyUI
- b64_json: The base64-encoded image data that can be passed to the Save Image node
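The per-model input limits described under Usage can be expressed as a short selection function. This is a sketch only: `select_input_images` is a hypothetical name, and skipping unconnected (`None`) inputs is an assumption rather than documented behavior.

```python
# Maximum number of input images per model, as stated in the Usage notes.
MAX_IMAGES = {"dall-e-2": 1, "gpt-image-1": 4}


def select_input_images(model, images):
    """Return the inputs that would be used for `model`.

    `images` is the ordered list of image inputs (image_1, image_2, ...).
    Unconnected inputs are represented by None and skipped (an assumption).
    """
    if model not in MAX_IMAGES:
        raise ValueError(f"unsupported model: {model}")
    connected = [img for img in images if img is not None]
    if not connected:
        raise ValueError("at least one input image is required")
    return connected[: MAX_IMAGES[model]]
```

For example, `select_input_images("dall-e-2", ["a", "b"])` keeps only the first image, while the same call with "gpt-image-1" keeps both.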
Video Combine
The Video Combine node merges an audio file with a video file, ideal for use as the final node in a workflow.
Features:
- Takes a video file and an audio file as input
- Combines the audio with the video
- Provides various methods for handling long audio (cut off audio, bounce video, loop video)
- Customizable output filename prefix
- Merged video is saved to ComfyUI's output directory
Node Parameters:
- video_file: Path to the input video file
- audio_file: Path to the input audio file
- filename_prefix: Prefix for the output filename, defaults to "output"
- audio_handling: Method for handling audio longer than the video:
  - cut off audio: Truncate audio to match video duration
  - bounce video: Alternate forward/reverse video playback to match audio length (default)
  - loop video: Repeat video playback to match audio length
Output:
- video_path: Absolute path to the merged video file
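To make the three audio_handling modes concrete, the sketch below computes which video passes would be needed to cover a longer audio track. It is illustrative arithmetic only, not the node's implementation: `plan_video_segments` is a hypothetical name, and it assumes the final pass is trimmed so that the output ends with the audio.

```python
import math


def plan_video_segments(video_seconds, audio_seconds, audio_handling="bounce video"):
    """Return (segments, output_seconds) for the chosen handling mode.

    `segments` lists the playback direction of each pass over the video.
    Assumption: when the audio is not longer than the video, the video is
    played once and its length is left unchanged.
    """
    if video_seconds <= 0 or audio_seconds <= 0:
        raise ValueError("durations must be positive")
    if audio_handling not in ("cut off audio", "bounce video", "loop video"):
        raise ValueError(f"unknown audio_handling: {audio_handling}")
    if audio_seconds <= video_seconds or audio_handling == "cut off audio":
        return ["forward"], video_seconds
    passes = math.ceil(audio_seconds / video_seconds)
    if audio_handling == "loop video":
        segments = ["forward"] * passes
    else:  # "bounce video": alternate forward and reverse passes
        segments = ["forward" if i % 2 == 0 else "reverse" for i in range(passes)]
    return segments, audio_seconds
```

With a 4-second video and 10 seconds of audio, bouncing needs three passes (forward, reverse, forward), looping needs three forward passes, and cutting off the audio keeps the output at 4 seconds.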
AWS S3 Upload
The AWS S3 Upload node uploads local files to Amazon S3 storage service.
Features:
- Support for uploading any local file type to AWS S3 buckets
- Flexible storage path configuration with parent and subdirectory support
- Automatic directory hierarchy handling
- AWS region specification
- Returns the S3 file path after upload
Node Parameters:
- bucket: AWS S3 bucket name
- access_key: AWS access key ID
- secret_key: AWS secret access key
- region: AWS region name (defaults to "us-east-1")
- parent_directory: Parent directory path within the bucket (optional)
- file_path: Absolute path to the local file to upload
- sub_dir_name: Subdirectory name under the parent directory (optional); if set, files will be stored in the subdirectory
Directory Structure Rules:
- If both parent_directory and sub_dir_name have values, files will be stored under bucket/parent_directory/sub_dir_name/
- If only parent_directory has a value, files will be stored under bucket/parent_directory/
- If only sub_dir_name has a value, files will be stored under bucket/sub_dir_name/
- If neither has a value, files will be stored directly in the bucket root
Output:
- s3_file_path: S3 path of the uploaded file (format: s3://bucket/path/to/file)
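The directory structure rules reduce to joining whichever of the two directory values is non-empty. A minimal sketch follows; `build_s3_path` is a hypothetical helper, and stripping stray slashes from the directory values is an assumption about how such input would be normalized.

```python
import posixpath


def build_s3_path(bucket, file_path, parent_directory="", sub_dir_name=""):
    """Return the s3:// path for a local file under the directory rules above."""
    # Keep only the directory values that are non-empty after trimming slashes.
    parts = [
        part.strip("/")
        for part in (parent_directory, sub_dir_name)
        if part and part.strip("/")
    ]
    # Use forward slashes so Windows-style local paths also yield a file name.
    filename = posixpath.basename(file_path.replace("\\", "/"))
    key = "/".join(parts + [filename])
    return f"s3://{bucket}/{key}"
```

Everything after `s3://bucket/` is the object key, which is the value an upload call such as boto3's `upload_file(Filename, Bucket, Key)` would expect.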
Trim Audio To Length
The Trim Audio To Length node trims audio files to a specified duration.
Features:
- Takes an audio file path as input
- Trims the audio to a user-specified duration
- If target duration exceeds original audio length, returns the original audio
- Customizable output filename prefix
- Trimmed audio is saved to ComfyUI's output directory
Node Parameters:
- audio_path: Path to the input audio file
- target_duration: Duration in seconds to trim the audio to (range: 0.1 to 3600.0 seconds)
- filename_prefix: Prefix for the output filename, defaults to "trimmed_audio"
Output:
- file_path: Absolute path to the trimmed audio file
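Trimming of this kind is typically delegated to FFmpeg. The sketch below only builds the argument list for such a call and does not run it; `build_trim_command` is a hypothetical helper, and the choice of stream copy (`-c copy`) instead of re-encoding is an assumption, not a statement about what the node does.

```python
def build_trim_command(audio_path, target_duration, output_path):
    """Build an ffmpeg argument list that keeps the first target_duration seconds."""
    # Range taken from the target_duration parameter description above.
    if not 0.1 <= target_duration <= 3600.0:
        raise ValueError("target_duration must be between 0.1 and 3600.0 seconds")
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", audio_path,     # input file
        "-t", f"{target_duration:.3f}",  # stop writing after this many seconds
        "-c", "copy",         # copy the stream without re-encoding (assumption)
        output_path,
    ]
```

The resulting list could be passed to `subprocess.run(..., check=True)`. When `-t` exceeds the length of the input, FFmpeg simply writes the whole stream, which matches the behavior described for over-long target durations.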
Installation
- Make sure you have ComfyUI installed
- Ensure FFmpeg is installed on your system (for audio/video processing)
- Clone this repository into the custom_nodes directory of ComfyUI:
    cd ComfyUI/custom_nodes
    git clone https://github.com/yourusername/ComfyUI-ToolBox.git
- Install the required Python dependencies:
    cd ComfyUI-ToolBox
    pip install -r requirements.txt
- Restart ComfyUI
- The nodes should now be available in the ComfyUI interface under the "ToolBox" category
Dependencies
The main dependencies for this project are listed in requirements.txt, including:
- boto3 - for AWS S3 file uploads
- FFmpeg - for audio/video processing (system-level installation required)
- requests - for API requests and file downloads
- Other Python libraries - see requirements.txt for details
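A quick way to confirm that these dependencies are present is to look them up without importing or running them. The helper below is a sketch using only the standard library; the name `check_dependencies` and its default arguments are assumptions chosen to match the list above.

```python
import importlib.util
import shutil


def check_dependencies(modules=("boto3", "requests"), executables=("ffmpeg",)):
    """Map each dependency name to True if it can be found, else False."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the program is not on PATH.
        status[name] = shutil.which(name) is not None
    return status
```

Calling `check_dependencies()` from the Python environment that runs ComfyUI shows at a glance which entries still need to be installed.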
Usage Examples
Create Image (OpenAI) Node Usage
- Add the "Create Image (OpenAI)" node to your ComfyUI workflow (under the ToolBox/OpenAI category)
- Enter your OpenAI API key
- Input your desired image description (prompt)
- Select a model (such as dall-e-3 or gpt-image-1)
- Configure the appropriate parameters based on your selected model (size, quality, style, background, etc.)
- Run the workflow to generate the image
- The generated image can be used directly in ComfyUI for further processing, such as editing or applying effects with other nodes
Video Combine Node Usage
- Add the "Video Combine" node to your ComfyUI workflow (under the ToolBox/Video Combine category)
- Set the paths for input video file and audio file
- Set the output filename prefix
- Choose the method for handling audio longer than the video (cut off audio, bounce video, or loop video)
- Run the workflow to get the merged video file path
- The merged video will be saved in the ComfyUI output directory
AWS S3 Upload Node Usage
- Add the "AWS S3 Upload" node to your ComfyUI workflow (under the ToolBox/AWS S3 category)
- Enter your AWS S3 bucket name, access key, and secret key
- Select the correct AWS region (e.g., "us-east-1", "eu-west-1", etc.)
- Set the parent directory path (optional) and subdirectory name (optional) to define the storage structure
- Enter the path of the local file to upload
- Run the workflow to get the S3 file path of the uploaded file
Trim Audio To Length Node Usage
- Add the "Trim Audio To Length" node to your ComfyUI workflow (under the ToolBox/Audio category)
- Set the path for the input audio file
- Specify the target duration in seconds for the trimmed audio
- Set the output filename prefix (optional, defaults to "trimmed_audio")
- Run the workflow to get the trimmed audio file path
- The trimmed audio will be saved in the ComfyUI output directory with an incrementing file number
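An incrementing file number can be derived by scanning the output directory for existing files with the same prefix. The sketch below shows one way to do that; the `prefix_00001.ext` naming pattern and the helper name `next_output_path` are assumptions, since the exact naming scheme is not specified here.

```python
import os
import re


def next_output_path(output_dir, filename_prefix="trimmed_audio", extension="mp3"):
    """Return the next free path of the form <prefix>_<NNNNN>.<extension>."""
    pattern = re.compile(
        rf"^{re.escape(filename_prefix)}_(\d+)\.{re.escape(extension)}$"
    )
    highest = 0
    if os.path.isdir(output_dir):
        for name in os.listdir(output_dir):
            match = pattern.match(name)
            if match:
                highest = max(highest, int(match.group(1)))
    # Zero-padded counter, one higher than the largest number already in use.
    return os.path.join(output_dir, f"{filename_prefix}_{highest + 1:05d}.{extension}")
```

Because the counter is derived from the files already on disk, repeated runs with the same prefix would not overwrite earlier results.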
License
This project is released under the MIT License.