ComfyUI Extension: ComfyUI-JM-Gemini-API
A custom node for ComfyUI that generates images using Google's Gemini API, supporting both text-to-image and image-to-image generation.
Features
- Support for multiple Gemini models:
  - gemini-3-pro-image-preview (default, with 2K resolution)
  - gemini-2.5-flash-image
- Text-to-image generation
- Image-to-image generation (single or multiple input images)
- Image editing mode (single image input)
- Configurable aspect ratios (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
- Resolution control (1K, 2K, 4K) - only for gemini-3-pro-image-preview
- Support for up to 10 input images
- Automatic image saving to ComfyUI output directory
Project Structure
ComfyUI-JM-Gemini-API/
├── __init__.py # Main entry point for ComfyUI
├── nodes/ # Node implementations directory
│ ├── __init__.py # Nodes package initializer
│ └── jm_gemini_node.py # Gemini image generator node
├── requirements.txt # Python dependencies
├── README.md # Documentation
└── .gitignore # Git ignore rules
This modular structure makes it easy to add more Gemini-related nodes in the future. Simply add new node files to the nodes/ directory and import them in nodes/__init__.py.
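As a rough sketch of that pattern, assuming the package follows the usual ComfyUI convention of exporting NODE_CLASS_MAPPINGS and NODE_DISPLAY_NAME_MAPPINGS dictionaries: the class names, the node identifiers, and the merge_mappings helper below are hypothetical placeholders, not code taken from this repository.

```python
# Sketch of how nodes/__init__.py might collect node classes.
# All class names and dictionary keys below are hypothetical placeholders.


class JMGeminiImageGenerator:
    """Stand-in for the class defined in nodes/jm_gemini_node.py."""


class JMGeminiFutureNode:
    """Stand-in for a node added later in a new file under nodes/."""


def merge_mappings(*mappings):
    """Merge per-file mapping dicts, rejecting duplicate node identifiers."""
    merged = {}
    for mapping in mappings:
        for key, value in mapping.items():
            if key in merged:
                raise ValueError(f"duplicate node identifier: {key}")
            merged[key] = value
    return merged


# Each node file would contribute one small dict like these.
EXISTING_NODES = {"JMGeminiImageGenerator": JMGeminiImageGenerator}
NEW_NODES = {"JMGeminiFutureNode": JMGeminiFutureNode}

# ComfyUI looks for these two names when it loads a custom node package.
NODE_CLASS_MAPPINGS = merge_mappings(EXISTING_NODES, NEW_NODES)
NODE_DISPLAY_NAME_MAPPINGS = {
    "JMGeminiImageGenerator": "JM Gemini Image Generator",
    "JMGeminiFutureNode": "JM Gemini Future Node",
}
```

Rejecting duplicate identifiers at merge time is a design choice in this sketch: it surfaces a naming clash when the package loads instead of silently replacing one node with another.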
Installation
- Clone or download this repository to your ComfyUI custom_nodes directory:
cd ComfyUI/custom_nodes
git clone https://github.com/yourusername/ComfyUI-JM-Gemini-API.git
- Install required dependencies:
cd ComfyUI-JM-Gemini-API
pip install -r requirements.txt
- Restart ComfyUI
Requirements
- Python 3.8+
- ComfyUI
- google-genai >= 0.2.0
- Pillow >= 10.0.0
- torch
- torchvision
- numpy
- httpx[socks] (for proxy support)
Proxy Configuration (Optional)
If you need to use a proxy to access the Gemini API (common for users in China), you can set environment variables:
HTTP/HTTPS Proxy
export HTTP_PROXY="http://your-proxy:port"
export HTTPS_PROXY="http://your-proxy:port"
SOCKS5 Proxy
export HTTP_PROXY="socks5://your-proxy:port"
export HTTPS_PROXY="socks5://your-proxy:port"
Or set them inline, for a single run, when starting ComfyUI:
HTTP_PROXY="socks5://127.0.0.1:1080" HTTPS_PROXY="socks5://127.0.0.1:1080" python main.py
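To check which proxy settings a Python process would actually see, a small standard-library sketch such as the following can help. The describe_proxy_env helper is hypothetical and only inspects the variables; it makes no claim about how the node or its HTTP client consumes them.

```python
import os
from urllib.parse import urlsplit


def describe_proxy_env(environ=None):
    """Return {variable: (scheme, host, port)} for the proxy variables that are set."""
    env = os.environ if environ is None else environ
    found = {}
    for name in ("HTTP_PROXY", "HTTPS_PROXY"):
        value = env.get(name)
        if not value:
            continue  # variable is unset or empty
        parts = urlsplit(value)
        found[name] = (parts.scheme, parts.hostname, parts.port)
    return found


# Example with an explicit mapping instead of the real environment.
print(describe_proxy_env({"HTTPS_PROXY": "socks5://127.0.0.1:1080"}))
```

Calling describe_proxy_env() with no argument reads the real environment, so running it from the same shell that starts ComfyUI shows whether the exports took effect.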
Usage
Getting a Gemini API Key
- Visit Google AI Studio
- Sign in with your Google account
- Create a new API key
- Copy the API key for use in the node
Node Parameters
Required Inputs:
- gemini_api_key: Your Gemini API key (string)
- prompt: Text prompt describing the image you want to generate (multiline text)
- model: Choose between:
  - gemini-3-pro-image-preview (default, supports 1K/2K/4K resolution)
  - gemini-2.5-flash-image (faster, aspect ratio only)
- aspect_ratio: Image aspect ratio (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
- resolution: Image resolution (1K, 2K, 4K)
  - Note: Only effective for the gemini-3-pro-image-preview model
  - Default: 2K
Optional Inputs:
- image1 ~ image10: Up to 10 optional image inputs for image-to-image generation
  - Connect the output from a Load Image node
  - Can be left empty for text-to-image generation
Outputs:
- image: Generated image (ComfyUI IMAGE tensor format)
  - Can be connected to Preview Image or Save Image nodes
  - Automatically saved to the ComfyUI output directory
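The parameters above map naturally onto ComfyUI's INPUT_TYPES convention. The following is a hypothetical sketch of such a declaration, not the node's actual source: the class name, the FUNCTION value, and the CATEGORY string are assumptions, and only the parameter names, choices, and the 2K default come from this README.

```python
# Hypothetical sketch: the real node class may be organised differently.
ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]
MODELS = ["gemini-3-pro-image-preview", "gemini-2.5-flash-image"]
RESOLUTIONS = ["1K", "2K", "4K"]


class GeminiImageNodeSketch:
    @classmethod
    def INPUT_TYPES(cls):
        # image1 ~ image10 are optional IMAGE sockets.
        optional = {f"image{i}": ("IMAGE",) for i in range(1, 11)}
        return {
            "required": {
                "gemini_api_key": ("STRING", {"default": "", "multiline": False}),
                "prompt": ("STRING", {"default": "", "multiline": True}),
                "model": (MODELS, {"default": MODELS[0]}),
                "aspect_ratio": (ASPECT_RATIOS,),
                "resolution": (RESOLUTIONS, {"default": "2K"}),
            },
            "optional": optional,
        }

    RETURN_TYPES = ("IMAGE",)
    RETURN_NAMES = ("image",)
    FUNCTION = "generate"   # hypothetical method name
    CATEGORY = "JM-Gemini"  # hypothetical menu category
```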
Usage Examples
1. Text-to-Image Generation
- Add "JM Gemini Image Generator" node to your workflow
- Enter your Gemini API key
- Write your prompt
- Select model and aspect ratio
- Leave all image inputs empty
- Connect output to Preview Image node
- Run the workflow
2. Image Editing (Single Image)
- Add Load Image node and load your image
- Add "JM Gemini Image Generator" node
- Connect Load Image output to image1 input
- Enter editing instructions in prompt (e.g., "Add a sunset background")
- Configure model and parameters
- Run the workflow
3. Image-to-Image (Multiple Images)
- Add multiple Load Image nodes
- Connect them to image1, image2, etc.
- Enter a prompt describing how to combine/transform the images
- Run the workflow
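The three workflows above differ only in how many image inputs are connected. A minimal sketch of that decision, assuming unconnected inputs arrive as None: the select_mode helper and the "edit" and "img2img" labels are hypothetical, while "text2img" matches the example filename in the Output Directory section.

```python
def select_mode(images):
    """Pick a mode label from the image inputs (None means unconnected)."""
    connected = [img for img in images if img is not None]
    if len(connected) > 10:
        raise ValueError("at most 10 input images are supported")
    if not connected:
        return "text2img"  # no images: text-to-image
    if len(connected) == 1:
        return "edit"      # hypothetical label for single-image editing
    return "img2img"       # hypothetical label for multi-image input
```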
Model Differences
gemini-3-pro-image-preview
- Supports resolution parameter (1K, 2K, 4K)
- Default resolution: 2K
- Supports single image editing
- Higher quality output
gemini-2.5-flash-image
- Faster generation
- Uses aspect ratio only (no resolution parameter)
- Good for quick iterations
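One way to picture the difference is a helper that only forwards the resolution when the selected model can use it. This is a sketch under assumptions: build_image_options and the dictionary keys are hypothetical and are not the field names of the Gemini API or the google-genai library.

```python
# Models for which the resolution setting has an effect, per this README.
MODELS_WITH_RESOLUTION = {"gemini-3-pro-image-preview"}


def build_image_options(model, aspect_ratio, resolution="2K"):
    """Return the image options to send, dropping resolution where it has no effect."""
    options = {"aspect_ratio": aspect_ratio}
    if model in MODELS_WITH_RESOLUTION:
        options["resolution"] = resolution
    return options
```

With this shape, choosing 4K while gemini-2.5-flash-image is selected is ignored instead of producing an error, which matches the "only effective for" wording in the parameter notes.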
Troubleshooting
Common Issues
- "Gemini API key is required"
  - Make sure you've entered a valid API key
- "No images were generated"
  - Check your prompt is clear and descriptive
  - Try a different aspect ratio or resolution
  - Verify your API key is valid and has sufficient quota
- "Failed to generate image"
  - Check your internet connection
  - Verify API key permissions
  - Check Gemini API service status
- Image quality issues
  - For gemini-3-pro-image-preview, try using different resolutions (1K, 2K, or 4K)
  - Make your prompt more detailed and specific
Output Directory
Generated images are automatically saved to:
- ComfyUI/output/ directory
- Filename format: {model}_{mode}_{timestamp}.png
- Example: gemini3pro_text2img_1234567890.png
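The filename pattern can be reproduced with a short helper. This is a sketch, assuming the timestamp is a Unix time in whole seconds (as the example suggests) and that the model part is a short tag such as gemini3pro. The build_output_filename function is hypothetical.

```python
import time


def build_output_filename(model_tag, mode, timestamp=None):
    """Format {model}_{mode}_{timestamp}.png using a Unix timestamp in seconds."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{model_tag}_{mode}_{timestamp}.png"


print(build_output_filename("gemini3pro", "text2img", 1234567890))
# prints: gemini3pro_text2img_1234567890.png
```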
License
MIT License
Credits
Developed by JM
Based on Google's Gemini API
Support
For issues and feature requests, please visit the GitHub repository
Changelog
Version 1.0.0
- Initial release
- Support for gemini-3-pro-image-preview and gemini-2.5-flash-image
- Text-to-image generation
- Image-to-image generation (up to 10 images)
- Configurable aspect ratios and resolutions