
    ComfyUI_Step1X-Edit

    English | 中文文档

    This custom node integrates the Step1X-Edit image editing model into ComfyUI. Step1X-Edit is a state-of-the-art image editing model that processes a reference image and a user's editing instruction to generate a new image.

    Features

    • [x] Support for FP8 inference
    • [x] Support for flash-attn acceleration
    • [x] Support for TeaCache acceleration: up to ~2x faster inference with minimal quality loss

    Examples

    Here are some examples of what you can achieve with ComfyUI_Step1X-Edit:

    Example 1: "Add pendant with a ruby around this girl's neck."
    Native: examples/0000.jpg | ~1.5x speedup (threshold=0.25): examples/0000_fast_0.25.jpg

    Example 2: "Let her cry."
    Native: examples/0001.jpg | ~1.5x speedup (threshold=0.25): examples/0001_fast_0.25.jpg

    You can find the example workflow in the examples directory. Load it directly into ComfyUI to see how it works.

    Installation

    1. Clone this repository into your ComfyUI's custom_nodes directory:

      cd ComfyUI/custom_nodes
      git clone https://github.com/raykindle/ComfyUI_Step1X-Edit.git
      
    2. Install the required dependencies:

      Step 1: Install ComfyUI_Step1X-Edit dependencies

      cd ComfyUI_Step1X-Edit
      pip install -r requirements.txt
      

      Step 2: Install flash-attn. A helper script is provided to find the pre-built wheel that matches your system:

      python utils/get_flash_attn.py
      

      The script prints a wheel name such as flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl, which can be found on the flash-attn releases page.

      Download the corresponding pre-built wheel and install it following the flash-attn installation instructions.
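
      For reference, the selection logic boils down to reading the local CUDA, PyTorch, C++ ABI, and Python versions. Below is a minimal sketch of that detection; the wheel-name template is an assumption modeled on flash-attn's release file names, and utils/get_flash_attn.py remains the authoritative helper.

      import platform
      import sys

      import torch

      # Sketch: guess the flash-attn wheel name from the local environment.
      # The name template is an assumption based on flash-attn release files;
      # use utils/get_flash_attn.py for the real lookup.
      def guess_wheel_name(version: str = "2.7.2.post1") -> str:
          cuda_major = torch.version.cuda.split(".")[0]               # e.g. "12"
          torch_mm = ".".join(torch.__version__.split(".")[:2])       # e.g. "2.5"
          abi = str(torch._C._GLIBCXX_USE_CXX11_ABI).upper()          # "TRUE"/"FALSE"
          py = f"cp{sys.version_info.major}{sys.version_info.minor}"  # e.g. "cp310"
          return (f"flash_attn-{version}+cu{cuda_major}torch{torch_mm}"
                  f"cxx11abi{abi}-{py}-{py}-linux_{platform.machine()}.whl")

      print(guess_wheel_name())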

    3. Download the Step1X-Edit-FP8 model files and arrange them as follows:

      ComfyUI/
      └── models/
          ├── diffusion_models/
          │   └── step1x-edit-i1258-FP8.safetensors
          ├── vae/
          │   └── vae.safetensors
          └── text_encoders/
              └── Qwen2.5-VL-7B-Instruct/
      
      • Step1X-Edit diffusion model: Download step1x-edit-i1258-FP8.safetensors from HuggingFace and place it in ComfyUI's models/diffusion_models directory
      • Step1X-Edit VAE: Download vae.safetensors from HuggingFace and place it in ComfyUI's models/vae directory
      • Qwen2.5-VL model: Download Qwen2.5-VL-7B-Instruct and place it in ComfyUI's models/text_encoders/Qwen2.5-VL-7B-Instruct directory
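
      The downloads can also be scripted with huggingface_hub, as sketched below. The repository IDs are assumptions; substitute the repos actually linked from the Step1X-Edit project page.

      from huggingface_hub import hf_hub_download, snapshot_download

      # Sketch: fetch the model files with huggingface_hub.
      # The repo IDs are assumptions; replace them with the repos linked
      # from the upstream Step1X-Edit release.
      models = "ComfyUI/models"

      hf_hub_download("stepfun-ai/Step1X-Edit",                 # hypothetical repo id
                      "step1x-edit-i1258-FP8.safetensors",
                      local_dir=f"{models}/diffusion_models")
      hf_hub_download("stepfun-ai/Step1X-Edit",                 # hypothetical repo id
                      "vae.safetensors",
                      local_dir=f"{models}/vae")
      snapshot_download("Qwen/Qwen2.5-VL-7B-Instruct",
                        local_dir=f"{models}/text_encoders/Qwen2.5-VL-7B-Instruct")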

    Usage

    1. Start ComfyUI and create a new workflow.
    2. Add the "Step1X-Edit Model Loader" node (or the faster "Step1X-Edit TeaCache Model Loader" for 2x speedup) to your workflow.
    3. Configure the model parameters:
      • Select step1x-edit-i1258-FP8.safetensors as the diffusion model
      • Select vae.safetensors as the VAE
      • Set Qwen2.5-VL-7B-Instruct as the text encoder
      • Set additional parameters (dtype, quantized, offload) as needed
      • If using TeaCache, set an appropriate threshold value
    4. Connect a "Step1X-Edit Generate" node (or "Step1X-Edit TeaCache Generate" if using TeaCache) to the model node.
    5. Provide an input image and an editing prompt.
    6. Run the workflow to generate edited images.
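
    The same wiring can also be driven headlessly through ComfyUI's /prompt HTTP API. The sketch below uses hypothetical node class names inferred from the node titles above; check the names the extension actually registers before relying on it.

      import json
      import urllib.request

      # Sketch: submit a Step1X-Edit workflow to a running ComfyUI server.
      # "Step1XEditModelLoader" and "Step1XEditGenerate" are assumed class
      # names; look up the real ones in the extension's node registry.
      workflow = {
          "1": {"class_type": "Step1XEditModelLoader",
                "inputs": {"diffusion_model": "step1x-edit-i1258-FP8.safetensors",
                           "vae": "vae.safetensors",
                           "text_encoder": "Qwen2.5-VL-7B-Instruct",
                           "dtype": "bfloat16", "quantized": True, "offload": False}},
          "2": {"class_type": "LoadImage", "inputs": {"image": "input.png"}},
          "3": {"class_type": "Step1XEditGenerate",
                "inputs": {"model": ["1", 0], "image": ["2", 0],
                           "prompt": "Let her cry.", "negative_prompt": "",
                           "steps": 28, "cfg_scale": 6.0,
                           "image_size": 512, "seed": 42}},
      }

      req = urllib.request.Request("http://127.0.0.1:8188/prompt",
                                   data=json.dumps({"prompt": workflow}).encode(),
                                   headers={"Content-Type": "application/json"})
      print(urllib.request.urlopen(req).read())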

    Parameters

    Step1X-Edit Model Loader

    • diffusion_model: The Step1X-Edit diffusion model file (select from the diffusion_models dropdown)
    • vae: The Step1X-Edit VAE file (select from the vae dropdown)
    • text_encoder: The name of the Qwen2.5-VL model directory under models/text_encoders (e.g., "Qwen2.5-VL-7B-Instruct")
    • dtype: Model precision (bfloat16, float16, or float32)
    • quantized: Whether to use FP8 quantized weights (true recommended)
    • offload: Whether to offload models to CPU when not in use

    Step1X-Edit TeaCache Model Loader (Additional Parameters)

    • teacache_threshold: Controls the trade-off between speed and quality
      • 0.25: ~1.5x speedup
      • 0.4: ~1.8x speedup
      • 0.6: ~2x speedup (recommended)
      • 0.8: ~2.25x speedup with minimal quality loss
    • verbose: Whether to print TeaCache debug information

    Step1X-Edit Generate / Step1X-Edit TeaCache Generate

    • model: The Step1X-Edit model bundle
    • image: The input image to edit
    • prompt: Text instructions describing the desired edit
    • negative_prompt: Text describing what to avoid
    • steps: Number of denoising steps (more steps = better quality but slower)
    • cfg_scale: Guidance scale (how closely to follow the prompt)
    • image_size: Size of the output image (512 recommended)
    • seed: Random seed for reproducibility

    TeaCache Acceleration

    This implementation includes TeaCache acceleration technology, which provides:

    • Up to ~2x faster inference with minimal quality loss
    • Training-free acceleration with no additional model fine-tuning
    • Adaptive caching based on timestep embeddings
    • Adjustable speed-quality trade-off via threshold parameter

    TeaCache works by intelligently skipping redundant calculations during the denoising process. It analyzes the relative changes between steps and reuses previously computed results when possible, significantly reducing computational requirements without compromising output quality.
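
    In code terms, the decision rule reduces to something like the following sketch (illustrative only, not the extension's actual implementation):

      import torch

      # Sketch of the TeaCache decision rule: accumulate the relative change
      # of the timestep-modulated input across steps and reuse the cached
      # transformer residual while that change stays below the threshold.
      class TeaCacheState:
          def __init__(self, threshold: float):
              self.threshold = threshold
              self.accumulated = 0.0
              self.prev_modulated = None
              self.cached_residual = None

          def should_skip(self, modulated: torch.Tensor) -> bool:
              if self.prev_modulated is None or self.cached_residual is None:
                  return False                       # first step: must compute
              rel = ((modulated - self.prev_modulated).abs().mean()
                     / self.prev_modulated.abs().mean()).item()
              self.accumulated += rel
              if self.accumulated < self.threshold:
                  return True                        # change is small: reuse cache
              self.accumulated = 0.0                 # change too large: recompute
              return False

      def denoise_step(state, x, modulated, transformer):
          if state.should_skip(modulated):
              out = x + state.cached_residual        # cheap cached path
          else:
              out = transformer(x)                   # full forward pass
              state.cached_residual = out - x
          state.prev_modulated = modulated
          return out

    A higher threshold lets more steps reuse the cache, which is why the speedup grows with the threshold values listed above.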

    Based on the TeaCache research, which was originally developed to accelerate video diffusion models and is adapted here for image generation.

    Memory Requirements

    The Step1X-Edit model requires significant GPU memory. The table below lists peak GPU memory and inference time, measured at 768 px and 10 steps:

    | Model Version | Peak GPU Memory | Native | ~1.5x Speedup (threshold=0.25) | ~2.0x Speedup (threshold=0.6) |
    |:---:|:---:|:---:|:---:|:---:|
    | Step1X-Edit-FP8 (offload=False) | 31.5 GB | 17.4 s | 11.2 s | 7.8 s |

    • These numbers were measured on a single NVIDIA H20 GPU.

    For lower memory usage, enable the quantized and/or offload options in the Model Loader node.
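
    Offloading trades speed for VRAM by parking idle submodules in system RAM; the general pattern is roughly the following sketch:

      import torch

      # Sketch of the CPU-offload pattern (illustrative): move a submodule
      # to the GPU only for the duration of its forward pass.
      def run_offloaded(module: torch.nn.Module, *args):
          module.to("cuda")                # bring weights into VRAM
          with torch.no_grad():
              out = module(*args)
          module.to("cpu")                 # release VRAM for the next stage
          torch.cuda.empty_cache()
          return out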

    Troubleshooting

    • If you encounter CUDA out of memory errors, try:
      • Enabling the offload option
      • Using the FP8 quantized model
      • Reducing the image size
      • Closing other GPU-intensive applications
    • If you get errors about missing files (a quick path check is sketched after this list):
      • Make sure your model paths are correct
      • The diffusion model should be in models/diffusion_models
      • The VAE should be in models/vae
      • The text encoder should be in models/text_encoders/Qwen2.5-VL-7B-Instruct
    • If you get import errors, ensure all dependencies are installed correctly
    • If you experience unexpected behavior with TeaCache:
      • Try different threshold values
      • Enable verbose mode to see which steps are being cached
      • Verify that the TeaCache model loader is properly connected to the TeaCache generate node
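
    For missing-file errors, the snippet below verifies the layout from the Installation section; the paths are the defaults listed above.

      from pathlib import Path

      # Quick check that the expected model files and directories exist.
      root = Path("ComfyUI/models")
      for rel in ["diffusion_models/step1x-edit-i1258-FP8.safetensors",
                  "vae/vae.safetensors",
                  "text_encoders/Qwen2.5-VL-7B-Instruct"]:
          status = "OK     " if (root / rel).exists() else "MISSING"
          print(status, root / rel)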

    Acknowledgements