Accelerate FLUX inference in ComfyUI.
This repository integrates all the tricks I know of to speed up Flux inference:

- `TeaCache`, `FBCache`, `MBCache`, or `ToCa`;
- `SageAttention` or `SpargeAttn`;
- a fix for the `AttributeError: 'SymInt' object has no attribute 'size'` error, to speed up recompilation after the resolution changes.

`MBCache` extends `FBCache` and is applied to cache multiple blocks. The code is adapted from SageAttention, ComfyUI-TeaCache, comfyui-flux-accelerator, and Comfy-WaveSpeed; see those repositories for more details.
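To illustrate the caching idea, here is a minimal, hypothetical sketch of the first-block-cache mechanism behind `FBCache` (and, extended to several blocks, `MBCache`): if the first transformer block's output barely changes between diffusion steps, the remaining blocks are skipped and their cached result is reused. The class and function names are illustrative, not the actual node code.

```python
# Toy sketch (NOT the real node implementation) of first-block caching:
# compare the first block's output with the previous step's output and,
# if the relative change is below a threshold, reuse the cached result
# of the remaining blocks instead of recomputing them.

def rel_l1_distance(a, b):
    """Mean absolute difference of `a` vs `b`, relative to |b|."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(y) for y in b) or 1.0
    return num / den

class FirstBlockCache:
    def __init__(self, threshold=0.1):
        self.threshold = threshold
        self.prev_first = None   # first block's output at the previous step
        self.cached_rest = None  # cached output of the remaining blocks

    def __call__(self, x, first_block, remaining_blocks):
        first = first_block(x)
        if (self.prev_first is not None and self.cached_rest is not None
                and rel_l1_distance(first, self.prev_first) < self.threshold):
            self.prev_first = first
            return self.cached_rest  # cache hit: skip the remaining blocks
        out = first
        for block in remaining_blocks:
            out = block(out)
        self.prev_first = first
        self.cached_rest = out
        return out
```

The cache trades a small amount of accuracy for speed: a larger threshold skips more blocks but drifts further from the uncached result.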
You can use `XXCache`, `SageAttention`, and `torch.compile` with the following examples:
More specifically, place:

- the `flux1-dev.safetensors` or `flux1-schnell.safetensors` file into `models/diffusion_models`, and the `ae.safetensors` file into `models/vae`;
- the `.safetensors` text encoder files into `models/clip` (or `models/text_encoders`);
- the `.pth` file into `models/diffusion_models`;
- the `.safetensors` file into `models/vae`.
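The placement above can be scripted; the snippet below is a hypothetical example (the source paths are placeholders for wherever you downloaded the weights).

```shell
# Hypothetical example: stage the downloaded files into the ComfyUI folders
# described above. Run from the ComfyUI root; adjust the source paths.
mkdir -p models/diffusion_models models/vae models/clip
# mv /path/to/flux1-dev.safetensors models/diffusion_models/
# mv /path/to/ae.safetensors models/vae/
# mv /path/to/text_encoder.safetensors models/clip/
```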
SpargeAttn is an attention acceleration method based on SageAttention that requires hyperparameter tuning before use. The tuning process is as follows:

1. First, install SpargeAttn by following the steps below. If you have problems installing it, see the original repository:

```shell
git clone https://github.com/thu-ml/SpargeAttn.git
cd ./SpargeAttn
pip install -e .
```

2. If you do not have a hyperparameter file yet, perform a few rounds of quality fine-tuning to obtain one. Simply enable `enable_tuning_mode` on the `Apply SpargeAttn` node and run generations, e.g. 50-step 512*512 images for 10 different prompts (very time-consuming).
   - The `skip_DoubleStreamBlocks` and `skip_SingleStreamBlocks` arguments skip blocks that should not use `SpargeAttn`, mainly so it can work together with `TeaCache` and `FBCache`.
   - Enable `parallel_tuning` to use multiple GPUs to accelerate tuning; in this case, start ComfyUI with the `--disable-cuda-malloc` argument.
   - [New] Following the author's code updates, the `l1` and `pv_l1` parameters can now be tuned freely.
3. Turn off `enable_tuning_mode` and use the `Save Finetuned SpargeAttn Hyperparams` node to save your hyperparameter file.
4. Remove or disable the `Save Finetuned SpargeAttn Hyperparams` node, place the saved hyperparameter file in the `models/checkpoints` folder, and load it in the `Apply SpargeAttn` node.
5. Enjoy yourself.
To make tuning hyperparameters easier, I've provided an example workflow here. By default it generates a 50-step 512*512 image for each of the 10 preset prompts (which you can modify as you see fit). Click the `Queue` button to start tuning. Make sure your environment is set up correctly before you start; again, this process is very time-consuming.
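What the tuning workflow does can be sketched in a few lines. This is a hypothetical outline, not real ComfyUI API code: `generate` stands in for the sampling pipeline, and the prompt list is a placeholder for the workflow's 10 presets.

```python
# Hypothetical sketch of the tuning workflow: one 50-step 512x512 generation
# per preset prompt, run with the Apply SpargeAttn node's tuning mode enabled.
# `generate` is a stand-in for ComfyUI's sampling pipeline, not a real API.

PROMPTS = [
    "a photo of a cat",          # placeholders; the workflow ships 10 prompts
    "a watercolor landscape",
]

def run_tuning(prompts, generate, steps=50, width=512, height=512):
    """Queue one generation per prompt so SpargeAttn observes each run."""
    return [generate(p, steps=steps, width=width, height=height)
            for p in prompts]
```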
If you have a well-tuned hyperparameter file, feel free to share it.