ComfyUI Extension: CUI-Lumina2-TeaCache
ComfyUI extension that adds TeaCache support for Lumina2
cui-teacache-lu2
Based on https://github.com/spawner1145/TeaCache/tree/main/TeaCache4Lumina2
First ported by @fexli
Re-ported by @spawner
Installation
Manual installation
# switch to your project's root directory
cd custom_nodes
git clone https://github.com/spawner1145/CUI-Lumina2-TeaCache.git
Installation via ComfyUI-Manager
- Open the ComfyUI WebUI.
- Navigate to Manager -> Install Custom Node.
- Enter CUI-Lumina2-TeaCache in the Search field and click Search.
- Click Install.
Usage
- Connect the TeaCache node between the UNet Loader and KSampler in your workflow.
- Set the rel_l1_thresh parameter to a value greater than 0.
- For low step counts, set the coefficient list to [393.76566581, -603.50993606, 209.10239044, -23.00726601, 0.86377344] with a small rel_l1_thresh such as 0.3 for higher speed, or set it to [225.7042019806413, -608.8453716535591, 304.1869942338369, 124.21267720116742, -1.4089066892956552] with a very large rel_l1_thresh such as 5 for higher speed and better quality.
- For higher step counts, set the coefficient list to [225.7042019806413, -608.8453716535591, 304.1869942338369, 124.21267720116742, -1.4089066892956552] with a rel_l1_thresh below 1.1 for better quality and higher speed.
- The nodes are configured with different parameters. With the coefficient list [225.7042019806413, -608.8453716535591, 304.1869942338369, 124.21267720116742, -1.4089066892956552], a rel_l1_thresh of approximately 6 is recommended for 25 steps or fewer. For higher step counts, decrease rel_l1_thresh proportionally; for instance, 0.6 is suggested for 50 steps.
Note:
- Higher rel_l1_thresh values improve generation efficiency (shorter generation times) at the cost of reduced image quality.
- The optimal value should be determined through empirical testing based on your specific quality/efficiency requirements.
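To make the role of rel_l1_thresh and the five-number list more concrete, here is a minimal sketch of the caching decision as TeaCache is generally described: the relative L1 change between consecutive inputs is rescaled by a polynomial, accumulated, and compared with the threshold. The names TeaCacheGate, polyval, and relative_l1 are hypothetical and are not this node's actual API; the sketch also assumes the list is ordered highest-order coefficient first and that the previous input is not all zeros.

```python
# Coefficient list quoted in the usage notes (assumed highest-order term first).
COEFFICIENTS = [393.76566581, -603.50993606, 209.10239044, -23.00726601, 0.86377344]


def polyval(coefficients, x):
    """Evaluate a polynomial with Horner's rule, highest-order coefficient first."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


def relative_l1(current, previous):
    """Mean absolute change, relative to the mean magnitude of the previous input."""
    diff = sum(abs(c - p) for c, p in zip(current, previous)) / len(previous)
    scale = sum(abs(p) for p in previous) / len(previous)
    return diff / scale


class TeaCacheGate:
    """Hypothetical helper: decides per step whether to recompute or reuse the cache."""

    def __init__(self, rel_l1_thresh, coefficients):
        self.rel_l1_thresh = rel_l1_thresh
        self.coefficients = coefficients
        self.accumulated = 0.0
        self.previous = None

    def should_compute(self, model_input):
        if self.previous is None:
            # First step: nothing is cached yet, so the model must run.
            compute = True
        else:
            change = relative_l1(model_input, self.previous)
            self.accumulated += polyval(self.coefficients, change)
            # Below the threshold the cached result is reused (step skipped).
            compute = self.accumulated >= self.rel_l1_thresh
        if compute:
            self.accumulated = 0.0
        self.previous = list(model_input)
        return compute


gate = TeaCacheGate(rel_l1_thresh=2.0, coefficients=COEFFICIENTS)
decisions = [gate.should_compute([1.0, 1.0]) for _ in range(7)]
print(decisions)
```

In this toy run the input never changes, so each later step adds only the constant term of the polynomial to the accumulator. A larger threshold therefore lets more consecutive steps reuse the cache, which matches the speed versus quality trade-off described in the note.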
Reference
TeaCache can speed up Lumina-Image-2.0 without much visual quality degradation, in a training-free manner. The following images show the results generated by TeaCache-Lumina-Image-2.0 with various rel_l1_thresh values: 0 (original), 0.2 (1.25x speedup), 0.3 (1.5625x speedup), 0.4 (2.0833x speedup), 0.5 (2.5x speedup).
<p align="center">
<img src="https://github.com/user-attachments/assets/d2c87b99-e4ac-4407-809a-caf9750f41ef" width="150" style="margin: 5px;">
<img src="https://github.com/user-attachments/assets/411ff763-9c31-438d-8a9b-3ec5c88f6c27" width="150" style="margin: 5px;">
<img src="https://github.com/user-attachments/assets/e57dfb60-a07f-4e17-837e-e46a69d8b9c0" width="150" style="margin: 5px;">
<img src="https://github.com/user-attachments/assets/6e3184fe-e31a-452c-a447-48d4b74fcc10" width="150" style="margin: 5px;">
<img src="https://github.com/user-attachments/assets/d6a52c4c-bd22-45c0-9f40-00a2daa85fc8" width="150" style="margin: 5px;">
</p>

📈 Inference Latency Comparisons on a single 4090 (step 50)
| Lumina-Image-2.0 | TeaCache (0.2) | TeaCache (0.3) | TeaCache (0.4) | TeaCache (0.5) |
|:----------------:|:--------------:|:--------------:|:--------------:|:--------------:|
| ~25 s | ~20 s | ~16 s | ~12 s | ~10 s |
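The speedup factors quoted earlier follow directly from these latencies, as baseline time divided by TeaCache time. The short sketch below reproduces that arithmetic; the latencies are the approximate values from the table, and the names LATENCY_S and speedups are only illustrative.

```python
# Approximate latencies in seconds from the table (single 4090, 50 steps).
# Key 0.0 stands for the unmodified Lumina-Image-2.0 baseline.
LATENCY_S = {0.0: 25, 0.2: 20, 0.3: 16, 0.4: 12, 0.5: 10}


def speedups(latency_by_thresh):
    """Return baseline latency divided by each TeaCache latency."""
    baseline = latency_by_thresh[0.0]
    return {
        thresh: baseline / seconds
        for thresh, seconds in latency_by_thresh.items()
        if thresh != 0.0
    }


for thresh, factor in speedups(LATENCY_S).items():
    print(f"rel_l1_thresh={thresh}: {factor:.4f}x")
```

Because the table values are rounded, the resulting factors are estimates rather than exact measurements.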
Special thanks
fexli
The original TeaCache port for Lumina2 in ComfyUI
welltop-cn
Model patch code design