ComfyUI Extension: sd-perturbed-attention
Perturbed-Attention Guidance (PAG), Smoothed Energy Guidance (SEG), Sliding Window Guidance (SWG), PLADIS, Normalized Attention Guidance (NAG), Token Perturbation Guidance (TPG) for ComfyUI and SD reForge.
Various Guidance implementations for ComfyUI / SD WebUI (reForge)
Implementation of
- Perturbed-Attention Guidance from Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance (D. Ahn et al.)
- Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention (Susung Hong)
- Sliding Window Guidance from The Unreasonable Effectiveness of Guidance for Diffusion Models (Kaiser et al.)
- PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity (ComfyUI-only)
- Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models (ComfyUI-only, has a description inside ComfyUI)
- Token Perturbation Guidance for Diffusion Models (ComfyUI-only)
as an extension for ComfyUI and SD WebUI (reForge).
Works with SD1.5 and SDXL.
Installation
ComfyUI
You can either:
- `git clone https://github.com/pamparamm/sd-perturbed-attention.git` into the `ComfyUI/custom_nodes/` folder.
- Install it via ComfyUI Manager (search for the custom node named "Perturbed-Attention Guidance").
- Install it via comfy-cli with `comfy node registry-install sd-perturbed-attention`.
SD WebUI (reForge)
`git clone https://github.com/pamparamm/sd-perturbed-attention.git` into the `stable-diffusion-webui-forge/extensions/` folder.
SD WebUI (Auto1111)
As an alternative for A1111 WebUI, you can use the PAG implementation from the sd-webui-incantations extension.
Guidance Nodes/Scripts
ComfyUI
SD WebUI (reForge)
[!NOTE] You can override `CFG Scale` and `PAG Scale` / `SEG Scale` for Hires. fix by opening/enabling the `Override for Hires. fix` tab. To disable PAG during Hires. fix, you can set `PAG Scale` under Override to 0.
Inputs
- `scale`: Guidance scale. Higher values can increase the structural coherence of an image, but can also oversaturate/fry it entirely.
- `adaptive_scale` (PAG only): PAG dampening factor. It penalizes PAG during late denoising stages, resulting in an overall speedup: 0.0 means no penalty and 1.0 completely removes PAG.
- `blur_sigma` (SEG only): Standard deviation of the Gaussian blur. Higher values increase the "clarity" of an image. Negative values set `blur_sigma` to infinity.
- `unet_block`: Part of the U-Net to which Guidance is applied. The original paper suggests using `middle`.
- `unet_block_id`: Id of the U-Net layer in the selected block to which Guidance is applied. Guidance can be applied only to layers containing Self-attention blocks.
- `sigma_start` / `sigma_end`: Guidance will be active only between `sigma_start` and `sigma_end`. Set both values to negative to disable this feature.
- `rescale`: Acts similarly to the RescaleCFG node: it prevents over-exposure at high `scale` values. Based on Algorithm 2 from Common Diffusion Noise Schedules and Sample Steps are Flawed (Lin et al.). Set to 0 to disable this feature.
- `rescale_mode`:
  - `full`: takes into account both CFG and Guidance.
  - `partial`: depends only on Guidance.
  - `snf`: Saliency-adaptive Noise Fusion from High-fidelity Person-centric Subject-to-Image Synthesis (Wang et al.). Should increase image quality at high guidance scales. Ignores the `rescale` value.
- `unet_block_list`: Optional input, replaces both `unet_block` and `unet_block_id` and allows you to select multiple U-Net layers separated with commas. SDXL U-Net has multiple indices for layers; you can specify them by using a dot symbol (if not specified, Guidance will be applied to the whole layer). Example value: `m0,u0.4` (it applies Guidance to middle block 0 and to output block 0 with index 4).
  - In terms of U-Net, `d` means `input`, `m` means `middle` and `u` means `output`.
  - SD1.5 U-Net has layers `d0`-`d5`, `m0`, `u0`-`u8`.
  - SDXL U-Net has layers `d0`-`d3`, `m0`, `u0`-`u5`. In addition, each block except `d0` and `d1` has `0-9` index values (like `m0.7` or `u0.4`). `d0` and `d1` have `0-1` index values.
  - Supports block ranges (`d0-d3` corresponds to `d0,d1,d2,d3`) and index value ranges (`d2.2-9` corresponds to all index values of `d2` with the exclusion of `d2.0` and `d2.1`).
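To make the `unet_block_list` syntax concrete, here is a minimal Python sketch of how such a string could be expanded into individual targets. It is an illustration only and not the extension's actual parser: the function name `parse_unet_block_list`, the regular expression, and the `(block, number, index)` tuple layout are all assumptions made for this example. It also assumes that block ranges stay within one block type and are not combined with index values.

```python
import re

# Hypothetical helper for illustration; not the extension's actual parser.
BLOCK_NAMES = {"d": "input", "m": "middle", "u": "output"}

# One entry: block letter + number, optionally followed by either a block
# range ("-d3") or an index / index range (".4" or ".2-9").
ENTRY_RE = re.compile(
    r"^([dmu])(\d+)"
    r"(?:-([dmu])(\d+)|\.(\d+)(?:-(\d+))?)?$"
)


def parse_unet_block_list(spec):
    """Expand a string such as "m0,u0.4" into (block, number, index) tuples.

    index is None when the entry targets the whole layer.
    """
    result = []
    for raw in spec.split(","):
        entry = raw.strip()
        if not entry:
            continue
        match = ENTRY_RE.match(entry)
        if match is None:
            raise ValueError(f"cannot parse entry: {entry!r}")
        letter, first, end_letter, last, idx_start, idx_end = match.groups()
        name = BLOCK_NAMES[letter]
        if end_letter is not None:
            # Block range: "d0-d3" -> d0, d1, d2, d3 (same block type only).
            if end_letter != letter:
                raise ValueError(f"mixed block types in range: {entry!r}")
            for number in range(int(first), int(last) + 1):
                result.append((name, number, None))
        elif idx_start is not None:
            # Single index ("u0.4") or inclusive index range ("d2.2-9").
            stop = int(idx_end) if idx_end is not None else int(idx_start)
            for index in range(int(idx_start), stop + 1):
                result.append((name, int(first), index))
        else:
            # Bare block such as "m0": the whole layer.
            result.append((name, int(first), None))
    return result


print(parse_unet_block_list("m0,u0.4"))
# [('middle', 0, None), ('output', 0, 4)]
```

Under these assumptions, `d0-d3` expands to four whole-layer input entries, and `d2.2-9` expands to indices 2 through 9 of `d2`, matching the range rules described above.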
ComfyUI TensorRT PAG (Experimental)
To use PAG together with ComfyUI_TensorRT, you'll need to:
- Have 24GB of VRAM.
- Build static/dynamic TRT engine of a desired model.
- Build static/dynamic TRT engine of the same model with the same TRT parameters, but with fixed PAG injection in selected UNET blocks (`TensorRT Attach PAG` node).
- Use the `TensorRT Perturbed-Attention Guidance` node with two model inputs: one for the base engine and one for the PAG engine.