ComfyUI Extension: pre_cfg_comfy_nodes_for_ComfyUI

Authored by Extraltodeus

Created about a year ago

Updated 3 months ago

51 stars

A set of nodes to prepare the noise predictions before the CFG function

README

Pre CFG nodes


All can be chained and repeated within the same workflow!

They are meant to be highly compatible with most nodes.

The order matters and depends on your needs.

The best chaining order is therefore to be determined by your own preferences.

All are to be used like any model patching node, right after the model loader.

Nodes:

Other nodes

There are now too many nodes for me to just add a screenshot and a bunch of details but it would be a shame not to describe them:

Perturbed attention guidance: adaptation of PAG as a pre-CFG node.
Variable CFG: makes your scale vary along the generation
channel multipliers
subtract prediction mean: gives more balanced colors
"flip flop": swaps the positive with the negative. Since the order matters, you may chain it with other nodes and go back to the correct order afterwards. For experimental purposes.
Shape attention (for SDXL) can turn off the input layer 8.
Support empty uncond: Combined with "menu>advanced>conditioning>set timestep range" at ~65% you can now get a speed boost on any workflow.
Set timestep range from sigmas: same as the default node except that you're using sigmas instead of step percentage
The testing branch has a few more and is the current state of these nodes for me.

Pre CFG automatic scale

mode:

Automatic CFG: applies the same predictable scaling as my other nodes based on this logic
Strict scaling: applies a scaling which will always give the exact desired value. This tends to create artifacts and random blurs if carried through to the end.

Support empty uncond:

If you use the already available node named ConditioningSetTimestepRange you can stop generating a negative prediction earlier by letting your negative conditioning go through it while setting it like this:

This doubles your generation speed for the steps where there is no negative.

The only issue if you do this is that the CFG function will weight your positive prediction times your CFG scale against nothing and you will get a black image.

"support_empty_uncond" therefore divides your positive prediction by your CFG scale and avoids this issue.
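As a rough illustration of why the division helps, here is a minimal sketch. It assumes the usual CFG mix `uncond + scale * (cond - uncond)` and an all-zero negative prediction once the negative conditioning is out of its timestep range. The function names are made up for this example and plain Python lists stand in for tensors; this is not the node's actual code.

```python
def cfg(cond, uncond, scale):
    """Usual classifier-free guidance mix, applied element by element."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def support_empty_uncond(cond, scale):
    """Pre-divide the positive prediction so the CFG mix cancels the scale out."""
    return [c / scale for c in cond]


cond = [0.5, -1.0, 2.0]   # toy positive prediction
empty = [0.0, 0.0, 0.0]   # assumption: a skipped negative behaves like zeros
scale = 8.0

# Without the fix, the positive prediction is simply multiplied by the scale.
overscaled = cfg(cond, empty, scale)

# With the fix, the CFG mix returns the positive prediction unchanged.
fixed = cfg(support_empty_uncond(cond, scale), empty, scale)
```

With these toy numbers `overscaled` is eight times too large, while `fixed` is the untouched positive prediction.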

Doing this combination is similar to the "boost" feature of my original automatic CFG node. It can also let you avoid artifacts if you want to use the strict scaling.

If you want to use this option in a chained setup using this node multiple times, I recommend enabling it only once, on the last one.

Pre CFG perp-neg

Applies the already known perp-neg logic.

Code taken and adapted from ComfyAnon's implementation.
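For readers unfamiliar with perp-neg, the idea can be sketched as follows: only the part of the negative direction that is perpendicular to the positive direction gets subtracted. This is a toy version under stated assumptions: flat Python lists instead of tensors, a prediction made with an empty prompt as the reference point, and hypothetical names throughout. It is not the code used by this node.

```python
def dot(a, b):
    """Dot product of two equally sized flat lists."""
    return sum(x * y for x, y in zip(a, b))


def perp_neg(pos_pred, neg_pred, empty_pred, neg_scale, cfg_scale):
    """Toy perp-neg mix on flat lists (assumes pos_pred differs from empty_pred)."""
    # Directions relative to the empty-prompt prediction.
    pos = [p - e for p, e in zip(pos_pred, empty_pred)]
    neg = [n - e for n, e in zip(neg_pred, empty_pred)]
    # Remove from the negative direction its projection onto the positive one.
    k = dot(neg, pos) / dot(pos, pos)
    perp = [n - k * p for n, p in zip(neg, pos)]
    # Guide with the positive direction minus the scaled perpendicular part.
    return [e + cfg_scale * (p - neg_scale * q)
            for e, p, q in zip(empty_pred, pos, perp)]


result = perp_neg([1.0, 0.0], [1.0, 1.0], [0.0, 0.0], neg_scale=1.0, cfg_scale=2.0)
```

In this example the negative shares its first component with the positive, so only its second component ends up being pushed away.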

The context length (added after the screenshot of the node) can be set to a higher value if you are using a TensorRT engine requiring a higher context length.

For more details, see my related node "Conditioning crop or fill", where I explain a bit more about this.

Pre CFG sharpening (experimental)

Subtract from the current step something from the previous step. This tends to make the images sharper and less saturated.

A negative value can be set.

Pre CFG exponentiation (experimental)

A value lower than one will simplify the end result and enhance the saturation / contrasts.

A value higher than one will do the opposite and if pushed too far will most likely make a mess.

Gradient scaling:

Named like this because I initially wanted to test what would happen if I used, instead of a single CFG scale, a tensor shaped like the latent space with a gradual variation. And then why not try to use masks instead? And what if I could make it so each value will match as closely as possible another input image?

The result is an arithmetic scaling method which does not noticeably slow down the sampling while also scaling the intensity of the values like an "automatic cfg".
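To make the starting idea concrete, here is a minimal sketch of a CFG scale that varies across the latent instead of being a single number. It only illustrates the "tensor with a gradual variation" idea, not the image matching logic of the node. Nested Python lists stand in for a latent, and both function names are hypothetical.

```python
def gradient_scale_row(width, min_scale, max_scale):
    """Scales rising linearly from min_scale (left) to max_scale (right)."""
    if width == 1:
        return [float(min_scale)]
    step = (max_scale - min_scale) / (width - 1)
    return [min_scale + i * step for i in range(width)]


def cfg_with_scale_map(cond, uncond, scale_map):
    """CFG mix where every position has its own scale."""
    return [
        [u + s * (c - u) for c, u, s in zip(c_row, u_row, s_row)]
        for c_row, u_row, s_row in zip(cond, uncond, scale_map)
    ]


height, width = 2, 3
scale_map = [gradient_scale_row(width, 4.0, 8.0) for _ in range(height)]
cond = [[1.0] * width for _ in range(height)]     # toy positive prediction
uncond = [[0.0] * width for _ in range(height)]   # toy negative prediction
guided = cfg_with_scale_map(cond, uncond, scale_map)
```

Replacing the generated gradient with values derived from a mask or an image is the natural next step described above.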

So here it is:

So, simply put:

Maximum scale: Which max CFG scale can be used to try to match the input? You can go as high as 500 and still get an output. At 1000 you should stop before the end.
Minimum scale: Same of course, but I find it better to leave this one between 3.5 and 5.
Strength: An overall multiplier for the effect. Generally left at 1 but if you use a plain color image and feel like your results are too smooth you may want to lower it.
end at sigma: You can go down to the end of the sampling if using the toggle described next, but in general I prefer to stop at 0.28. Stopping before the end will give better results with super high scales. 0.28 is the default value.
Converging scales: makes the min and max scales join your sampler scale as the sampling goes. This can weaken the pattern matching effect if you are aiming for something precise, but otherwise it greatly enhances the final result and also allows the use of a bigger maximum scale.
invert mask: for convenience
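The "converging scales" toggle can be pictured with a small sketch. It assumes a linear blend driven by a progress value going from 0 (first step) to 1 (last step); the real schedule used by the node may differ, and the names below are made up for this example.

```python
def converge(scale, sampler_scale, progress):
    """Linear blend: `scale` at progress 0, `sampler_scale` at progress 1."""
    return scale + (sampler_scale - scale) * progress


def converging_scales(min_scale, max_scale, sampler_scale, progress):
    """Return the (min, max) scales for a given sampling progress in [0, 1]."""
    progress = min(max(progress, 0.0), 1.0)  # clamp out-of-range inputs
    return (
        converge(min_scale, sampler_scale, progress),
        converge(max_scale, sampler_scale, progress),
    )


start = converging_scales(4.0, 100.0, 8.0, 0.0)
middle = converging_scales(4.0, 100.0, 8.0, 0.5)
end = converging_scales(4.0, 100.0, 8.0, 1.0)
```

Both bounds end on the sampler scale, which fits the idea that a very large maximum scale stays usable because it fades out as the sampling goes.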

Potential uses:

General light direction/composition influence (all same seed):

[image: combined_image]

Vignetting:

[image: combined_v_images]

Color influence:

[image: combined_rgb_image]

Pattern matching, here with a black and white spiral:

[image: 00347UI_00001_]

A blue one with a lower scale:

[image: 00297UI_00001_]

As you can notice, the details are pretty well done in general. It seems that using an input latent as a guide also helps with the overall quality. This was with a "freshly" encoded latent; I haven't tried to loop back a latent space resulting from sampling directly.

Text is a bit harder to enforce and may require more tweaking with the scales:

[image: 00133UI_00001_]

Since it takes advantage of the "wiggling room" left by the CFG scale to make the generation match an image, it can hardly contradict what is being generated.

Here is an example using a black and red spiral. Since the base description is about black and white, I could only enforce the red by using destructive scales:

[image: combined_side_by_side_image]

Side use:

If only a mask is connected as input, the node applies the selected maximum scale to the target area.
If nothing is connected, the node uses the positive prediction as guide for 74% of the sigma and the negative for the last part.

Note:

Given that this is a non-ML solution, unlike ControlNet, it cannot tell the difference between a banana and a person. It simply tries to make the values match the input image. A giraffe is just an apple with different values at a different place.
It is possible to chain this node multiple times as long as the sum of all the strength sliders is equal to or below one.
I added two image generators: one simply using RGB sliders, and a gradient generator which can also make circular patterns while outputting a mask, to make vignetting easy. You will find them in the "image" category.