ComfyUI Extension: ComfyUI Qwen Rectified Flow Tools
This custom node pack for ComfyUI provides a suite of advanced tools for performing 'invert-and-resample' workflows with Rectified Flow models like Qwen-VL. It allows you to take an existing image, convert it to a noisy latent using a guided inversion process, and then use that latent as a starting point to generate creative, high-quality variations.
ComfyUI Flow Matching Inverter Nodes
Custom nodes for ComfyUI that target rectified-flow models (for example, Qwen). The nodes cover three core tasks:
- integrate the model's velocity field to produce a partially inverted latent,
- add noise using the same schedule as the sampler, and
- perform small latent-domain cleanups or conditioning perturbations.
Each node is available on its own, and the hybrid node chains them together for a single-drop workflow.
This node pack also includes a variety of latent-space and conditioning-space noise nodes for more experimental, sometimes wild, manipulations.
Conditioning primer
ComfyUI passes prompt information around as a Python list of `[embedding, metadata]` pairs. The `embedding` tensor stores the per-token hidden states produced by the text encoder (shape ≈ `batch × tokens × features`, or `tokens × features` when the batch dimension is implicit). The accompanying `metadata` dictionary carries auxiliary fields, most notably `pooled_output`, a single vector that summarises the whole prompt (CLS token, sentence average, etc.).
Whenever you tweak conditioning with the nodes below you are directly manipulating those tensors. Splitting, blurring, or scaling happens along the token axis, while pooled outputs are updated in lockstep so downstream samplers still receive a coherent summary.
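As a rough illustration of that structure, the sketch below applies one function to every embedding and to its `pooled_output` when present. NumPy arrays stand in for the PyTorch tensors ComfyUI actually passes, and `map_conditioning` is a hypothetical helper name, not part of this node pack.

```python
import numpy as np

def map_conditioning(conditioning, fn):
    """Apply fn to each embedding and, when present, its pooled_output."""
    result = []
    for embedding, metadata in conditioning:
        new_meta = dict(metadata)  # shallow copy so the input list is left untouched
        pooled = new_meta.get("pooled_output")
        if pooled is not None:
            new_meta["pooled_output"] = fn(pooled)
        result.append([fn(embedding), new_meta])
    return result

# Toy conditioning: one prompt, 4 tokens, 8 features (sizes are illustrative only).
embedding = np.ones((1, 4, 8), dtype=np.float32)
pooled = np.ones((1, 8), dtype=np.float32)
conditioning = [[embedding, {"pooled_output": pooled}]]

# Halve the prompt influence, in the spirit of a scale-style node.
scaled = map_conditioning(conditioning, lambda t: t * 0.5)
```

Updating the embedding and the pooled vector through the same function is what keeps the two "in lockstep" in this sketch.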
Node overview
Qwen Rectified Flow Inverter
Runs the rectified-flow ODE backwards to re-noise an encoded latent. Supports deterministic amplification, seeded stochastic perturbation, and optional output normalisation, and shares the same interface as the sampler (`steps`, `conditioning`, etc.).
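The core idea can be sketched as a plain Euler integration of a velocity field from the clean latent towards noise. This is a minimal sketch, not the node's implementation: `velocity_fn` is a hypothetical stand-in for the model call, the interpolation convention `x_t = (1 - t) * clean + t * noise` is an assumption, and NumPy replaces PyTorch so the example runs on its own.

```python
import numpy as np

def invert_latent(latent, velocity_fn, strength, steps):
    """Euler-integrate dx/dt = velocity_fn(x, t) from t=0 up to t=strength."""
    x = latent.copy()
    dt = strength / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

rng = np.random.default_rng(0)
clean = rng.standard_normal((1, 4, 8, 8))
noise = rng.standard_normal((1, 4, 8, 8))

def toy_velocity(x, t):
    # Toy field (assumption): an ideal straight-line flow has constant velocity.
    return noise - clean

inverted = invert_latent(clean, toy_velocity, strength=0.6, steps=20)
```

With the constant toy field the result lands exactly on the straight line between `clean` and `noise`; a real model's velocity depends on `x`, `t`, and the conditioning, so the path bends.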
Latent Gaussian Blur
Applies a Gaussian blur directly in latent space. `Spatial Only` blurs each channel independently; `Spatial and Channel` performs a joint 3D blur across channels for stronger smoothing.
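A separable Gaussian blur is one way to picture the two modes. The sketch below assumes a `(batch, channels, height, width)` layout, a kernel radius of three sigmas, and edge padding; none of these details are confirmed by the node, and the function names are illustrative.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = max(1, int(np.ceil(3 * sigma)))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()  # weights sum to 1 so flat regions are unchanged

def blur_axis(x, sigma, axis):
    """1D Gaussian blur along one axis with edge padding."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    pad = [(0, 0)] * x.ndim
    pad[axis] = (radius, radius)
    padded = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=np.float64)
    for i, weight in enumerate(kernel):
        index = [slice(None)] * x.ndim
        index[axis] = slice(i, i + x.shape[axis])
        out += weight * padded[tuple(index)]
    return out

def latent_gaussian_blur(latent, sigma, blur_channels=False):
    out = blur_axis(latent, sigma, axis=-1)   # width
    out = blur_axis(out, sigma, axis=-2)      # height
    if blur_channels:                         # joint 3D blur: also mix channels
        out = blur_axis(out, sigma, axis=1)
    return out

latent = np.random.default_rng(0).standard_normal((1, 4, 16, 16))
spatial = latent_gaussian_blur(latent, sigma=1.5)
joint = latent_gaussian_blur(latent, sigma=1.5, blur_channels=True)
```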
Latent Frequency Split
Separates a latent into low-pass structure and high-frequency detail bands using a Gaussian crossover so you can process each side differently before recombining.
Add Latent Noise (Seeded)
Adds seeded Gaussian noise scaled by the input latent's standard deviation. Useful whenever a latent needs a controlled bump in noise without touching the schedule.
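A minimal sketch of std-scaled seeded noise, assuming the noise is simply `strength × std(latent) × N(0, 1)`; the parameter names are illustrative and NumPy stands in for PyTorch.

```python
import numpy as np

def add_latent_noise(latent, seed, strength):
    """Add Gaussian noise whose scale tracks the latent's own standard deviation."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(latent.shape)
    return latent + strength * latent.std() * noise

latent = np.random.default_rng(1).standard_normal((1, 4, 8, 8)) * 3.0
noisy_a = add_latent_noise(latent, seed=42, strength=0.25)
noisy_b = add_latent_noise(latent, seed=42, strength=0.25)  # same seed, same result
```

Scaling by the latent's standard deviation means a given `strength` has a comparable visual effect whether the latent is low- or high-variance.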
Latent Perlin Fractal Noise
Generates smooth octave-based Perlin noise and adds it to the latent. Tune the base frequency, octaves, persistence, and lacunarity to decide whether you want broad undulations or fine stippling, and choose between shared or per-channel fields for subtle colour-channel offsets.
Latent Simplex Noise
Produces layered simplex noise, Perlin's successor, for isotropic, artifact-free texture. Frequency, octaves, persistence, and lacunarity work just like the Perlin node, while `temporal_mode` lets you keep the same pattern across video frames (`locked`) or reseed each frame (`animated`).
Latent Worley Noise
Synthesises cellular Worley (Voronoi) patterns by scattering seeded feature points. Adjust the point count, distance metric, and jitter to morph between cracked clay, reptilian scales, or bubbly foams. Combine with animated mode to introduce evolving cellular motion in video latents.
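For intuition, a Worley field is the distance from each pixel to its nearest feature point. The sketch below scatters points in the unit square and omits jitter; the metric names (`euclidean`, `manhattan`, `chebyshev`) are assumptions, since the source does not list which metrics the node offers.

```python
import numpy as np

def worley_field(height, width, feature_points, seed, metric="euclidean"):
    """Distance from every pixel to its nearest seeded feature point."""
    rng = np.random.default_rng(seed)
    points = rng.uniform(0.0, 1.0, size=(feature_points, 2))  # (y, x) in the unit square
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, height), np.linspace(0.0, 1.0, width), indexing="ij"
    )
    dy = np.abs(ys[..., None] - points[:, 0])  # shape (height, width, points)
    dx = np.abs(xs[..., None] - points[:, 1])
    if metric == "manhattan":
        dist = dy + dx
    elif metric == "chebyshev":
        dist = np.maximum(dy, dx)
    else:
        dist = np.sqrt(dy ** 2 + dx ** 2)
    return dist.min(axis=-1)

cells = worley_field(32, 32, feature_points=12, seed=7)
diamonds = worley_field(32, 32, feature_points=12, seed=7, metric="manhattan")
```

Changing the metric changes the cell shape: Euclidean gives rounded cells, while Manhattan gives diamond-like ones.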
Latent Reaction-Diffusion
Runs a Gray–Scott reaction-diffusion simulation in latent space to generate self-organising Turing patterns. Feed, kill, and diffusion controls expose the classic Gray–Scott parameter space; `temporal_mode` toggles between a single locked pattern and re-simulating per frame.
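The Gray–Scott update itself is compact. The sketch below uses explicit Euler steps with a 5-point Laplacian on a wrap-around grid. The parameter values and seeding strategy are common textbook choices rather than the node's defaults, and how the node maps the pattern onto latent channels is not shown.

```python
import numpy as np

def laplacian(z):
    """5-point Laplacian with periodic (wrap-around) boundaries."""
    return (
        np.roll(z, 1, axis=0) + np.roll(z, -1, axis=0)
        + np.roll(z, 1, axis=1) + np.roll(z, -1, axis=1)
        - 4.0 * z
    )

def gray_scott(height, width, feed, kill, diff_u, diff_v, steps, seed):
    rng = np.random.default_rng(seed)
    u = np.ones((height, width))
    v = np.zeros((height, width))
    cy, cx = height // 2, width // 2
    u[cy - 2:cy + 3, cx - 2:cx + 3] = 0.5   # seed a small square of reagent V
    v[cy - 2:cy + 3, cx - 2:cx + 3] = 0.25
    v += 0.01 * rng.random((height, width))  # seeded symmetry-breaking noise
    for _ in range(steps):
        uvv = u * v * v                      # reaction term U + 2V -> 3V
        lap_u, lap_v = laplacian(u), laplacian(v)
        u = u + diff_u * lap_u - uvv + feed * (1.0 - u)
        v = v + diff_v * lap_v + uvv - (feed + kill) * v
    return v

pattern = gray_scott(32, 32, feed=0.037, kill=0.06,
                     diff_u=0.16, diff_v=0.08, steps=200, seed=3)
```

Small changes to `feed` and `kill` move between spots, stripes, and decaying patterns, which is what makes those two controls worth sweeping.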
Latent Fractal Brownian Motion
Wraps any of the base noise primitives (simplex, Perlin, or Worley) in an fBm stack to build rich multi-scale detail. Combine frequency, octaves, persistence, and lacunarity to decide how aggressively the layers accumulate, and pick Worley for cellular fBm, Perlin for smooth ridges, or simplex for organic grain.
Under the hood the node keeps resampling the chosen base noise at higher frequency while shrinking the amplitude (classic fBm). Each octave is normalized before it is added, so persistence becomes a clean gain control instead of inheriting whatever variance the base noise happens to produce. Worley uses the `feature_points` count scaled by frequency to keep the cell density consistent as lacunarity climbs. Once all octaves are summed, the composite field is normalized again, multiplied by the latent's standard deviation, and finally scaled by `strength` before being added back. `channel_mode` decides whether every channel shares the same Brownian field or gets an independent reseeded stack, and `temporal_mode` controls whether video batches reuse the field (`locked`) or generate fresh motion on every frame (`animated`).
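The octave loop described above can be sketched as follows. The base noise here is a simple bilinear value noise (`value_noise`, a hypothetical stand-in for the simplex, Perlin, or Worley primitives), only the shared-field channel mode is shown, and the default parameter values are illustrative.

```python
import numpy as np

def value_noise(height, width, frequency, rng):
    """Stand-in base noise: a random lattice, bilinearly upsampled."""
    cells = max(1, int(round(frequency)))
    grid = rng.standard_normal((cells + 1, cells + 1))
    ys = np.linspace(0, cells, height, endpoint=False)
    xs = np.linspace(0, cells, width, endpoint=False)
    y0, x0 = ys.astype(int), xs.astype(int)
    fy, fx = (ys - y0)[:, None], (xs - x0)[None, :]
    top = grid[y0][:, x0] * (1 - fx) + grid[y0][:, x0 + 1] * fx
    bottom = grid[y0 + 1][:, x0] * (1 - fx) + grid[y0 + 1][:, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy

def normalize(field):
    return (field - field.mean()) / (field.std() + 1e-8)

def fbm_noise(latent, seed, frequency=2.0, octaves=4,
              persistence=0.5, lacunarity=2.0, strength=0.5):
    rng = np.random.default_rng(seed)
    height, width = latent.shape[-2:]
    total = np.zeros((height, width))
    amplitude, freq = 1.0, frequency
    for _ in range(octaves):
        # Normalizing each octave makes persistence a pure gain control.
        total += amplitude * normalize(value_noise(height, width, freq, rng))
        amplitude *= persistence   # shrink amplitude per octave
        freq *= lacunarity         # raise frequency per octave
    field = normalize(total) * latent.std() * strength
    return latent + field          # one field broadcast across all channels

latent = np.random.default_rng(0).standard_normal((1, 4, 32, 32))
textured = fbm_noise(latent, seed=11, strength=0.5)
```

Because the composite is normalized before scaling, the added field's standard deviation is `strength` times the latent's, regardless of how many octaves were stacked.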
Latent Swirl Noise
Warps the latent around one or more seeded vortices using `grid_sample`. Choose between a global grid (shared across channels) or per-channel grids, set the fraction of channels to affect, and control vortex count, rotation, falloff radius, centre jitter, and blend to inject painterly whirlpools or gentle twisting motion where you want it.
Conditioning (Add Noise)
Adds seeded Gaussian noise to the CLIP conditioning embeddings (and pooled output when present). Great for introducing gentle prompt variation without rewriting text.
Sensible values of `strength` range from 0.0 to 1.5. Above this range, the Qwen model's output tends to degrade, often with Chinese characters overlaid on the image.
Conditioning (Gaussian Blur)
Smooths the token embeddings with a Gaussian kernel along the prompt sequence, softening sharp emphasis changes while preserving the overall prompt content.
Conditioning (Frequency Split)
Generates low- and high-frequency prompt embeddings so you can tame the overall narrative while separately shaping high-energy emphasis tokens.
Sensible values of `sigma` range from 0.1 to 1.3. The higher you raise `sigma`, the more of the conditioning information ends up in the `low_pass` conditioning output. Values well above 1.0 are effectively equivalent to knocking out the high-pass signal entirely. Strange and interesting results can still be obtained by combining a very high `sigma` with a very high `gain` for the `high_pass` conditioning in the `Conditioning (Frequency Merge)` node.
Conditioning (Frequency Merge)
Recombines low/high conditioning bands with adjustable gains so you can dial detail back in after sculpting each band independently. Boosting the low band reinforces the prompt's broad narrative and stability, while boosting the high band pushes sharp emphasis changes and punctuation to the forefront.
Feed the output of the `Conditioning (Frequency Split)` node into this node and then play around with the gain knobs for interesting effects. Or peel off the high- or low-pass conditioning and add noise to one or the other, or combine the high-pass conditionings of two different `CLIP Text Encode` outputs... the combinations are endless.
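One way to picture the split/merge pair is a Gaussian low-pass applied in the frequency domain along the token axis. This is an assumption chosen so that a larger `sigma` sends more of the signal to the low band, matching the behaviour described above; the nodes' actual filter is not documented here, pooled outputs are ignored, and the function names are illustrative.

```python
import numpy as np

def frequency_split(embedding, sigma):
    """Split token embeddings into low- and high-frequency bands along the token axis."""
    tokens = embedding.shape[-2]
    spectrum = np.fft.rfft(embedding, axis=-2)
    freqs = np.fft.rfftfreq(tokens)              # cycles per token, 0 .. 0.5
    mask = np.exp(-0.5 * (freqs / sigma) ** 2)   # wider sigma passes more frequencies
    low = np.fft.irfft(spectrum * mask[:, None], n=tokens, axis=-2)
    return low, embedding - low

def frequency_merge(low, high, low_gain=1.0, high_gain=1.0):
    return low_gain * low + high_gain * high

embedding = np.random.default_rng(5).standard_normal((1, 16, 8))
low, high = frequency_split(embedding, sigma=0.1)
_, high_wide = frequency_split(embedding, sigma=1.3)

restored = frequency_merge(low, high)                 # unit gains: original embedding
punchy = frequency_merge(low, high, high_gain=2.0)    # emphasise sharp token changes
```

Because the high band is defined as the residual, merging with both gains at 1.0 reproduces the input exactly, so any change in the image comes from the gains or from whatever was done to a band in between.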
Conditioning (Scale)
Multiplies the conditioning embeddings and pooled outputs by a user-defined factor so you can mute (0.0), keep (1.0), or amplify (>1.0) prompt influence without editing text.
Forward Diffusion (Add Scheduled Noise)
Uses the model's noise schedule (via `KSampler`) to add the amount of noise that corresponds to a given progress value. The result is the "anchor" latent that samplers expect when starting from a given denoise level.
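For a rectified-flow model, scheduled noising is usually a straight-line blend between the clean latent and Gaussian noise. The sketch below assumes that form and treats `sigma` as already looked up from the schedule; the mapping from the progress value to `sigma` is model-specific and not shown.

```python
import numpy as np

def forward_diffuse(latent, sigma, seed):
    """Blend a clean latent with seeded noise: (1 - sigma) * latent + sigma * noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(latent.shape)
    return (1.0 - sigma) * latent + sigma * noise

clean = np.random.default_rng(2).standard_normal((1, 4, 8, 8))
anchor = forward_diffuse(clean, sigma=0.6, seed=123)
```

At `sigma=0.0` the latent is untouched and at `sigma=1.0` it is pure noise, which is why the sampler's `denoise` should match the level used here.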
Latent Hybrid Inverter (Qwen)
Convenience node that calls the inverter and forward diffusion nodes internally, then blends their outputs using spherical interpolation. It keeps separate seeds for the creative and anchor paths and exposes the blend weight so you can bias towards either latent.
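Spherical interpolation (slerp) blends two latents along the arc between them rather than the chord, which avoids the drop in magnitude that a plain average causes. A minimal sketch, treating each latent as one flattened vector and falling back to linear interpolation when the two are nearly parallel (both choices are assumptions):

```python
import numpy as np

def slerp(a, b, weight, eps=1e-7):
    """Spherical interpolation between a (weight=0) and b (weight=1)."""
    a_flat, b_flat = a.ravel(), b.ravel()
    cos_omega = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat) + eps
    )
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < 1e-6:
        # Nearly parallel or anti-parallel vectors: slerp is ill-conditioned, so lerp.
        return (1.0 - weight) * a + weight * b
    return (np.sin((1.0 - weight) * omega) * a + np.sin(weight * omega) * b) / sin_omega

creative = np.array([1.0, 0.0])
anchor = np.array([0.0, 1.0])
blend = slerp(creative, anchor, weight=0.5)
```

For these two orthogonal unit vectors the midpoint keeps unit length, whereas a linear average would shrink it to about 0.707.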
Creative play
Image exploration rarely means chasing a single "correct" answer. These nodes are designed to be mixed and layered so you can steer a latent in different, often surprising, directions:
- Lead with `Latent Hybrid Inverter` to recover structure, then nudge that structure by adding `Conditioning (Add Noise)` at a subtle strength (≈0.05–0.15). The latent stays coherent while the prompt wanders just enough to spark new ideas.
- Follow the noise with `Conditioning (Gaussian Blur)` to soften abrupt emphasis changes or to merge multi-prompt blends into a single vibe. Blurring after noise often produces dreamlike, painterly shifts instead of chaotic drift.
- Use `Conditioning (Scale)` as a volume knob while iterating: dial the factor down to 0.3–0.5 when you want the image to respond mostly to the recovered latent, then crank it above 1.5 when the textual guidance should take the lead.
- Pair `Latent Gaussian Blur` with `Conditioning (Gaussian Blur)` for holistic smoothing: latent blur calms texture while conditioning blur calms prompt pacing. Reintroduce energy by sprinkling in `Add Latent Noise` or ramping the conditioning scale afterwards.
- Sculpt structured detail by layering `Latent Perlin Fractal Noise` (shared mode for broad waves, per-channel for colour shifts) before blending back with `Latent Swirl Noise`: increase `vortices` for multi-spiral motion, switch to per-channel grids, or drop `channel_fraction` to swirl only a handful of feature maps.
- Split a latent with `Latent Frequency Split`, sharpen or noise the high band while blurring the low band, then recombine via `Latent Mixer`/`Add` nodes to weave sharp detail onto soft composition.
- Carve prompts with `Conditioning (Frequency Split)` and return them through `Conditioning (Frequency Merge)`: keep low-band text steady for the story while remapping the high band through noise, blur, or scale to emphasize only certain fragments.
- Swap seeds between the inverter, forward diffusion, and noise nodes as you iterate. Small tweaks here can shift the interplay between latent detail and prompt guidance, rewarding playful experimentation.
- Chase naturalistic textures by stacking `Latent Simplex Noise` or the fBm node before sampling. When generating video, set `temporal_mode` to `locked` to hold the pattern steady across frames, or to `animated` when you want organic frame-to-frame flicker. For harder cellular structures, blend in `Latent Worley Noise`, and when you need life-like stripes or spots, seed `Latent Reaction-Diffusion` with slightly different feed/kill rates to explore the Gray–Scott pattern atlas.
Treat each slider as a brushstroke: push a setting until it breaks, back off to the sweet spot, and capture the happy accidents along the way.
Example workflows
Manual composition
- Encode an image to a latent.
- Feed the latent into `Qwen Rectified Flow Inverter` (pick strength/amplification/perturbation).
- Feed the clean latent into `Forward Diffusion` with the same strength but a different seed.
- Optionally clean either latent with `Latent Gaussian Blur` or `Add Latent Noise`.
- Blend the creative and anchor latents (e.g. `Latent Slerp`) and sample with `denoise` equal to the chosen strength.
Hybrid node quickstart
- Encode an image to a latent.
- Run the latent through `Latent Hybrid Inverter` with your preferred strength and blend factor.
- Sample the result with `denoise` matching the passed strength.
Combination example
Check `Examples/Combination Example.png` for a full workflow that combines the inverter, simplex and Worley noise, reaction-diffusion, and fBm nodes. The diagram illustrates how to stage noise injection around the hybrid inverter and how the new `temporal_mode`/`channel_mode` options keep video latents coherent while still layering rich structure.
Installation
- Clone or download this repository into `ComfyUI/custom_nodes/`.
- Restart ComfyUI to load the nodes (look under `Qwen/Sampling` and `Latent`).
Running the tests
The repository ships with a helper script that prepares a virtual environment and executes the pytest suite. From the project root run:
./run_tests.sh
The script creates (or reuses) a `.venv` folder, installs the dependencies from `requirements.txt`, and then launches `pytest`. Pass additional arguments to forward them to pytest; for example, `./run_tests.sh -k noise` runs only the tests whose names include `noise`.