Nunchaku ComfyUI Node. Nunchaku is the inference engine that supports SVDQuant. SVDQuant is a new post-training quantization paradigm for diffusion models that quantizes both the weights and activations of FLUX.1 to 4 bits, achieving 3.5× memory and 8.7× latency reduction on a 16GB laptop 4090 GPU. See more details: https://github.com/mit-han-lab/nunchaku
This repository provides the ComfyUI node for Nunchaku, an efficient inference engine for 4-bit neural networks quantized with SVDQuant. For the quantization library, check out DeepCompressor.
Join our user groups on Slack, Discord and WeChat for discussions—details here. If you have any questions, run into issues, or are interested in contributing, feel free to share your thoughts with us!
Please first install nunchaku following the instructions in README.md.
You can easily use comfy-cli to run ComfyUI with Nunchaku:
pip install comfy-cli # Install ComfyUI CLI
comfy install # Install ComfyUI
comfy node registry-install ComfyUI-nunchaku # Install Nunchaku
Install ComfyUI with the following commands:
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
Install ComfyUI-Manager with the following commands:
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager comfyui-manager
Launch ComfyUI
cd .. # Return to the ComfyUI root directory
python main.py
Open the Manager, search for ComfyUI-nunchaku in the Custom Nodes Manager, and then install it.
Set up ComfyUI with the following commands:
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
Clone this repository into the custom_nodes directory inside ComfyUI:
cd custom_nodes
git clone https://github.com/mit-han-lab/ComfyUI-nunchaku nunchaku_nodes
Set Up ComfyUI and Nunchaku:
Nunchaku workflows can be found at workflows. To use them, copy the files to user/default/workflows in the ComfyUI root directory:
cd ComfyUI
# Create the workflows directory if it doesn't exist
mkdir -p user/default/workflows
# Copy workflow configurations
cp custom_nodes/nunchaku_nodes/workflows/* user/default/workflows/
Install any missing nodes (e.g., comfyui-inpainteasy) by following this tutorial.
Download Required Models: Follow this tutorial to download the necessary models into the appropriate directories. Alternatively, use the following commands:
huggingface-cli download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models/text_encoders
huggingface-cli download comfyanonymous/flux_text_encoders t5xxl_fp16.safetensors --local-dir models/text_encoders
huggingface-cli download black-forest-labs/FLUX.1-schnell ae.safetensors --local-dir models/vae
Run ComfyUI: To start ComfyUI, navigate to its root directory and run python main.py. If you are using comfy-cli, simply run comfy launch.
Select the Nunchaku Workflow: Choose one of the Nunchaku workflows (workflows that start with nunchaku-) to get started. For the flux.1-fill workflow, you can use the built-in MaskEditor tool to apply a mask over an image.
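The workflow copy in the first step can also be done from Python, which may help on systems without mkdir -p and cp. The following is a minimal sketch, not part of this repository: the function name copy_nunchaku_workflows is hypothetical, and it assumes the repository was cloned as nunchaku_nodes, as in the manual installation above.

```python
import shutil
from pathlib import Path
from typing import List


def copy_nunchaku_workflows(comfy_root: Path, node_dir_name: str = "nunchaku_nodes") -> List[Path]:
    """Copy the bundled workflow files into ComfyUI's user workflow directory.

    `node_dir_name` is an assumption: it must match the folder name used
    when cloning this repository into custom_nodes.
    Returns the destination paths of the copied files.
    """
    source_dir = comfy_root / "custom_nodes" / node_dir_name / "workflows"
    target_dir = comfy_root / "user" / "default" / "workflows"

    # Same effect as `mkdir -p`: create parents, tolerate an existing directory.
    target_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for source_file in sorted(source_dir.iterdir()):
        # Like `cp workflows/*`: skip subdirectories and hidden files.
        if source_file.is_file() and not source_file.name.startswith("."):
            copied.append(Path(shutil.copy(source_file, target_dir)))
    return copied
```

Calling copy_nunchaku_workflows(Path("ComfyUI")) from the directory that contains the ComfyUI checkout should leave the workflows under ComfyUI/user/default/workflows.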
All the 4-bit models are available in our HuggingFace or ModelScope collection. Except for svdq-flux.1-t5, please download the entire model folder to models/diffusion_models.
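For scripted setups, the huggingface-cli commands in this README can be mirrored with the huggingface_hub Python package. This is a hedged sketch rather than an official installer: the names REQUIRED_FILES, MODEL_FOLDERS, model_folder_target, and download_all are made up for illustration, and the repository IDs are the ones used in this README's commands.

```python
from pathlib import Path

# (repo_id, filename, target subdirectory), taken from the CLI commands in this README.
REQUIRED_FILES = [
    ("comfyanonymous/flux_text_encoders", "clip_l.safetensors", "models/text_encoders"),
    ("comfyanonymous/flux_text_encoders", "t5xxl_fp16.safetensors", "models/text_encoders"),
    ("black-forest-labs/FLUX.1-schnell", "ae.safetensors", "models/vae"),
]

# 4-bit model repositories that are downloaded as whole folders.
MODEL_FOLDERS = ["mit-han-lab/svdq-int4-flux.1-dev"]


def model_folder_target(comfy_root, repo_id):
    """Return models/diffusion_models/<folder name> under the ComfyUI root."""
    return Path(comfy_root) / "models" / "diffusion_models" / repo_id.split("/")[-1]


def download_all(comfy_root="."):
    """Download the single files and the whole model folders listed above."""
    # Imported lazily so the lists above can be inspected without the package installed.
    from huggingface_hub import hf_hub_download, snapshot_download

    root = Path(comfy_root)
    for repo_id, filename, subdir in REQUIRED_FILES:
        hf_hub_download(repo_id=repo_id, filename=filename, local_dir=root / subdir)
    for repo_id in MODEL_FOLDERS:
        snapshot_download(repo_id=repo_id, local_dir=model_folder_target(root, repo_id))
```

Running download_all() from the ComfyUI root directory would fetch everything in one pass; edit MODEL_FOLDERS to pick a different 4-bit model from the collection.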
Note: We've renamed our nodes from 'SVDQuant XXX Loader' to 'Nunchaku XXX Loader'. Please update your workflows accordingly.
Nunchaku Flux DiT Loader: A node for loading the FLUX diffusion model.
model_path: Specifies the model's location. You need to manually download the model folder from our Hugging Face or ModelScope collection. For example, run:
huggingface-cli download mit-han-lab/svdq-int4-flux.1-dev --local-dir models/diffusion_models/svdq-int4-flux.1-dev
After downloading, set model_path to the corresponding folder name.
Note: If you rename the model folder, ensure that comfy_config.json is present in the folder. You can find this file in our corresponding repositories on Hugging Face or ModelScope.
cache_threshold: Controls the First-Block Cache tolerance, similar to residual_diff_threshold in WaveSpeed. Increasing this value improves speed but may reduce quality. A typical value is 0.12. Setting it to 0 disables the effect.
attention: Defines the attention implementation method. You can choose between flash-attention2 or nunchaku-fp16. Our nunchaku-fp16 is approximately 1.2× faster than flash-attention2 without compromising precision. For Turing GPUs (20-series), where flash-attention2 is unsupported, you must use nunchaku-fp16.
cpu_offload: Enables CPU offloading for the transformer model. While this reduces GPU memory usage, it may slow down inference.
When set to auto, it will automatically detect your available GPU memory. If your GPU has more than 14GiB of memory, offloading will be disabled. Otherwise, it will be enabled.
device_id: Indicates the GPU ID for running the model.
data_type: Defines the data type for the dequantized tensors. Turing GPUs (20-series) do not support bfloat16 and can only use float16.
i2f_mode: For Turing (20-series) GPUs, this option controls the GEMM implementation mode. The enabled and always modes exhibit minor differences. This option is ignored on other GPU architectures.
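Three of the parameters above can be made concrete with a short sketch. It is an illustration under stated assumptions, not Nunchaku's code: the function names are hypothetical, the cache check follows the relative-difference test commonly used for first-block caching (reuse the cached result when the first block's residual has changed by less than the threshold), and the non-Turing defaults are just one reasonable choice.

```python
def should_reuse_cache(prev_residual, curr_residual, cache_threshold):
    """First-block-cache style check on two equally long lists of floats.

    Returns True when the relative mean absolute change is below the
    threshold, i.e. when the remaining blocks could be skipped.
    """
    if cache_threshold <= 0 or not prev_residual:
        return False  # a threshold of 0 disables caching
    count = len(prev_residual)
    mean_abs_diff = sum(abs(c - p) for p, c in zip(prev_residual, curr_residual)) / count
    mean_abs_prev = sum(abs(p) for p in prev_residual) / count
    if mean_abs_prev == 0:
        return False
    return mean_abs_diff / mean_abs_prev < cache_threshold


def auto_cpu_offload(gpu_memory_gib):
    """The `auto` rule: offload unless the GPU has more than 14 GiB."""
    return not gpu_memory_gib > 14


def suggest_loader_settings(is_turing):
    """Pick attention and data_type values that respect the Turing limits."""
    if is_turing:
        # Turing (20-series): no flash-attention2 and no bfloat16.
        return {"attention": "nunchaku-fp16", "data_type": "float16"}
    # Assumption for illustration: other GPUs may use either option.
    return {"attention": "nunchaku-fp16", "data_type": "bfloat16"}
```

A larger cache_threshold makes should_reuse_cache return True more often, which matches the speed-versus-quality trade-off described for that parameter. To feed auto_cpu_offload, one way to obtain a memory figure is torch.cuda.get_device_properties(device_id).total_memory divided by 2**30; whether the node itself uses total or currently free memory is not stated here.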
Nunchaku FLUX LoRA Loader: A node for loading LoRA modules for SVDQuant FLUX models.
Place your LoRA checkpoints in the models/loras directory. These will appear as selectable options under lora_name.
lora_strength: Controls the strength of the LoRA module.
Nunchaku Text Encoder Loader: A node for loading the text encoders.
For FLUX, use the following files:
text_encoder1: t5xxl_fp16.safetensors (or FP8/GGUF versions of T5 encoders).
text_encoder2: clip_l.safetensors
t5_min_length: Sets the minimum sequence length for T5 text embeddings. The default in DualCLIPLoader is hardcoded to 256, but for better image quality, use 512 here.
use_4bit_t5: Specifies whether to use our quantized 4-bit T5 to save GPU memory.
int4_model: Specifies the INT4 T5 location. This option is only used when use_4bit_t5 is enabled. You can download our INT4 T5 model folder to models/text_encoders from HuggingFace or ModelScope. For example, you can run the following command:
huggingface-cli download mit-han-lab/svdq-flux.1-t5 --local-dir models/text_encoders/svdq-flux.1-t5
After downloading, specify the corresponding folder name as the int4_model.
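To illustrate what a minimum sequence length means for t5_min_length, the sketch below reads it as right-padding the tokenized prompt until it reaches that length. This is an interpretation for illustration only, not the node's implementation: the function name pad_to_min_length is hypothetical, and a pad id of 0 is assumed.

```python
def pad_to_min_length(token_ids, min_length=512, pad_id=0):
    """Right-pad a token id sequence until it has at least `min_length` entries.

    Sequences that are already long enough are returned unchanged.
    `pad_id=0` is an assumption made for this sketch.
    """
    missing = max(0, min_length - len(token_ids))
    return list(token_ids) + [pad_id] * missing
```

Under this reading, a short prompt is always presented to the model as 512 positions rather than 256, while prompts longer than the minimum are not truncated by this step.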
FLUX.1 Depth Preprocessor (deprecated): A legacy node for loading a depth estimation model and producing a corresponding depth map. The model_path parameter specifies the location of the model checkpoint. You can manually download the model repository from Hugging Face and place it under the models/checkpoints directory. Alternatively, use the following CLI command:
huggingface-cli download LiheYoung/depth-anything-large-hf --local-dir models/checkpoints/depth-anything-large-hf
Note: This node is deprecated and will be removed in a future release. Please use the updated "Depth Anything" node with the depth_anything_vitl14.pth model file instead.