ComfyUI Extension: ComfyUI-FlashVSR_Ultra_Fast

Authored by lihaoyun6

Created 2 months ago

Updated 20 days ago

566 stars

Running FlashVSR on lower VRAM without any artifacts.

ComfyUI-FlashVSR_Ultra_Fast

[📃 Chinese version]

Changelog

2025-10-24

Added a long-video pipeline that significantly reduces VRAM usage when upscaling long videos.

2025-10-22

Replaced Block-Sparse-Attention with Sparse_Sage, removing the need to compile any custom kernels.
Added support for running on RTX 50 series GPUs.

2025-10-21

Initial release of this project, introducing features such as tile_dit that significantly reduce VRAM usage.

Usage

mode:
tiny -> faster (default); full -> higher quality
scale:
4 generally gives better results; use 2 if you are low on VRAM.
color_fix:
Use a wavelet transform to correct the color of the output video.
tiled_vae:
Set to True for lower VRAM consumption during decoding at the cost of speed.
tiled_dit:
Significantly reduces VRAM usage at the cost of speed.
tile_size, tile_overlap:
Control how the input video is split into tiles: the size of each tile and the overlap between neighboring tiles.
unload_dit:
Unload DiT before decoding to reduce peak VRAM usage at the cost of speed.
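
To make tile_size and tile_overlap more concrete, here is a minimal sketch of how a frame could be cut into overlapping tiles. It is an illustration only, not the extension's actual code: the function names `tile_starts` and `tile_boxes`, the stride rule (tile_size minus tile_overlap), and the example values 256 and 24 are all assumptions.

```python
def tile_starts(length: int, tile_size: int, tile_overlap: int) -> list[int]:
    """Start offsets of the tiles that cover `length` pixels along one axis."""
    if tile_size <= tile_overlap:
        raise ValueError("tile_size must be larger than tile_overlap")
    if length <= tile_size:
        return [0]  # a single tile already covers the whole axis
    stride = tile_size - tile_overlap  # assumed rule: neighbors share `tile_overlap` pixels
    starts = list(range(0, length - tile_size, stride))
    starts.append(length - tile_size)  # align the last tile with the far edge
    return starts


def tile_boxes(width: int, height: int, tile_size: int, tile_overlap: int):
    """(x0, y0, x1, y1) boxes for every tile of a width x height frame."""
    return [
        (x, y, min(x + tile_size, width), min(y + tile_size, height))
        for y in tile_starts(height, tile_size, tile_overlap)
        for x in tile_starts(width, tile_size, tile_overlap)
    ]


if __name__ == "__main__":
    # Hypothetical 640x360 input with example settings.
    for box in tile_boxes(640, 360, tile_size=256, tile_overlap=24):
        print(box)
```

In this sketch, smaller tiles mean less memory per tile but more tiles to process, which is the speed trade-off described above; the overlap gives neighboring tiles shared pixels that can be blended to hide seams.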

Installation

nodes:

cd ComfyUI/custom_nodes
git clone https://github.com/lihaoyun6/ComfyUI-FlashVSR_Ultra_Fast.git
python -m pip install -r ComfyUI-FlashVSR_Ultra_Fast/requirements.txt

📢: For Turing or older GPUs, install triton<3.3.0:

# Windows
python -m pip install -U "triton-windows<3.3.0"
# Linux
python -m pip install -U "triton<3.3.0"
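
If you are not sure whether your GPU counts as "Turing or older", the following sketch may help. It assumes that Turing corresponds to CUDA compute capability 7.5 and that anything at or below that needs the pin; the helper name `triton_requirement` is hypothetical and not part of this extension. The live query at the bottom uses PyTorch's `torch.cuda.get_device_capability` and only runs if PyTorch and a CUDA GPU are available.

```python
import sys

TURING = (7, 5)  # compute capability of Turing GPUs (assumed cutoff for the pin)


def triton_requirement(capability, platform=sys.platform):
    """Return the pip requirement suggested for this GPU, or None if no pin is needed."""
    if tuple(capability) > TURING:
        return None  # newer than Turing: the pin above does not apply
    package = "triton-windows" if platform.startswith("win") else "triton"
    return f"{package}<3.3.0"


if __name__ == "__main__":
    try:
        import torch  # only needed for the live query below
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        req = triton_requirement(torch.cuda.get_device_capability(0))
        print(f'python -m pip install -U "{req}"' if req else "No triton pin needed.")
    else:
        print("No CUDA GPU detected; cannot decide.")
```

The printed command keeps the requirement in quotes so the shell does not treat `<` as input redirection.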

models:

Download the entire FlashVSR folder, with all the files inside it, from here and put it in ComfyUI/models:

ComfyUI/models/FlashVSR
├── LQ_proj_in.ckpt
├── TCDecoder.ckpt
├── diffusion_pytorch_model_streaming_dmd.safetensors
└── Wan2.1_VAE.pth
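
As a quick sanity check after downloading, a small script along these lines can confirm that the four files listed above are in place. This is a sketch, not part of the extension: the helper name `missing_model_files` is hypothetical, and the default path assumes the script is run from the directory that contains ComfyUI.

```python
from pathlib import Path

# File names taken from the folder listing above.
REQUIRED_FILES = (
    "LQ_proj_in.ckpt",
    "TCDecoder.ckpt",
    "diffusion_pytorch_model_streaming_dmd.safetensors",
    "Wan2.1_VAE.pth",
)


def missing_model_files(models_dir="ComfyUI/models"):
    """Return the required file names that are not present in <models_dir>/FlashVSR."""
    folder = Path(models_dir) / "FlashVSR"
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All FlashVSR model files found.")
```

Pass a different `models_dir` if your ComfyUI models folder lives elsewhere.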

Acknowledgments

FlashVSR @OpenImagingLab
Sparse_SageAttention @jt-zhang
ComfyUI @comfyanonymous