ComfyUI Extension: Reference-Based Video Colorization
Dual implementation of reference-based video colorization featuring ColorMNet (2024) with DINOv2 and Deep Exemplar (2019). Includes 4 nodes (2 video, 2 image), multiple feature encoders (VGG19, DINOv2, CLIP), advanced post-processing (color-matcher, WLS, guided, bilateral), and auto-installer for dependencies.
ComfyUI Reference-Based Video Colorization
<p align="center"> <img src="assets/Header Screenshot.png" alt="Reference-Based Video Colorization Workflow" width="100%"/> </p>

A comprehensive ComfyUI implementation featuring two state-of-the-art reference-based video colorization methods:
- ColorMNet (2024) - Modern memory-based approach with DINOv2 features
- Deep Exemplar (2019) - Classic CVPR method with temporal propagation
Transform black & white videos and images into vibrant color using reference images!
Demo
See ColorMNet in action colorizing classic black & white footage:
<video width="100%" controls> <source src="assets/ColorizationSample.mp4" type="video/mp4"> Your browser does not support the video tag. <a href="assets/ColorizationSample.mp4">Download the video</a> instead. </video>

Example: Colorized using ColorMNet with color-matching post-processing. The model transfers colors from a reference image while maintaining temporal consistency.
⨠Features
ColorMNet Nodes (New)
- ColorMNet Video Colorization - Memory-based temporal coherent colorization
- ColorMNet Image Colorization - Single image colorization
- DINOv2-based feature extraction for superior quality
- Multiple memory modes (balanced, low memory, high quality)
- FP16 support for faster processing
- torch.compile optimization for 15-25% speedup (PyTorch 2.0+)
- Performance reports with timing and FPS metrics
Deep Exemplar Nodes (Original)
- Deep Exemplar Video Colorization - Frame propagation for temporal consistency
- Deep Exemplar Image Colorization - Classic exemplar-based method
- WLS (Weighted Least Squares) filtering for smoother results
- Configurable lambda and sigma parameters
- torch.compile optimization for 15-25% speedup (PyTorch 2.0+)
- SageAttention for 20-30% faster attention operations
- Performance reports for benchmarking
Common Features
- Automatic model download - No manual setup required!
- Progress bars - Real-time processing feedback in ComfyUI
- Performance metrics - Compare speed and quality between methods
- Flexible resolution - Process at any resolution
- ComfyUI native - Fully integrated workflow support
Installation
Method 1: ComfyUI Manager (Recommended)
- Install ComfyUI Manager
- Open ComfyUI → Manager → Install Custom Nodes
- Search for "Deep Exemplar Video Colorization"
- Click Install
- Restart ComfyUI
Models will download automatically on first use (~700MB total).
Method 2: Manual Installation
cd ComfyUI/custom_nodes/
git clone https://github.com/jonstreeter/ComfyUI-Reference-Based-Video-Colorization.git
cd ComfyUI-Reference-Based-Video-Colorization/
pip install -r requirements.txt
Restart ComfyUI. Models download automatically on first use.
Recent changes
- Fixed ColorMNet pipeline normalization to match training (L-only input to the encoder, normalized ab masks); see the sketch after this list.
- Added auto-install for git-based CUDA deps (py-thin-plate-spline, correlation sampler) on node load.
- New sample: Workflows/ColorMNet_Image_Workflow.json for single-image colorization.
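The normalization fix concerns how frames are split into Lab channels before they reach the network. The sketch below illustrates the idea, assuming the L/50 - 1 and ab/110 scaling convention common to exemplar-based colorization codebases (the exact constants and helpers in this extension may differ):

```python
import numpy as np
from skimage import color

def split_and_normalize(rgb):
    """Split an RGB frame into a normalized L channel and ab channels (illustrative sketch)."""
    lab = color.rgb2lab(rgb)                 # L in [0, 100], ab roughly in [-110, 110]
    l_norm = lab[..., 0:1] / 50.0 - 1.0      # L-only input handed to the feature encoder
    ab_norm = lab[..., 1:3] / 110.0          # normalized ab values used for reference masks
    return l_norm.astype(np.float32), ab_norm.astype(np.float32)
```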
ColorMNet git-based dependencies
ColorMNet uses two CUDA extensions that are shipped as git repos, not PyPI packages. The node now installs them automatically when it loads, so no manual steps are required. If your environment blocks installs and you need to do it yourself, the commands are:
pip install git+https://github.com/cheind/py-thin-plate-spline.git
pip install git+https://github.com/ClementPinard/Pytorch-Correlation-extension.git
If compilation fails on Windows, install the Desktop development with C++ workload in Visual Studio Build Tools and ensure CUDA_HOME points to your CUDA Toolkit path. The nodes will still run without these extensions, just a bit slower.
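For reference, the install-on-load behavior can be pictured as the sketch below (the helper is hypothetical, not the extension's exact code; the module names for the two packages are assumptions):

```python
import importlib.util
import subprocess
import sys

# Assumed module names for the two git-based packages listed above.
GIT_DEPS = {
    "thinplate": "git+https://github.com/cheind/py-thin-plate-spline.git",
    "spatial_correlation_sampler": "git+https://github.com/ClementPinard/Pytorch-Correlation-extension.git",
}

def ensure_git_deps():
    """Install missing git-based dependencies at node load time (illustrative sketch)."""
    for module_name, url in GIT_DEPS.items():
        if importlib.util.find_spec(module_name) is None:
            try:
                subprocess.check_call([sys.executable, "-m", "pip", "install", url])
            except subprocess.CalledProcessError:
                print(f"Optional dependency {module_name} failed to build; continuing without it.")
```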
Quick Start
Example Workflow
Load the example workflow from Workflows/Colorize_Video_Workflow.json:
- Load Video - Use VHS Video Loader to load your grayscale video
- Load Reference - Load a color reference image
- Choose Method - Try both ColorMNet and Deep Exemplar
- Compare Results - Use the performance reports to benchmark
- Save Output - Export colorized video with VHS Video Combine
For single-frame work, use the lightweight image workflow at Workflows/ColorMNet_Image_Workflow.json. It loads a target image, loads a reference image, runs the ColorMNetImage node, and saves the result.
Nodes Overview
ColorMNet Nodes
ColorMNet Video (ColorMNet/Video)
- video_frames - Input grayscale video frames
- reference_image - Color reference image
- target_width / target_height - Output resolution
- memory_mode - balanced | low_memory | high_quality
- use_fp16 - Enable FP16 for speed (default: True)
- use_torch_compile - Enable PyTorch 2.0+ compilation (default: False)
- Outputs: colorized_frames, performance_report
ColorMNet Image (ColorMNet/Image)
- image - Input grayscale image
- reference_image - Color reference image
- target_width / target_height - Output resolution
- use_fp16 - Enable FP16 (default: True)
- use_torch_compile - Enable PyTorch 2.0+ compilation (default: False)
- Outputs: colorized_image, performance_report
Deep Exemplar Nodes
Deep Exemplar Video (DeepExemplar/Video)
- video_frames - Input grayscale video frames
- reference_image - Color reference image
- target_width / target_height - Output resolution
- frame_propagate - Use temporal consistency (default: True)
- use_half_resolution - Process at half resolution (default: True)
- use_torch_compile - Enable PyTorch 2.0+ compilation (default: False)
- use_sage_attention - Enable SageAttention optimization (default: False)
- Outputs: colorized_frames, performance_report
Deep Exemplar Image (DeepExemplar/Image)
- image_to_colorize - Input grayscale image
- reference_image - Color reference image
- target_width / target_height - Output resolution
- use_torch_compile - Enable PyTorch 2.0+ compilation (default: False)
- use_sage_attention - Enable SageAttention optimization (default: False)
- Outputs: colorized_image, performance_report
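To show how these parameters surface in a ComfyUI custom node, here is a hedged skeleton for the ColorMNet video node (the class name, defaults, and body are illustrative; only the parameter list mirrors the description above):

```python
class ColorMNetVideoNode:
    """Illustrative ComfyUI node skeleton; not the extension's actual class."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "video_frames": ("IMAGE",),
                "reference_image": ("IMAGE",),
                "target_width": ("INT", {"default": 768, "min": 64, "max": 4096}),
                "target_height": ("INT", {"default": 432, "min": 64, "max": 4096}),
                "memory_mode": (["balanced", "low_memory", "high_quality"],),
                "use_fp16": ("BOOLEAN", {"default": True}),
                "use_torch_compile": ("BOOLEAN", {"default": False}),
            }
        }

    RETURN_TYPES = ("IMAGE", "STRING")
    RETURN_NAMES = ("colorized_frames", "performance_report")
    FUNCTION = "colorize"
    CATEGORY = "ColorMNet/Video"

    def colorize(self, video_frames, reference_image, target_width, target_height,
                 memory_mode, use_fp16, use_torch_compile):
        # The real node runs the ColorMNet pipeline here; this stub just echoes its inputs.
        return (video_frames, "performance report text")
```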
Performance Reports
All nodes output optional performance reports with timing metrics:
Example ColorMNet Video Report
ColorMNet Video Colorization Report
==================================================
Frames Processed: 240
Resolution: 768x432
Total Time: 45.23 seconds
Average FPS: 5.31
Time per Frame: 0.188 seconds
Memory Mode: balanced
FP16 Enabled: True
Torch Compile: False
==================================================
Example Deep Exemplar Video Report
Deep Exemplar Video Colorization Report
==================================================
Frames Processed: 240
Resolution: 768x432
Total Time: 52.34 seconds
Average FPS: 4.59
Time per Frame: 0.218 seconds
Frame Propagation: Enabled
Half Resolution: Enabled
WLS Filter: Enabled
Lambda: 500.0
Sigma Color: 4.0
Torch Compile: False
SageAttention: False
==================================================
Connect the performance_report output to a text display node or save to file for benchmarking!
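The headline numbers are simple ratios of frame count and wall-clock time; the snippet below reproduces the ColorMNet example above:

```python
frames = 240
total_time = 45.23                    # seconds

fps = frames / total_time             # 5.31 average FPS
time_per_frame = total_time / frames  # 0.188 seconds per frame
print(f"Average FPS: {fps:.2f}, Time per Frame: {time_per_frame:.3f} s")
```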
Tips & Best Practices
Reference Image Selection
- Choose references semantically similar to your content
- Match the color palette you want to achieve
- Higher quality references = better results
- Try multiple references to find the best match
Resolution Settings
- ColorMNet: Processes at full target resolution
- Deep Exemplar: Internally uses half resolution by default
- Start with 768x432 for good speed/quality balance
- Increase for final high-quality renders
Memory Management
ColorMNet Memory Modes:
- balanced - Good quality, moderate memory (recommended)
- low_memory - Reduced memory usage, slight quality trade-off
- high_quality - Best quality, higher memory requirements
Deep Exemplar:
- Enable use_half_resolution to reduce memory
- Disable frame_propagate for independent frame processing
- Process shorter clips if encountering OOM errors
Quality vs Speed
For Best Quality:
- ColorMNet: memory_mode=high_quality, use_fp16=False, use_torch_compile=False
- Deep Exemplar: use_half_resolution=False, use_torch_compile=False, use_sage_attention=False
For Best Speed:
- ColorMNet: memory_mode=low_memory, use_fp16=True, use_torch_compile=True
- Deep Exemplar: use_half_resolution=True, use_torch_compile=True, use_sage_attention=True
Optimization Notes:
- torch.compile requires PyTorch 2.0+ and provides a 15-25% speedup after the initial warmup
- use_sage_attention (Deep Exemplar only) provides 20-30% faster attention when sageattention is installed
- Both optimizations maintain identical quality to the non-optimized versions
Advanced Configuration
Custom Model Paths
Models are automatically downloaded to:
custom_nodes/ComfyUI-Reference-Based-Video-Colorization/
├── checkpoints/
│   ├── DINOv2FeatureV6_LocalAtten_s2_154000.pth  # ColorMNet (~500MB)
│   └── video_moredata_l1/
│       ├── nonlocal_net_iter_76000.pth           # Deep Exemplar
│       └── colornet_iter_76000.pth               # Deep Exemplar
└── data/
    ├── vgg19_conv.pth                            # Shared VGG19
    └── vgg19_gray.pth                            # Deep Exemplar VGG
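Conceptually, the automatic download just checks whether each file exists and fetches it if missing; a minimal sketch of that pattern (the URL below is a placeholder, not the real download location):

```python
import os
import torch

def ensure_checkpoint(path, url):
    """Download a checkpoint file if it is not already on disk (illustrative sketch)."""
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        torch.hub.download_url_to_file(url, path, progress=True)
    return path

# Placeholder URL for illustration only; the extension resolves its own model URLs.
ensure_checkpoint(
    "checkpoints/DINOv2FeatureV6_LocalAtten_s2_154000.pth",
    "https://example.com/DINOv2FeatureV6_LocalAtten_s2_154000.pth",
)
```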
Optimizations
torch.compile (PyTorch 2.0+): Provides 15-25% speedup through graph compilation and optimization.
- Available for all 4 nodes (ColorMNet + Deep Exemplar)
- Enable via the use_torch_compile=True parameter
- First run includes warmup compilation (slower); subsequent runs benefit from the speedup
- No additional installation required if using PyTorch 2.0+
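Under the hood this amounts to wrapping the model when the flag is set; a minimal sketch of the pattern, assuming PyTorch 2.0+ (the helper name is illustrative):

```python
import torch

def maybe_compile(model, use_torch_compile):
    """Wrap a model with torch.compile when requested and available (illustrative helper)."""
    if use_torch_compile and hasattr(torch, "compile"):
        # The first forward pass triggers graph capture/compilation (warmup); later calls reuse it.
        return torch.compile(model)
    return model
```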
SageAttention (Deep Exemplar only): INT8-quantized attention for 20-30% faster attention operations.
Installation:
pip install sageattention
Requirements:
- CUDA-capable GPU
- PyTorch with CUDA support
- Enable via the use_sage_attention=True parameter
- Automatically falls back to standard attention if unavailable
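The fallback can be pictured as the hedged sketch below: import the SageAttention kernel if present, otherwise use PyTorch's scaled_dot_product_attention (the sageattn entry point is an assumption; the extension's internal names may differ):

```python
import torch.nn.functional as F

try:
    from sageattention import sageattn  # assumed entry point of the sageattention package
    HAS_SAGE = True
except ImportError:
    HAS_SAGE = False

def attention(q, k, v, use_sage_attention=False):
    """Dispatch to SageAttention when requested and installed, else standard SDPA."""
    if use_sage_attention and HAS_SAGE:
        return sageattn(q, k, v)  # INT8-quantized attention kernel
    return F.scaled_dot_product_attention(q, k, v)
```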
CUDA Correlation Sampler (Optional for ColorMNet): ColorMNet can use optimized CUDA correlation operations if available.
Requirements:
- CUDA Toolkit installed
- Visual Studio Build Tools (Windows)
- Will be attempted automatically on first run
OpenCV Contrib (Required for WLS):
pip install opencv-contrib-python
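WLS filtering smooths the predicted ab channels while using the grayscale L channel as an edge-preserving guide. A hedged sketch using opencv-contrib's fast global smoother with the lambda/sigma values from the report above (the extension's exact filter calls may differ):

```python
import cv2
import numpy as np

def wls_smooth_ab(l_channel_u8, ab_channels, lambda_value=500.0, sigma_color=4.0):
    """Edge-aware smoothing of HxWx2 float32 ab channels guided by an HxW uint8 L channel."""
    wls = cv2.ximgproc.createFastGlobalSmootherFilter(l_channel_u8, lambda_value, sigma_color)
    smoothed = [wls.filter(np.ascontiguousarray(ab_channels[..., c])) for c in range(2)]
    return np.stack(smoothed, axis=-1)
```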
Documentation
- Architecture - Technical implementation details
- Performance - Benchmarks and optimization guide
- Quick Start - Detailed getting started guide
- Migration Guide - Upgrading from older versions
Citation
If you use these methods in your research, please cite the original papers:
ColorMNet
@article{yang2024colormnet,
title={ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization},
author={Yang, Yixin and Dong, Jiangxin and Tang, Jinhui and Pan, Jinshan},
journal={arXiv preprint arXiv:2404.06251},
year={2024}
}
Deep Exemplar
@inproceedings{zhang2019deep,
title={Deep exemplar-based video colorization},
author={Zhang, Bo and He, Mingming and Liao, Jing and Sander, Pedro V and Yuan, Lu and Bermak, Amine and Chen, Dong},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={8052--8061},
year={2019}
}
Related Projects
- Original ColorMNet: https://github.com/yyang181/colormnet
- Original Deep Exemplar: https://github.com/zhangmozhe/Deep-Exemplar-based-Video-Colorization
- ComfyUI: https://github.com/comfyanonymous/ComfyUI
- Bringing Old Photos Back to Life: https://github.com/microsoft/Bringing-Old-Photos-Back-to-Life
License
This ComfyUI implementation is licensed under the MIT License.
Note on Model Licenses:
- ColorMNet model: CC BY-NC-SA 4.0 (Non-commercial use only)
- Deep Exemplar model: Subject to original project license
Acknowledgments
- ColorMNet implementation based on yyang181/colormnet
- Deep Exemplar implementation based on zhangmozhe/Deep-Exemplar-based-Video-Colorization
- Thanks to the ComfyUI community for feedback and testing
Issues & Support
Found a bug or have a feature request? Please open an issue on GitHub!
For general questions:
- Check the documentation
- Review existing issues
- Open a new issue with details
Roadmap
- [ ] Batch processing optimization
- [ ] Memory-efficient streaming for long videos
- [ ] Additional colorization methods
- [ ] Color transfer utilities
- [ ] Integration with other ComfyUI nodes
Star this repo if you find it useful!