# Universal NN Latent Upscaler for ComfyUI
⚠️ **Work in Progress:** This project is actively being developed. The current models work, but we're training improved versions for better quality!
A universal neural network latent upscaler that supports SD1.5, SDXL, Flux, and Wan2.2 models. Uses trained neural networks instead of simple interpolation for higher quality latent upscaling.
Built upon the excellent foundation of Ttl's ComfyUi_NNLatentUpscale - this project extends the original work with universal model support and improved architectures.
## Features
- **Universal Support:** Works with SD1.5, SDXL, Flux, and Wan2.2 models
- **Neural Network Upscaling:** Significantly better quality than bilinear interpolation
- **Configurable Scale Factor:** 1.0x to 2.0x upscaling with smooth interpolation
- **Automatic Model Loading:** Detects and loads the correct model for each architecture
- **Memory Efficient:** Smart model loading/unloading to save VRAM
- **Custom Wan2.2 Model:** Includes our own trained Wan2.2 upscaler (more models coming!)
- **Easy Integration:** Drop-in replacement for standard latent upscaling
## Installation
### Method 1: Git Clone (Recommended)
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/yourusername/wan_nn_latent.git
```
### Method 2: Manual Download
- Download the repository as a ZIP
- Extract it to `ComfyUI/custom_nodes/wan_nn_latent/`
- Restart ComfyUI
### Requirements
- ComfyUI (latest version recommended)
- PyTorch with CUDA support (for GPU acceleration)
- Python 3.8+
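A quick way to check the PyTorch and CUDA requirements from Python:

```python
import torch

# Prints the installed PyTorch version and whether a CUDA device is usable.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```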
## Usage
- **Add the node:** Add Node -> latent -> Universal NN Latent Upscale
- **Connect your latent:** Connect any latent tensor as input
- **Select model type:** Choose SD 1.5, SDXL, Flux, or Wan2.2
- **Set upscale factor:** 1.0x to 2.0x (1.5x recommended)
- **Connect output:** Use the upscaled latent in your workflow
### Example Workflow
```
[VAE Encode] -> [Universal NN Latent Upscale] -> [Your Model] -> [VAE Decode]
```
## Current Model Performance
Wan2.2 Results (v1.0 - Current):
- MSE: 0.1038 (vs 0.1054 bilinear) - 1.5% improvement
- PSNR: 9.84 dB (vs 9.77 dB bilinear) - 0.07 dB improvement
- SSIM: 0.3247 (vs 0.2690 bilinear) - 20.7% improvement
**In Development:** Training new models with 2,852 real photo samples for significantly better quality!
## Model Files
The following model files are included:
| Model | File | Size | Status | Quality |
|-------|------|------|--------|---------|
| SD1.5 | `sd15_resizer.pt` | 12.6 MB | ✅ Stable | Good |
| SDXL | `sdxl_resizer.pt` | 12.6 MB | ✅ Stable | Good |
| Flux | `flux_resizer.pt` | 25.3 MB | ✅ Stable | Good |
| Wan2.2 | `wan2.2_resizer_best.pt` | 3.9 MB | ⚠️ WIP | Improving |
**Note:** Models are automatically downloaded on first use if not present.
## Technical Details
Scale Factors (Auto-detected):
- SD1.5/SDXL: 0.13025 (4-channel latents)
- Flux: 0.3611 (16-channel latents)
- Wan2.2: 0.3604 (16-channel latents, empirically determined)
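These constants rescale latent magnitudes into the range the networks were trained on. A minimal sketch of the usual pattern (multiply in, divide out), assuming a resizer module that accepts a `scale=` keyword; the actual constants and call site live in `latent_resizer.py` and `nn_upscale.py`:

```python
import torch

# Hypothetical mirror of the auto-detected constants listed above.
LATENT_SCALES = {"SD 1.5": 0.13025, "SDXL": 0.13025,
                 "Flux": 0.3611, "Wan2.2": 0.3604}

def upscale_latent(samples: torch.Tensor, resizer: torch.nn.Module,
                   model_type: str, scale: float = 1.5) -> torch.Tensor:
    """Normalize the latent, run the NN upscaler, then de-normalize."""
    s = LATENT_SCALES[model_type]
    out = resizer(samples * s, scale=scale)  # resizer signature is assumed
    return out / s
```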
Model Architectures:
- SD1.5/SDXL: 4 channels → 128 hidden → 4 channels
- Flux/Wan2.2: 16 channels → 256 hidden → 16 channels
Neural Network Design:
- **Encoder:** 3-layer CNN feature extractor with ReLU activation
- **Upsampler:** Transpose convolution + refinement layers
- **Skip Connections:** Residual learning for detail preservation
- **Adaptive Scaling:** Supports any scale factor between 1.0x and 2.0x
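To make these bullets concrete, here is an illustrative PyTorch sketch of a resizer in this style, using the 16-channel / 256-hidden configuration. It is not the actual network, which lives in `latent_resizer.py` and may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentResizerSketch(nn.Module):
    """Illustrative only: 3-layer CNN encoder, learned upsampling,
    and a residual skip over a bilinear base."""

    def __init__(self, channels: int = 16, hidden: int = 256):
        super().__init__()
        # Encoder: 3-layer CNN feature extractor with ReLU activations
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # Upsampler: transpose convolution + refinement layers
        self.up = nn.ConvTranspose2d(hidden, hidden, 4, stride=2, padding=1)
        self.refine = nn.Sequential(
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor, scale: float = 2.0) -> torch.Tensor:
        h = self.up(self.encoder(x))  # learned 2x upsample of features
        if scale != 2.0:              # adaptive 1.0-2.0x scaling
            size = (int(x.shape[-2] * scale), int(x.shape[-1] * scale))
            h = F.interpolate(h, size=size, mode="bilinear",
                              align_corners=False)
        base = F.interpolate(x, size=h.shape[-2:], mode="bilinear",
                             align_corners=False)
        return base + self.refine(h)  # residual skip over the bilinear base
```

Predicting a residual on top of a bilinear base means the network only has to learn the high-frequency correction, which is what helps preserve detail.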
## Example Workflows
Basic Upscaling:
```
[VAE Encode] -> [Universal NN Latent Upscale] -> [Model] -> [VAE Decode]
```
Advanced Pipeline:
```
[Load Image] -> [VAE Encode] -> [Universal NN Latent Upscale] ->
[ControlNet/LoRA] -> [Model] -> [VAE Decode] -> [Save Image]
```
## Development Status
Current (v1.0):
- ✅ Basic functionality working
- ✅ All model types supported
- ✅ Stable performance
- ⚠️ Wan2.2 model needs improvement
In Progress (v2.0):
- Training an improved Wan2.2 model with 2,852 real photos
- 20,000 training steps with advanced loss functions
- Better detail preservation and artifact reduction
- Performance benchmarking against other methods
Planned (v3.0):
- Support for custom model training
- Additional model architectures
- Batch processing optimization
- Advanced configuration options
## Development & Contributing
For Developers:
Training Your Own Models:
```bash
# Generate dataset
python create_real_dataset.py

# Train model (example for Wan2.2)
python slow_long_training_20k.py

# Test model
python test_model_performance.py
```
Adding New Model Types:
- Add a configuration to `MODEL_CONFIGS` in `latent_resizer.py`
- Create a specialized resizer class if needed
- Add the model file path to `weight_paths` in `nn_upscale.py`
- Update the dropdown in `INPUT_TYPES` (see the sketch below)
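Roughly, those steps might look like the following. All names here (`my_model`, the config fields, the node class) are hypothetical placeholders; match the actual structures in `latent_resizer.py` and `nn_upscale.py`:

```python
# latent_resizer.py -- hypothetical entry; the field names are illustrative
MODEL_CONFIGS["my_model"] = {
    "channels": 16,        # latent channel count of the new architecture
    "hidden": 256,         # hidden width of the resizer
    "scale_factor": 0.36,  # latent normalization constant
}

# nn_upscale.py -- hypothetical weight path registration
weight_paths["my_model"] = "models/my_model_resizer.pt"

# nn_upscale.py -- expose the new option in the node dropdown
class UniversalNNLatentUpscale:  # class name assumed for illustration
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "latent": ("LATENT",),
                "version": (["SD 1.5", "SDXL", "Flux", "Wan2.2",
                             "my_model"],),
                "upscale": ("FLOAT", {"default": 1.5, "min": 1.0,
                                      "max": 2.0, "step": 0.01}),
            }
        }
```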
Contributing:
- Bug reports and feature requests welcome
- Pull requests for improvements appreciated
- Share your training results and model improvements
- Documentation improvements always helpful
## Troubleshooting
Common Issues:
| Issue | Solution |
|-------|----------|
| Model not found | Models auto-download on first use; check your internet connection |
| CUDA out of memory | Reduce the batch size in your workflow or use CPU mode |
| Wrong model selected | Ensure the latent type matches the selected model (check channels) |
| Slow performance | Enable GPU acceleration; check your CUDA installation |
| Quality issues | Try different scale factors; ensure the correct model type |
Debug Mode:
```python
# In nn_upscale.py, set:
force_reload = True  # reloads the model on every execution
```
Getting Help:
- Check this README and the troubleshooting section
- Open an issue on GitHub
- Join the ComfyUI Discord community
- Contact: [[email protected]]
## Benchmarks & Comparisons
| Method | SSIM ↑ | PSNR ↑ | MSE ↓ | Speed |
|--------|--------|--------|-------|-------|
| Bilinear | 0.2690 | 9.77 dB | 0.1054 | ⚡⚡⚡ |
| Our NN (v1.0) | 0.3247 | 9.84 dB | 0.1038 | ⚡⚡ |
| Our NN (v2.0) | Training... | Training... | Training... | ⚡⚡ |

Higher SSIM and PSNR mean better quality; lower MSE means better accuracy.
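For reference, these metrics can be reproduced with standard tooling; a minimal sketch using scikit-image, assuming decoded images as float arrays in [0, 1]:

```python
import numpy as np
from skimage.metrics import (mean_squared_error, peak_signal_noise_ratio,
                             structural_similarity)

def benchmark(reference: np.ndarray, upscaled: np.ndarray) -> dict:
    """Compare two same-shaped HxWxC float images on the three metrics."""
    return {
        "MSE": mean_squared_error(reference, upscaled),
        "PSNR": peak_signal_noise_ratio(reference, upscaled, data_range=1.0),
        "SSIM": structural_similarity(reference, upscaled,
                                      channel_axis=-1, data_range=1.0),
    }
```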
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- **Primary Foundation:** Ttl's ComfyUi_NNLatentUpscale. This project builds directly upon Ttl's excellent work and neural network architecture; the core upscaling approach, model structure, and ComfyUI integration patterns are based on their pioneering implementation.
- **Additional Inspiration:** naripok's contributions to the NN latent upscaling community
- **ComfyUI Team:** For the amazing framework that makes this possible
- **Model Teams:** Wan2.2, Flux, and Stability AI for their incredible models
- **Community:** The ComfyUI Discord community for feedback and support
- **Datasets:** DIV2K and COCO2017 for training data
### Special Thanks to Ttl
This project would not exist without Ttl's groundbreaking work on neural network latent upscaling. Their original implementation provided:
- The core neural network architecture for latent upscaling
- ComfyUI node integration patterns
- Training methodologies and loss functions
- Proof of concept that NN upscaling significantly outperforms bilinear interpolation
Our contribution extends this foundation with universal model support, improved training data, and enhanced architectures while maintaining compatibility with the original approach.
## Links
- ComfyUI
- Documentation
- Issues
- Discussions
<div align="center">
Made with ❤️ for the ComfyUI community

If this project helps you, please consider giving it a ⭐!
</div>