    🎨 HunyuanImage-3.0 ComfyUI Custom Nodes

    License: CC BY-NC 4.0 | Python 3.10+ | ComfyUI

    Professional ComfyUI custom nodes for Tencent HunyuanImage-3.0, the powerful 80B parameter native multimodal image generation model.

    šŸ™ Acknowledgment: This project integrates the HunyuanImage-3.0 model developed by Tencent Hunyuan Team and uses their official system prompts. The model and original code are licensed under Apache 2.0. This integration code is separately licensed under CC BY-NC 4.0 for non-commercial use.

    🎯 Features

    • Multiple Loading Modes: Full BF16, NF4 Quantized, Single GPU, Multi-GPU
    • Smart Memory Management: Automatic VRAM tracking, cleanup, and optimization
    • High-Quality Image Generation:
      • Standard generation (<2MP) - Fast, GPU-only
      • Large image generation (2MP-8MP+) - CPU offload support
    • Advanced Prompting:
      • Optional prompt enhancement using official HunyuanImage-3.0 system prompts
      • Supports any OpenAI-compatible LLM API (DeepSeek, OpenAI, Claude, local LLMs)
      • Two professional rewriting modes: en_recaption (structured) and en_think_recaption (advanced)
    • Professional Resolution Control:
      • Organized dropdown with portrait/landscape/square labels
      • Megapixel indicators and size categories
      • Auto resolution detection based on prompt
    • Production Ready: Comprehensive error handling, detailed logging, VRAM monitoring

    📦 Installation

    Prerequisites

    • ComfyUI installed and working
    • NVIDIA GPU with CUDA support
    • Minimum 24GB VRAM for NF4 quantized model
    • Minimum 80GB VRAM (or multi-GPU) for full BF16 model
    • Python 3.10+
    • PyTorch 2.7+ with CUDA 12.8+

    Quick Install

    1. Clone this repository into your ComfyUI custom nodes folder:
    cd ComfyUI/custom_nodes
    git clone https://github.com/ericRollei/Eric_Hunyuan3.git
    
    2. Install dependencies:
    cd Eric_Hunyuan3
    pip install -r requirements.txt
    
    3. Download model weights:

    Option A: Full BF16 Model (~80GB)

    # Download to ComfyUI/models/
    cd ../../models
    huggingface-cli download tencent/HunyuanImage-3.0 --local-dir HunyuanImage-3
    

    Option B: Quantize to NF4 (~20GB) - Recommended for single GPU <96GB

    # First download full model, then quantize
    cd path/to/Eric_Hunyuan3/quantization
    python hunyuan_quantize_nf4.py \
      --model-path "../../models/HunyuanImage-3" \
      --output-path "../../models/HunyuanImage-3-NF4"
    
    4. Restart ComfyUI (see the optional pre-flight check below)
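
    Optional pre-flight check (illustrative, not part of the node pack): run this from the ComfyUI root to confirm the weights folder is in place and that PyTorch can see your GPU. Adjust the folder name to whichever model you downloaded (HunyuanImage-3 or HunyuanImage-3-NF4).

    # preflight_check.py - illustrative sanity check, run from the ComfyUI root
    from pathlib import Path
    import torch

    model_dir = Path("models/HunyuanImage-3")  # or models/HunyuanImage-3-NF4
    print("Model folder exists:", model_dir.is_dir())
    print("CUDA available:", torch.cuda.is_available())
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GB VRAM")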

    🚀 Usage

    Node Overview

    | Node Name | Purpose | VRAM Required | Speed |
    |-----------|---------|---------------|-------|
    | Hunyuan 3 Loader (NF4) | Load quantized model | ~45GB | Fast load |
    | Hunyuan 3 Loader (Full BF16) | Load full precision model | ~80GB | Moderate |
    | Hunyuan 3 Loader (Full BF16 GPU) | Single GPU with memory control | ~75GB+ | Moderate |
    | Hunyuan 3 Loader (Multi-GPU BF16) | Distribute across GPUs | 80GB total | Fast |
    | Hunyuan 3 Loader (88GB GPU Optimized) | For RTX 6000 Ada/Blackwell | ~75GB | Fastest |
    | Hunyuan 3 Generate | Standard generation (<2MP) | Varies | Fast ⚡ |
    | Hunyuan 3 Generate (Large/Offload) | Large images (2-8MP+) | Varies | Moderate |
    | Hunyuan 3 Unload | Free VRAM | - | Instant |
    | Hunyuan 3 GPU Info | Diagnostic/GPU detection | - | Instant |

    Basic Workflow

    ┌───────────────────────────────────┐
    │ Hunyuan 3 Loader (NF4)            │
    │  model_name: HunyuanImage-3-NF4   │
    │  keep_in_cache: True              │
    └─────────────────┬─────────────────┘
                      │ HUNYUAN_MODEL
                      ▼
    ┌───────────────────────────────────┐
    │ Hunyuan 3 Generate                │
    │  prompt: "..."                    │
    │  steps: 50                        │
    │  resolution: 1024x1024            │
    │  guidance_scale: 7.5              │
    └─────────────────┬─────────────────┘
                      │ IMAGE
                      ▼
               ┌──────────────┐
               │ Save Image   │
               └──────────────┘

    Advanced: Prompt Rewriting

    ┌─────────────────────────────────┐
    │ Hunyuan 3 Generate              │
    │  prompt: "dog running"          │
    │  enable_prompt_rewrite: True    │
    │  rewrite_style: universal       │
    │  deepseek_api_key: "sk-..."     │
    └─────────────────────────────────┘

    Result: Automatically expands to:

    "An energetic brown and white border collie running across a sun-drenched meadow filled with wildflowers, motion blur on legs showing speed, golden hour lighting, shallow depth of field, professional photography, high detail, 8k quality"

    Large Image Generation

    For high-resolution outputs (2K, 4K, 6MP+):

    ┌─────────────────────────────────┐
    │ Hunyuan 3 Generate (Large)      │
    │  resolution: 3840x2160 - 4K UHD │
    │  cpu_offload: True              │
    │  steps: 50                      │
    └─────────────────────────────────┘

    šŸ“ Resolution Guide

    Standard Generation Node

    • 832x1280 - Portrait (1.0MP) [<2MP] ✅ Safe, fast
    • 1024x1024 - Square (1.0MP) [<2MP] ✅ Safe, fast
    • 1280x832 - Landscape (1.0MP) [<2MP] ✅ Safe, fast
    • 1536x1024 - Landscape (1.5MP) [<2MP] ✅ Safe, fast
    • 2048x2048 - Square (4.0MP) [>2MP] ⚠️ May OOM

    Large/Offload Node

    • 2560x1440 - Landscape 2K (3.7MP) ✅ With CPU offload
    • 3840x2160 - Landscape 4K UHD (8.3MP) ✅ With CPU offload
    • 3072x2048 - Landscape 6MP (6.3MP) ✅ With CPU offload

    Tip: Test prompts at small resolutions (fast), then render finals with the Large/Offload node.
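
    The dropdown labels above already encode the cutoff, but if you are choosing sizes programmatically the rule is simple arithmetic: width x height under roughly 2,000,000 pixels fits the standard node; anything larger belongs on the Large/Offload node. A small illustrative helper (not part of the node pack):

    # Illustrative helper: decide which Generate node a resolution belongs on.
    def megapixels(width: int, height: int) -> float:
        return width * height / 1_000_000

    def pick_node(width: int, height: int) -> str:
        # ~2MP is the cutoff used throughout this README
        if megapixels(width, height) < 2.0:
            return "Hunyuan 3 Generate"
        return "Hunyuan 3 Generate (Large/Offload)"

    print(pick_node(1024, 1024))  # -> Hunyuan 3 Generate (1.0MP)
    print(pick_node(3840, 2160))  # -> Hunyuan 3 Generate (Large/Offload) (8.3MP)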

    🔧 Configuration

    Memory Management

    Single GPU (24-48GB VRAM):

    Use: Hunyuan 3 Loader (NF4)
    Settings:
      - keep_in_cache: True (for multiple generations)
      - Use standard Generate node for <2MP
    

    Single GPU (80-96GB VRAM):

    Use: Hunyuan 3 Loader (88GB GPU Optimized)
    Settings:
      - reserve_memory_gb: 14.0 (leaves room for inference)
      - Full BF16 quality
    

    Multi-GPU Setup:

    Use: Hunyuan 3 Loader (Multi-GPU BF16)
    Settings:
      - primary_gpu: 0 (where inference runs)
      - reserve_memory_gb: 12.0
      - Automatically distributes across all GPUs
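
    Before picking a loader (or a reserve_memory_gb value), it can help to see how much VRAM each GPU actually has free. This is a generic PyTorch check, not a node in this pack:

    # Illustrative: per-GPU free/total memory, run in ComfyUI's Python environment
    import torch

    for i in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(i)  # returns bytes
        print(f"GPU {i}: {free / 1024**3:.1f} GB free of {total / 1024**3:.1f} GB")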
    

    LLM Prompt Rewriting (Optional)

    ✨ Feature: Uses official HunyuanImage-3.0 system prompts to professionally expand your prompts for better results.

    ⚠️ Note: Prompt rewriting needs an LLM API (hosted APIs are typically paid; a local OpenAI-compatible server also works). This feature is optional; you can use the nodes without it.

    Supported APIs (any OpenAI-compatible endpoint):

    • DeepSeek (default, recommended for cost)
    • OpenAI GPT-4/GPT-3.5
    • Claude (via OpenAI-compatible proxy)
    • Local LLMs (via LM Studio, Ollama with OpenAI API)

    Setup Example (DeepSeek):

    1. Get API key: https://platform.deepseek.com/api_keys
    2. Add credits: https://platform.deepseek.com/top_up (~$1 lasts a long time)
    3. Configure in the generation node (see the request sketch below):
      • api_key: Your API key (sk-...)
      • api_url: https://api.deepseek.com/v1/chat/completions (default)
      • model_name: deepseek-chat (default)
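
    These three settings describe a standard OpenAI-compatible chat-completion request. If you want to verify your key and endpoint outside ComfyUI first, a minimal sketch with the requests library looks like this (the system prompt here is a placeholder; the nodes use the official HunyuanImage-3.0 rewrite prompts internally):

    # Illustrative endpoint/key test, independent of the ComfyUI nodes
    import requests

    API_KEY = "sk-..."  # your DeepSeek (or other OpenAI-compatible) key
    API_URL = "https://api.deepseek.com/v1/chat/completions"

    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "deepseek-chat",
            "messages": [
                # Placeholder instruction, not the official HunyuanImage-3.0 system prompt
                {"role": "system", "content": "Expand the user's prompt into a detailed image description."},
                {"role": "user", "content": "dog running"},
            ],
        },
        timeout=60,
    )
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])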

    Rewrite Styles (Official HunyuanImage-3.0 system prompts):

    • none: (Default) Use your original prompt without modification
    • en_recaption: Structured professional expansion with detailed descriptions (recommended)
      • Adds objective, physics-consistent details
      • Enhances lighting, composition, color descriptions
      • Best for general use
    • en_think_recaption: Advanced mode with thinking phase + detailed expansion
      • LLM analyzes your intent first, then creates detailed prompt
      • More comprehensive but uses more tokens
      • Best for complex or ambiguous prompts

    Example Results:

    Original: "a cat on a table"
    
    en_recaption: "A domestic short-hair cat with orange and white fur sits 
    upright on a wooden dining table. Soft afternoon sunlight streams through 
    a nearby window, casting warm highlights on the cat's fur and creating 
    gentle shadows on the table surface. The background shows a blurred 
    kitchen interior with neutral tones. Ultra-realistic, photographic style, 
    sharp focus on the cat, shallow depth of field, 8k resolution."
    

    If you get a "402 Payment Required" error, add credits to your API account or disable prompt rewriting.

    🛠️ Quantization

    Create your own NF4 quantized model:

    cd quantization
    python hunyuan_quantize_nf4.py \
      --model-path "/path/to/HunyuanImage-3" \
      --output-path "/path/to/HunyuanImage-3-NF4"
    

    Benefits:

    • ~4x smaller (80GB → 20GB model size)
    • ~45GB VRAM usage (vs 80GB+ for BF16)
    • Minimal quality loss
    • Attention layers kept in full precision for stability
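
    The bundled hunyuan_quantize_nf4.py handles all of this for you; the sketch below only shows what NF4 loading typically looks like with transformers + bitsandbytes, and the script's actual settings (including which layers it keeps in full precision) may differ:

    # Illustrative NF4 setup with transformers/bitsandbytes - not the repo's exact script
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # NormalFloat4 weights
        bnb_4bit_compute_dtype=torch.bfloat16,  # compute in BF16
        bnb_4bit_use_double_quant=True,         # nested quantization saves a little more memory
    )

    model = AutoModelForCausalLM.from_pretrained(
        "/path/to/HunyuanImage-3",
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,  # the model ships custom code
    )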

    📊 Performance Benchmarks

    RTX 6000 Ada (48GB) - NF4 Quantized:

    • Load time: ~35 seconds
    • 1024x1024 @ 50 steps: ~4 seconds/step
    • VRAM usage: ~45GB

    2x RTX 4090 (48GB each) - Multi-GPU BF16:

    • Load time: ~60 seconds
    • 1024x1024 @ 50 steps: ~3.5 seconds/step
    • VRAM usage: ~70GB + 10GB distributed

    RTX 6000 Blackwell (96GB) - Full BF16:

    • Load time: ~25 seconds
    • 1024x1024 @ 50 steps: ~3 seconds/step
    • VRAM usage: ~80GB

    šŸ› Troubleshooting

    Out of Memory Errors

    Solutions:

    1. Use NF4 quantized model instead of full BF16
    2. Reduce resolution (pick options marked [<2MP])
    3. Lower steps (try 30-40 instead of 50)
    4. Use "Hunyuan 3 Generate (Large/Offload)" node with cpu_offload: True
    5. Run "Hunyuan 3 Unload" node before generating
    6. Set keep_in_cache: False in loader

    Pixelated/Corrupted Output

    If using NF4 quantization:

    • Re-quantize with the updated script (includes attention layer fix)
    • Old quantized models may produce artifacts

    Multi-GPU Not Detecting Second GPU

    Check:

    1. Run "Hunyuan 3 GPU Info" node
    2. Look for CUDA_VISIBLE_DEVICES environment variable
    3. Ensure ComfyUI can see all GPUs: torch.cuda.device_count() (see the snippet below)
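
    All three checks can be run in one go from ComfyUI's Python environment (illustrative):

    # Illustrative: show what the ComfyUI process can actually see
    import os
    import torch

    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES", "<not set>"))
    print("Visible GPUs:", torch.cuda.device_count())
    for i in range(torch.cuda.device_count()):
        print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")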

    Fix:

    # Remove GPU visibility restrictions
    unset CUDA_VISIBLE_DEVICES
    # Restart ComfyUI
    

    Slow Generation

    Optimizations:

    1. Use NF4 quantized model (faster than BF16)
    2. Reduce steps (30-40 is often sufficient)
    3. Keep model in cache (keep_in_cache: True)
    4. Use smaller resolutions for testing

    šŸ“ Advanced Tips

    Prompt Engineering

    Good prompts include:

    1. Subject: What is the main focus
    2. Action: What is happening
    3. Environment: Where it takes place
    4. Style: Artistic style, mood, atmosphere
    5. Technical: Lighting, composition, quality keywords

    Example:

    A majestic snow leopard prowling through a misty mountain forest at dawn,
    dappled golden light filtering through pine trees, shallow depth of field,
    wildlife photography, National Geographic style, 8k, highly detailed fur texture
    

    Note: HunyuanImage-3.0 uses an autoregressive architecture (like GPT) rather than diffusion, so it doesn't support negative prompts. Instead, be explicit in your prompt about what you want to include.

    Reproducible Results

    Set a specific seed (0-18446744073709551615) to get the same image:

    seed: 42  # Use any number, same seed = same image
    

    📚 Model Information

    HunyuanImage-3.0:

    • Architecture: Native multimodal autoregressive transformer
    • Parameters: 80B total (13B active per token)
    • Experts: 64 experts (Mixture of Experts architecture)
    • Training: Text-to-image with RLHF post-training
    • License: Apache 2.0 (see Tencent repo for details)

    Paper: HunyuanImage 3.0 Technical Report

    Official Repo: Tencent-Hunyuan/HunyuanImage-3.0

    🤝 Contributing

    Contributions welcome! Please:

    1. Fork the repository
    2. Create a feature branch
    3. Make your changes
    4. Test thoroughly
    5. Submit a pull request

    📄 License

    Dual License (Non-Commercial and Commercial Use):

    1. Non-Commercial Use: Licensed under Creative Commons Attribution-NonCommercial 4.0 International License
    2. Commercial Use: Requires separate license. Contact [email protected] or [email protected]

    See LICENSE for full details.

    Note: The HunyuanImage-3.0 model itself is licensed under Apache 2.0 by Tencent. This license only covers the ComfyUI integration code.

    Copyright (c) 2025 Eric Hiss. All rights reserved.

    šŸ™ Credits

    ComfyUI Integration

    • Author: Eric Hiss (GitHub: EricRollei)
    • License: CC BY-NC 4.0 (Non-Commercial) / Commercial License Available

    HunyuanImage-3.0 Model

    • Developer: Tencent Hunyuan Team
    • License: Apache 2.0 (model weights and original code)

    Special Thanks

    • Tencent Hunyuan Team for creating and open-sourcing the incredible HunyuanImage-3.0 model
    • ComfyUI Community for the excellent extensible framework
    • All contributors and testers

    🆘 Support

    🔄 Changelog

    v1.0.0 (2025-11-17)

    • Initial release
    • Full BF16 and NF4 quantized model support
    • Multi-GPU loading support
    • Optional prompt rewriting with DeepSeek API
    • Improved resolution organization
    • Large image generation with CPU offload
    • Comprehensive error handling and VRAM management

    Made with ❤️ for the ComfyUI community