
    ComfyUI-GGUF-VisionLM

    Run GGUF Vision Language Models locally in ComfyUI via llama.cpp. Text, image, and video analysis with a low memory footprint. Supports Qwen2.5-VL, Qwen3-VL, Llama 3, LLaVA, MiniCPM-V, Moondream, and other VLMs on consumer hardware.

    License: MIT · Python 3.8+

    ✨ Features

    • 🚀 Local Execution - Run VLMs completely locally on your hardware
    • 💾 Low Memory - GGUF quantization (Q4_K_M, Q8_0, etc.)
    • 🎯 Multi-Modal - Text generation, image analysis, video understanding
    • 📦 Auto Download - One-click model download from dropdown
    • 🤖 Smart Matching - Automatic mmproj detection and download
    • ⚙️ YAML Config - Easy model management without code changes
    • 🔄 Batch Processing - Process multiple images at once

    📦 Supported Models

    Vision Language Models

    • Qwen Series: 2.5-VL (3B/7B), 3-VL (8B)
    • Llama 3: Vision models
    • LLaVA: Multiple variants
    • MiniCPM-V: Lightweight VLMs
    • Moondream: Ultra-light models
    • And more...

    Total: 8+ model families, 16+ variants

    🚀 Quick Start

    1. Installation

    cd ComfyUI/custom_nodes
    git clone https://github.com/walke2019/ComfyUI-GGUF-VisionLM
    cd ComfyUI-GGUF-VisionLM
    pip install -r requirements.txt
    

    2. Install llama-cpp-python

    CUDA (NVIDIA GPU):

    CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
    

    CPU only:

    pip install llama-cpp-python
    

    Metal (macOS):

    CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python
    
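    To confirm that the package imports and whether your build can actually offload to the GPU, a quick check (llama_supports_gpu_offload comes from the low-level bindings, so the exact name can vary between llama-cpp-python versions):

    # Sanity check for the llama-cpp-python install
    import llama_cpp
    print(llama_cpp.__version__)                     # installed version
    print(llama_cpp.llama_supports_gpu_offload())    # True if this build can offload layers to a GPU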

    3. Usage

    1. Restart ComfyUI
    2. Add node: 🔥 Qwen2.5-VL GGUF (All-in-One)
    3. Select a model from the dropdown:
      • Local models appear as plain filenames
      • Downloadable models are prefixed with [⬇️ Series]
    4. Connect image input and execute

    That's it! Models auto-download on first use.

    📖 Nodes

    🔥 Qwen2.5-VL GGUF (All-in-One)

    All-in-one node combining model loading and inference; a rough sketch of the equivalent llama-cpp-python call follows the parameter list below.

    Key Parameters:

    • model - Select from dropdown (local or downloadable)
    • image - Input image
    • prompt - Description prompt (default: "Describe this image in detail.")
    • max_tokens - Maximum generation length (default: 512)
    • temperature - Sampling temperature (default: 0.7)
    • n_ctx - Context window size (default: 4096)
    • n_gpu_layers - Number of layers to offload to the GPU (default: -1 = all)

    Output:

    • description - Generated text

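    Under the hood the node drives llama-cpp-python. The exact wiring is internal to the extension, but a minimal standalone sketch of the same idea looks roughly like this (the file paths and the LLaVA-style chat handler are assumptions; Qwen2.5-VL may require a different handler class depending on your llama-cpp-python version):

    import base64
    from llama_cpp import Llama
    from llama_cpp.llama_chat_format import Llava15ChatHandler

    # Pair the vision projector (mmproj) with the main GGUF weights, as the node does.
    chat_handler = Llava15ChatHandler(clip_model_path="models/LLM/mmproj-model.gguf")  # hypothetical path
    llm = Llama(
        model_path="models/LLM/qwen2.5-vl-7b-q4_k_m.gguf",  # hypothetical path
        chat_handler=chat_handler,
        n_ctx=4096,        # context window, same as the node default
        n_gpu_layers=-1,   # -1 = offload all layers to the GPU
    )

    # Images go in as data URIs inside an OpenAI-style chat message.
    with open("input.png", "rb") as f:
        data_uri = "data:image/png;base64," + base64.b64encode(f.read()).decode()

    result = llm.create_chat_completion(
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": data_uri}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        }],
        max_tokens=512,
        temperature=0.7,
    )
    print(result["choices"][0]["message"]["content"])
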
    Other Nodes

    • Load Qwen2.5-VL GGUF Model - Separate model loading
    • Qwen2.5-VL GGUF Describe Image - Single image description
    • Qwen2.5-VL GGUF Batch Describe - Batch processing

    🔧 Configuration

    Adding New Models

    Edit model_registry.yaml:

    vision_language_models:
      your-series:
        series_name: "Your Series"
        business_type: "image_analysis"
        models:
          - model_name: "Your-Model-7B"
            repo: "user/repo-GGUF"
            mmproj: "mmproj-file.gguf"
            variants:
              - name: "Q4_K_M"
                file: "model-q4.gguf"
                size: "~4GB"
                recommended: true
    

    Restart ComfyUI to see new models.
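
    Independent of the bundled test_registry.py, you can sanity-check the file yourself by walking the schema shown above with PyYAML (the loop only assumes the keys from the example):

    import yaml

    with open("model_registry.yaml", "r", encoding="utf-8") as f:
        registry = yaml.safe_load(f)

    # Print every configured variant so typos in the YAML surface immediately.
    for series_key, series in registry["vision_language_models"].items():
        for model in series["models"]:
            for variant in model["variants"]:
                flag = " (recommended)" if variant.get("recommended") else ""
                print(f'{series_key}: {model["model_name"]} {variant["name"]} -> {variant["file"]}{flag}')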

    Model Registry

    All models are configured in model_registry.yaml, grouped into three categories:

    • Image Analysis - Vision Language Models
    • Text Generation - Text-only models
    • Video Analysis - Video understanding models

    💡 Tips

    Performance

    • Use Q4_K_M for the best quality/speed balance
    • Set n_gpu_layers=-1 to offload all layers to the GPU
    • Enable Flash Attention for faster inference (see the sketch below)
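
    The GPU-related knobs ultimately land on the llama-cpp-python Llama constructor. A rough illustration (whether the node exposes flash_attn directly is an assumption; the flag only takes effect on builds compiled with flash attention support):

    from llama_cpp import Llama

    llm = Llama(
        model_path="models/LLM/your-model-q4_k_m.gguf",  # hypothetical path
        n_gpu_layers=-1,   # offload every layer to the GPU
        flash_attn=True,   # requires a recent llama-cpp-python / llama.cpp build
    )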

    Network

    • The first download may take a while (GGUF models are several GB)
    • Use a proxy or mirror if HuggingFace is hard to reach from your network
    • Downloads are cached and resumable (a manual-download sketch follows below)
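
    Downloads come from Hugging Face; if you prefer to fetch a file manually, huggingface_hub does the same job and also resumes interrupted transfers (the repo id, filename, and target directory below are assumptions; take the real values from model_registry.yaml):

    from huggingface_hub import hf_hub_download

    # hf_hub_download resumes interrupted transfers and caches completed files.
    path = hf_hub_download(
        repo_id="unsloth/Qwen2.5-VL-7B-Instruct-GGUF",   # hypothetical repo id
        filename="Qwen2.5-VL-7B-Instruct-Q4_K_M.gguf",   # hypothetical filename
        local_dir="ComfyUI/models/LLM",                  # hypothetical target directory
    )
    print(path)

    Setting the HF_ENDPOINT environment variable points huggingface_hub at a mirror, which covers the proxy tip above.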

    Troubleshooting

    Model not showing?

    • Check model_registry.yaml syntax
    • Run python3 test_registry.py to verify
    • Restart ComfyUI

    Download failed?

    • Check network connection
    • Verify HuggingFace repo URL
    • Check available disk space

    mmproj not found?

    • The system auto-downloads a missing mmproj file
    • Or specify it manually via the mmproj_path parameter

    📚 Documentation

    • Quick Start: See above
    • Model Registry: Edit model_registry.yaml
    • Testing: Run python3 test_registry.py

    🤝 Contributing

    Contributions welcome! To add a new model:

    1. Edit model_registry.yaml
    2. Add model configuration
    3. Test with test_registry.py
    4. Submit PR

    📄 License

    MIT License - see LICENSE file


    Made with ❤️ for the ComfyUI community

    If you find this useful, please ⭐ star the repo!