ComfyUI Extension: ComfyUI JoyCaption-Beta-GGUF Node

Authored by judian17


This project provides a node for ComfyUI to use the JoyCaption-Beta model in GGUF format for image captioning.


    Chinese documentation (中文版说明)

    Acknowledgments:

    This node is based on fpgaminer/joycaption_comfyui, with modifications to support the GGUF model format.

    Thanks also to LayerStyleAdvance, from which the code for the extra options was adapted.

    Usage

    Installation

    This node requires llama-cpp-python to be installed.

    Important:

    • Installing with pip install llama-cpp-python alone enables CPU-only inference.
    • To utilize NVIDIA GPU acceleration, install with the following command:
      pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
      
      (Adjust cu124 according to your CUDA version)
    • For non-NVIDIA GPUs or other installation methods, please refer to the official llama-cpp-python documentation: https://llama-cpp-python.readthedocs.io/en/latest/

    llama-cpp-python is intentionally omitted from requirements.txt so that users can install the build that matches their hardware and CUDA version.
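After installing, you can check which build is active. The helper below is a hypothetical sketch, not part of the node; it assumes recent llama-cpp-python releases expose llama_supports_gpu_offload() and falls back gracefully if they do not:

```python
import importlib.util

def llama_cpp_status() -> str:
    """Report whether llama-cpp-python is importable and, if so,
    whether the installed build supports GPU offload."""
    if importlib.util.find_spec("llama_cpp") is None:
        return "not installed"
    import llama_cpp
    # llama_supports_gpu_offload() is exposed by recent llama-cpp-python
    # releases; older builds may lack it, so fall back gracefully.
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        return "installed (GPU support unknown)"
    return "installed (GPU offload)" if bool(check()) else "installed (CPU only)"

print(llama_cpp_status())
```

If this reports "installed (CPU only)" after installing a CUDA wheel, the wheel index URL likely did not match your CUDA version.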

    Workflow Example

    You can view an example workflow image at assets/example.png.


    Model Download and Placement

    You need to download the JoyCaption-Beta GGUF model and the corresponding mmproj model.

    1. Download the GGUF model and the matching mmproj model from their Hugging Face repositories.

    2. Place the downloaded model files into the models\llava_gguf\ folder within your ComfyUI installation directory.

    Video Tutorial

    You can refer to the following Bilibili video tutorial for setup and usage:

    Video