ComfyUI Extension: Divergent Nodes

Authored by thedivergentai

This repository contains a collection of custom nodes for ComfyUI designed to integrate external AI models, provide utilities, and enable advanced workflows.

    Divergent Nodes - Custom ComfyUI Nodes

    Installation

    1. Clone Repository: Navigate to your ComfyUI/custom_nodes/ directory and clone this repository:

      cd ComfyUI/custom_nodes/
      git clone https://github.com/thedivergentai/divergent_nodes.git divergent_nodes
      

      (Note: If you cloned previously, you can update with git pull inside the divergent_nodes directory)

    2. Install Dependencies: Install the required Python packages:

      cd divergent_nodes
      pip install -r requirements.txt
      
    3. Set up API Key (for Gemini Node):

      • Create a file named .env in the divergent_nodes directory (this directory).
      • Add your Google AI Studio API key to the .env file in the following format:
        GEMINI_API_KEY=YOUR_API_KEY_HERE
        
      • Refer to the .env.example file for guidance.
      • The .env file is included in .gitignore and will not be tracked by Git.
    4. Restart ComfyUI: Ensure you fully restart the ComfyUI server after installation/updates.

    The nodes should now appear in the ComfyUI node menu under their respective categories.
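
    The KEY=VALUE format from step 3 can be read with a few lines of standard-library Python. This is a minimal sketch for illustration only: the helper names load_env_file and get_gemini_api_key are hypothetical, and the node pack itself may load the file differently (for example with a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env file into a dict (no shell expansion)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def get_gemini_api_key(env_path=".env"):
    """Prefer a key already in the process environment, then fall back to the file."""
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    if Path(env_path).is_file():
        return load_env_file(env_path).get("GEMINI_API_KEY")
    return None
```

    Checking the process environment first means a key exported in the shell takes precedence over the file; that ordering is a design choice of this sketch, not documented behavior of the node.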

    Included Nodes

    This pack currently includes the following nodes:

    • CLIP Token Counter (Divergent Nodes 👽/Text Utils): Counts tokens for given text using a selected CLIP tokenizer.
    • Gemini API Node (Divergent Nodes 👽/Gemini): Generates text (optionally using image input) via the Google Gemini API. Requires API key.
    • KoboldCpp Launcher (Divergent Nodes 👽/KoboldCpp): Launches and manages a local KoboldCpp instance for text generation. Requires KoboldCpp executable and model paths.
    • KoboldCpp API Connector (Divergent Nodes 👽/KoboldCpp): Connects to an already running KoboldCpp instance for text generation.
    • LoRA Strength XY Plot (Divergent Nodes 👽/XY Plots): Generates an image grid comparing different LoRAs (X-axis) against varying model strengths (Y-axis).
    • Save Image Enhanced (👽 Divergent Nodes/Image): Saves images with enhanced options including custom output folder, filename prefixing, and optional caption saving.
    • MusiQ Image Scorer (Divergent AI 👽/Image): Scores images based on aesthetic and technical quality using Google's MusiQ models.

    MusiQ Image Scorer

    Scores images based on aesthetic and technical quality using Google's MusiQ models. It allows you to select which type of scoring to perform (aesthetic, technical, or both) and which specific technical model to use.

    Inputs:

    • image (IMAGE): The image to be scored.
    • aesthetic_model (COMBO): The aesthetic model to use (currently only AVA).
    • technical_model (COMBO): The technical model to use (KonIQ-10k, SPAQ, PaQ-2-PiQ).
    • score_aesthetic (BOOLEAN): Enable/disable aesthetic scoring.
    • score_technical (BOOLEAN): Enable/disable technical scoring.

    Outputs:

    • AESTHETIC_SCORE (FLOAT): The aesthetic quality score (0.0 if disabled or error).
    • TECHNICAL_SCORE (FLOAT): The technical quality score (0.0 if disabled or error).
    • ERROR_MESSAGE (STRING): Any error or warning messages during scoring.

    Category: Divergent AI 👽/Image
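
    The output contract above (a score of 0.0 when a scorer is disabled or fails, plus a message string) can be sketched as follows. The function score_image and its scorer callables are hypothetical stand-ins; the real node runs the MusiQ models, which are not shown here.

```python
def score_image(image, score_aesthetic, score_technical, aesthetic_fn, technical_fn):
    """Return (aesthetic_score, technical_score, error_message)."""
    scores = {"aesthetic": 0.0, "technical": 0.0}
    errors = []
    jobs = (
        ("aesthetic", score_aesthetic, aesthetic_fn),
        ("technical", score_technical, technical_fn),
    )
    for name, enabled, scorer in jobs:
        if not enabled:
            continue  # a disabled score stays at 0.0
        try:
            scores[name] = float(scorer(image))
        except Exception as exc:
            # A failed scorer leaves its score at 0.0 and records a message.
            errors.append(f"{name} scoring failed: {exc}")
    return scores["aesthetic"], scores["technical"], "; ".join(errors)
```

    Because 0.0 is returned both for "disabled" and for "error", a downstream node would need to check ERROR_MESSAGE to tell the two cases apart.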


    CLIP Token Counter

    Counts the number of tokens generated by a CLIP tokenizer for the input text. Useful for understanding prompt length limits.

    Inputs:

    • text (STRING): The text string you want to analyze.
    • tokenizer_name (COMBO): The name of the Hugging Face CLIP tokenizer model to use (e.g., openai/clip-vit-base-patch32, stabilityai/stable-diffusion-clip-vit-large-patch14).

    Outputs:

    • token_count (INT): The total number of tokens generated for the input text by the selected tokenizer (including special tokens).

    Category: Divergent Nodes 👽/Text Utils
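
    Outside ComfyUI, the same count can be approximated with the Hugging Face transformers library. This is a sketch, not the node's actual code: load_clip_tokenizer and count_tokens are hypothetical helper names, and loading a tokenizer requires transformers to be installed and the tokenizer files to be downloadable.

```python
def load_clip_tokenizer(tokenizer_name):
    """Load a CLIP tokenizer from Hugging Face (needs `transformers` and a download)."""
    from transformers import CLIPTokenizer  # imported lazily: optional dependency

    return CLIPTokenizer.from_pretrained(tokenizer_name)


def count_tokens(tokenizer, text):
    """Count token ids for `text`, including any special tokens the tokenizer adds."""
    return len(tokenizer(text)["input_ids"])


# Example usage (downloads tokenizer files on first run):
# tokenizer = load_clip_tokenizer("openai/clip-vit-base-patch32")
# print(count_tokens(tokenizer, "a photo of a cat"))
```

    CLIP text encoders commonly use a 77-token context window that includes the start and end tokens, so a count near that figure suggests the prompt may be truncated or split into chunks.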


    Gemini API Node

    Connects to the Google Gemini API to generate text based on a prompt and optional image input. Requires a GEMINI_API_KEY environment variable (e.g., in a .env file in the ComfyUI root or a parent directory). It dynamically fetches the list of available models supporting content generation when ComfyUI loads the node.

    Inputs:

    • model (COMBO): Select the Gemini model to use (list dynamically fetched if API key is valid).
    • prompt (STRING): The text prompt for generation.
    • image_optional (IMAGE): Optional image input for multimodal models (e.g., gemini-pro-vision, gemini-1.5-*).
    • temperature (FLOAT): Controls randomness. Higher values (e.g., 1.0) are more creative, lower values (e.g., 0.2) are more deterministic. (0.0-2.0)
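    • top_p (FLOAT): Nucleus sampling threshold. Sampling is limited to the smallest set of tokens whose cumulative probability reaches top_p (e.g., 0.95). 1.0 disables the filter. (0.0-1.0)
    • top_k (INT): Top-K sampling threshold. Only the K most probable tokens are considered at each step. (>= 1)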
    • max_output_tokens (INT): Maximum number of tokens to generate in the response. (>= 1)
    • safety_harassment / safety_hate_speech / safety_sexually_explicit / safety_dangerous_content (COMBO): Safety thresholds for different safety categories.

    Outputs:

    • text (STRING): The generated text response from the Gemini API, or an error/status message.

    Category: 👽 Divergent Nodes/Gemini
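
    To show how the inputs above relate to a Gemini request, the sketch below assembles a JSON body using the field names of the Gemini REST generateContent API (generationConfig, safetySettings). It is illustrative only: whether the node calls the REST endpoint or an SDK is not stated in this README, and build_gemini_request, clamp, and the default values are assumptions.

```python
# Maps the node's safety inputs to Gemini API harm-category names.
SAFETY_CATEGORIES = {
    "safety_harassment": "HARM_CATEGORY_HARASSMENT",
    "safety_hate_speech": "HARM_CATEGORY_HATE_SPEECH",
    "safety_sexually_explicit": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "safety_dangerous_content": "HARM_CATEGORY_DANGEROUS_CONTENT",
}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def build_gemini_request(prompt, temperature=1.0, top_p=0.95, top_k=40,
                         max_output_tokens=1024, safety=None):
    """Build a request body; defaults here are assumptions, not the node's."""
    safety = safety or {}
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Ranges follow the input descriptions above.
            "temperature": clamp(float(temperature), 0.0, 2.0),
            "topP": clamp(float(top_p), 0.0, 1.0),
            "topK": max(1, int(top_k)),
            "maxOutputTokens": max(1, int(max_output_tokens)),
        },
        "safetySettings": [
            {"category": SAFETY_CATEGORIES[name], "threshold": threshold}
            for name, threshold in safety.items()
        ],
    }
```

    A call such as build_gemini_request("Describe this scene", safety={"safety_harassment": "BLOCK_ONLY_HIGH"}) yields a dictionary that can be serialized with json.dumps; an image part would be added to the parts list for multimodal models.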


    KoboldCpp Launcher (Advanced)

    Launches and manages a local KoboldCpp instance (e.g., koboldcpp.exe) in the background for text generation. Provides control over launch parameters and uses the instance's API. Caches running instances based on setup parameters.

    Prerequisites:

    • A working KoboldCpp executable, available from the KoboldCpp releases page.
    • Path to the executable provided via koboldcpp_path.
    • Path to the .gguf model file provided via model_path.
    • (Optional) Path to .gguf multimodal projector via mmproj_path for image input.
    • requests library installed (pip install -r requirements.txt).

    How it Works:

    1. Launch/Cache: Checks the cache for a running instance whose setup arguments match. If none is found, terminates any other managed instances, launches a new one with the specified setup arguments, and waits for its API to become ready.
    2. API Call: Sends generation args (prompt, image, settings) to the managed instance's /api/v1/generate endpoint.
    3. Cleanup: Registers an atexit handler that attempts to terminate cached processes when ComfyUI exits.
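
    The launch/cache/cleanup cycle above can be sketched as a small cache keyed by the setup arguments. This is an assumption-laden illustration, not the node's implementation: InstanceCache and its launch and terminate callables are hypothetical, and waiting for API readiness is left out.

```python
class InstanceCache:
    """Keep at most one managed instance, keyed by its setup arguments."""

    def __init__(self, launch, terminate):
        self._launch = launch        # callable: setup dict -> process handle
        self._terminate = terminate  # callable: process handle -> None
        self._key = None
        self._process = None

    def get(self, **setup):
        """Return a running instance for `setup`, relaunching if it changed."""
        key = tuple(sorted(setup.items()))
        if key == self._key and self._process is not None:
            return self._process  # cache hit: reuse the running instance
        self.shutdown()           # setup changed: stop the old instance first
        self._process = self._launch(setup)
        self._key = key
        return self._process

    def shutdown(self):
        """Terminate the cached instance, if any, and clear the cache."""
        if self._process is not None:
            self._terminate(self._process)
        self._key = None
        self._process = None
```

    In a real launcher, launch would wrap subprocess.Popen with the KoboldCpp command line, and atexit.register(cache.shutdown) would provide the exit-time cleanup described in step 3.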

    Inputs:

    • Setup Arguments (Define Instance):
      • koboldcpp_path (STRING): Full path to KoboldCpp executable.
      • model_path (STRING): Full path to .gguf model.
      • gpu_acceleration (COMBO): Backend ("None", "CuBLAS", etc.).
      • n_gpu_layers (INT): Layers to offload (-1=auto).
      • context_size (INT): Context size.
      • mmproj_path (STRING): Optional path to multimodal projector.
      • threads (INT): CPU threads (0=auto).
      • use_mmap / use_mlock / flash_attention (BOOLEAN): Performance flags.
      • quant_kv (COMBO): KV cache quantization.
      • extra_cli_args (STRING): Additional setup flags for KoboldCpp CLI.
    • Generation Arguments (Sent via API):
      • prompt (STRING): Text prompt.
      • max_length (INT): Max generation length.
      • temperature / top_p / top_k / rep_pen (FLOAT/INT): Sampling parameters.
      • image_optional (IMAGE): Optional image input (requires mmproj_path).
      • stop_sequence (STRING): Optional comma/newline separated stop sequences.

    Outputs:

    • text (STRING): Generated text or error message.

    Category: Divergent Nodes 👽/KoboldCpp
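
    The stop_sequence input is described as a comma/newline separated string. One plausible way to turn it into a list is shown below; parse_stop_sequences is a hypothetical helper, and trimming whitespace around each entry is an assumption of this sketch (it means a stop sequence made only of whitespace cannot be expressed).

```python
import re


def parse_stop_sequences(raw):
    """Split a comma/newline separated string into a list of non-empty stop strings."""
    if not raw:
        return []
    parts = re.split(r"[,\n]", raw)
    # Trim each entry and drop the ones that end up empty.
    return [part.strip() for part in parts if part.strip()]
```

    For example, parse_stop_sequences("User:, ###\nEND") returns ["User:", "###", "END"].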


    KoboldCpp API Connector (Basic)

    Connects to an already running KoboldCpp instance via its API for text generation. Does not launch or manage the process.

    Prerequisites:

    • KoboldCpp instance running and accessible at the specified api_url.
    • requests library installed (pip install -r requirements.txt).

    How it Works:

    1. Check Connection: Verifies API reachability.
    2. Prepare Payload: Creates JSON with prompt, image (if provided, converted to Base64), and generation settings.
    3. API Call: Sends POST request to /api/v1/generate.
    4. Response: Parses response and returns text.
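
    Steps 2 to 4 can be sketched with the standard library alone. The node itself is documented as using the requests library; urllib is used here only to keep the example dependency-free. The payload keys follow the KoboldAI-style /api/v1/generate API as commonly documented, the "images" field for Base64 image data and all default values are assumptions, and the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_payload(prompt, max_length=200, temperature=0.7, top_p=0.9, top_k=40,
                  rep_pen=1.1, stop_sequence=None, image_bytes=None):
    """Assemble the JSON body; default values are illustrative assumptions."""
    payload = {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "rep_pen": rep_pen,
    }
    if stop_sequence:
        payload["stop_sequence"] = list(stop_sequence)
    if image_bytes is not None:
        # Encoded image bytes (e.g. PNG) are sent as a Base64 string.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def parse_response(body):
    """Extract the generated text from a response shaped like {"results": [{"text": ...}]}."""
    results = json.loads(body).get("results") or []
    return results[0].get("text", "") if results else ""


def generate(api_url, payload, timeout=120):
    """POST the payload to <api_url>/api/v1/generate and return the generated text."""
    request = urllib.request.Request(
        api_url.rstrip("/") + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())
```

    With an instance running locally, generate("http://localhost:5001", build_payload("Once upon a time")) would return the continuation text; network errors raised by urllib would need to be caught and turned into the error message the node outputs.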

    Inputs:

    • api_url (STRING): Base URL of the running KoboldCpp API (e.g., http://localhost:5001).
    • prompt (STRING): Text prompt.
    • max_length / temperature / top_p / top_k / rep_pen: Generation parameters.
    • image_optional (IMAGE): Optional image input (requires compatible model/mmproj loaded on the running instance).
    • stop_sequence (STRING): Optional comma/newline separated stop sequences.

    Outputs:

    • text (STRING): Generated text or error message.

    Category: Divergent Nodes 👽/KoboldCpp


    LoRA Strength XY Plot

    Generates an image grid comparing different LoRAs (X-axis) against varying model strengths (Y-axis).

    How it Works:

    1. Loads a base checkpoint model.
    2. Scans the specified lora_folder_path for LoRA files.
    3. Determines which LoRAs and strength values to use based on x_lora_steps, y_strength_steps, and max_strength.
    4. Iterates through each combination:
      • Applies the selected LoRA (or none for the baseline column) with the current strength to copies of the base model/CLIP.
      • Generates an image using the provided prompt, latent, and sampling settings.
      • Optionally saves the individual image.
    5. Assembles all generated images into a grid.
    6. Optionally draws labels (LoRA names, strengths) onto the grid.
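
    The planning part of these steps (which LoRA and strength go in which grid cell) can be sketched as below. Two details are assumptions because the README does not specify them: strengths are taken as evenly spaced values ending at max_strength, and when x_lora_steps is smaller than the number of LoRA files the first N are used. The function names are hypothetical, and sampling and image assembly are omitted.

```python
def strength_values(steps, max_strength):
    """Evenly spaced strengths ending at max_strength (spacing is an assumption)."""
    if steps < 1:
        return []
    return [round(max_strength * (i + 1) / steps, 4) for i in range(steps)]


def plan_grid(lora_names, x_lora_steps, y_strength_steps, max_strength):
    """List one (row, col, lora_or_None, strength) job per grid cell."""
    names = list(lora_names)
    # 0 means "use all LoRAs"; otherwise take the first N (an assumption).
    selected = names if x_lora_steps == 0 else names[:x_lora_steps]
    columns = [None] + selected  # column 0 is the no-LoRA baseline
    jobs = []
    for row, strength in enumerate(strength_values(y_strength_steps, max_strength)):
        for col, lora in enumerate(columns):
            jobs.append((row, col, lora, strength))
    return jobs
```

    With three LoRA files, x_lora_steps=2, and y_strength_steps=2, this plan contains 6 cells: 2 rows by 3 columns (baseline plus two LoRAs). The number of sampler runs grows as rows times columns, which is worth keeping in mind when choosing step counts.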

    Inputs:

    • checkpoint_name (COMBO): Base model checkpoint.
    • lora_folder_path (STRING): Path to folder containing LoRA files.
    • positive / negative (CONDITIONING): Text conditioning.
    • latent_image (LATENT): Initial latent image.
    • seed / steps / cfg / sampler_name / scheduler: Sampling parameters.
    • x_lora_steps (INT): Number of LoRAs for X-axis (0=all).
    • y_strength_steps (INT): Number of strength steps for Y-axis.
    • max_strength (FLOAT): Maximum LoRA strength for Y-axis.
    • opt_clip / opt_vae (CLIP/VAE): Optional CLIP/VAE overrides.
    • save_individual_images (BOOLEAN): Save each grid cell image.
    • output_folder_name (STRING): Subfolder name for saved images.
    • row_gap / col_gap (INT): Gaps in the final grid.
    • draw_labels (BOOLEAN): Draw labels on the grid.
    • x_axis_label / y_axis_label (STRING): Optional overall axis labels.

    Outputs:

    • xy_plot_image (IMAGE): The final generated XY plot grid image.

    Category: Divergent Nodes 👽/XY Plots


    Save Image Enhanced

    Saves the input images to a specified directory with optional caption and filename counter. Provides enhanced options compared to the standard Save Image node, including custom output folder and flexible filename prefixing.

    Inputs:

    • images (IMAGE): The images to save.
    • filename_prefix (STRING, default: "ComfyUI_DN_%date:yyyy-MM-dd%"): The prefix for the file to save. This may include formatting information such as %date:yyyy-MM-dd% or %Empty Latent Image.width% to include values from nodes. Supports %batch_num%.
    • output_folder (STRING, default: "output"): The folder to save the images to. Can be relative to ComfyUI's output or an absolute path.
    • add_counter_suffix (BOOLEAN, default: True): If True, adds an incrementing numerical suffix (_00001_) to the filename to prevent overwriting.
    • caption_file_extension (STRING, default: ".txt"): The extension for the caption file. (Optional)
    • caption (STRING): String to save as a caption file. (Optional, Force Input)

    Outputs:

    • last_filename_saved (STRING): The full path of the last image file saved.

    Category: 👽 Divergent Nodes/Image
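
    The filename_prefix tokens and the counter suffix can be illustrated with a short sketch. It handles only %date:...% with a small assumed set of date fields (yyyy, MM, dd, hh, mm, ss) and %batch_num%; tokens that reference node values, such as %Empty Latent Image.width%, are left untouched. The helper names expand_prefix and with_counter are hypothetical.

```python
import re
from datetime import datetime

# Assumed mapping from date-format tokens to strftime codes.
_DATE_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "hh": "%H", "mm": "%M", "ss": "%S"}
_DATE_TOKEN_RE = re.compile("|".join(_DATE_TOKENS))


def expand_prefix(prefix, now, batch_num=0):
    """Expand %date:<format>% and %batch_num% in a filename prefix."""
    def date_repl(match):
        # Translate all date tokens in a single pass, then format the timestamp.
        fmt = _DATE_TOKEN_RE.sub(lambda m: _DATE_TOKENS[m.group(0)], match.group(1))
        return now.strftime(fmt)

    expanded = re.sub(r"%date:([^%]+)%", date_repl, prefix)
    return expanded.replace("%batch_num%", str(batch_num))


def with_counter(base_name, counter, add_counter_suffix=True):
    """Append a zero-padded counter in the _00001_ style when enabled."""
    return f"{base_name}_{counter:05}_" if add_counter_suffix else base_name
```

    For a timestamp of 9 March 2024, expand_prefix("ComfyUI_DN_%date:yyyy-MM-dd%", datetime(2024, 3, 9)) gives "ComfyUI_DN_2024-03-09", and with_counter("img", 1) gives "img_00001_".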


    Example Workflows

    (Placeholder: Add links or descriptions of example .json workflows demonstrating node usage once created in the examples/ directory.)

    Contributing

    (Placeholder: Add contribution guidelines if applicable.)

    License

    (Placeholder: Add license information, e.g., MIT License.)