ComfyUI Extension: GeminiOllama ComfyUI Extension

Authored by al-swaiti

35 stars

This extension integrates Google's Gemini API and Ollama into ComfyUI, allowing users to leverage these powerful language models directly within their ComfyUI workflows.

    README

    ComfyUI GeminiOllama Extension

    This extension integrates Google's Gemini API, OpenAI (ChatGPT), Anthropic's Claude, Ollama, Qwen, and various image processing tools into ComfyUI, allowing users to leverage these powerful models and features directly within their ComfyUI workflows.

    Features

    • Support for multiple AI APIs:
      • Google Gemini
      • OpenAI (ChatGPT)
      • Anthropic Claude
      • Ollama
      • Alibaba Qwen
    • Text and image input capabilities
    • Streaming option for real-time responses
    • FLUX Resolution tools for image sizing
    • ComfyUI Styler for advanced styling options
    • Raster to Vector (SVG) conversion
    • Text splitting and processing
    • Easy integration with ComfyUI workflows

    Nodes

    1. Gemini API

    • Models:
      • gemini-2.0-pro
      • gemini-2.0-flash
      • gemini-2.0-flash-lite-preview-02-05
      • gemini-2.0-pro-experimental-02-05
      • gemini-1.5-pro
      • gemini-1.5-flash-8b
      • gemini-1.5-pro-experimental
      • learnlm-1.5-pro-experimental

    2. OpenAI API

    • Models:
      • gpt-4o-mini
      • gpt-3.5-turbo
      • gpt-3.5-turbo-0125
      • gpt-3.5-turbo-16k
      • gpt-3.5-turbo-1106
      • o1-preview/mini
      • deepseek-ai/deepseek-r1

    3. Claude API

    Access Anthropic's Claude models for advanced language tasks:

    • Text input field for prompts
    • Model selection:
      • claude-3-opus
      • claude-3-sonnet
      • claude-3-haiku
    • Temperature control
    • System prompt configuration
    • Streaming capability

    4. Ollama API

    Integrate local language models running via Ollama:

    • Text input field for prompts
    • Dropdown for selecting Ollama models
    • Customizable model options

    5. Qwen API

    Access Alibaba's Qwen language models:

    • Text input field for prompts
    • Model selection:
      • qwen-turbo
      • qwen-plus
      • qwen-max
    • Temperature control
    • Streaming capability

    6. FLUX Resolutions

    Provides advanced image resolution and sizing options:

    • Predefined resolution presets (e.g., 768x1024, 1024x768, 1152x768)
    • Custom sizing parameters:
      • size_selected
      • multiply_factor
      • manual_width
      • manual_height
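
    The README does not spell out how these four parameters interact. The sketch below shows one plausible reading: manual values override the preset when both are set, the result is scaled by multiply_factor, and each side is rounded to a multiple of 8. The function name resolve_resolution, the override rule, and the rounding step are all assumptions made for illustration.

```python
import re


def resolve_resolution(size_selected, multiply_factor=1.0,
                       manual_width=0, manual_height=0, multiple_of=8):
    """Return (width, height) from a preset string or manual overrides.

    Hypothetical sketch; the node's real rules may differ.
    """
    if manual_width > 0 and manual_height > 0:
        # Assumption: manual values take precedence over the preset.
        width, height = manual_width, manual_height
    else:
        match = re.search(r"(\d+)\s*[xX]\s*(\d+)", size_selected)
        if match is None:
            raise ValueError(f"Cannot parse a resolution from {size_selected!r}")
        width, height = int(match.group(1)), int(match.group(2))

    def snap(value):
        # Assumption: scale first, then round to the nearest multiple.
        scaled = int(round(value * multiply_factor / multiple_of)) * multiple_of
        return max(multiple_of, scaled)

    return snap(width), snap(height)
```

    For example, resolve_resolution("1152x768", multiply_factor=0.5) halves both sides of the preset, while passing manual_width and manual_height ignores the preset string entirely.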

    7. ComfyUI Styler

    Extensive styling options for various creative needs:

    • General Arts: A broad spectrum of traditional and modern art styles
    • Anime: Bring your designs to life with anime-inspired aesthetics
    • Artist: Channel the influence of world-class artists
    • Camera: Fine-tune focal lengths, angles, and setups
    • Camera Angles: Add dynamic perspectives with a range of angles
    • Aesthetic: Define unique artistic vibes and styles
    • Color Grading: Achieve rich cinematic tones and palettes
    • Movies: Get inspired by different cinematic worlds
    • Digital Artform: From vector art to abstract digital styles
    • Body Type: Customize different body shapes and dimensions
    • Reactions: Capture authentic emotional expressions
    • Feelings: Set the emotional tone for each creation
    • Photographers: Infuse the style of renowned photographers
    • Hair Style: Wide variety of hair designs for your characters
    • Architecture Style: Classical to modern architectural themes
    • Architect: Designs inspired by notable architects
    • Vehicle: Add cars, planes, or futuristic transportation
    • Poses: Customize dynamic body positions
    • Science: Add futuristic, scientific elements
    • Clothing State: Define the wear and tear of clothing
    • Clothing Style: Wide range of fashion styles
    • Composition: Control the layout and arrangement of elements
    • Depth: Add dimensionality and focus to your scenes
    • Environment: From nature to urban settings, create rich backdrops
    • Face: Customize facial expressions and emotions
    • Fantasy: Bring magical and surreal elements into your visuals
    • Filter: Apply unique visual filters for artistic effects
    • Gothic: Channel dark, mysterious, and dramatic themes
    • Halloween: Get spooky with Halloween-inspired designs
    • Line Art: Incorporate clean, bold lines into your creations
    • Lighting: Set the mood with dramatic lighting effects
    • Milehigh: Capture the essence of aviation and travel
    • Mood: Set the emotional tone and atmosphere
    • Movie Poster: Create dramatic, story-driven poster designs
    • Punk: Channel bold, rebellious aesthetics
    • Travel Poster: Design vintage travel posters with global vibes

    8. Raster to Vector (SVG) and Save SVG

    Convert raster images to vector graphics and save them:

    Raster to Vector node parameters:

    • colormode
    • filter_speckle
    • corner_threshold
    • ... (and more)

    Save SVG node options:

    • filename_prefix
    • overwrite_existing

    9. TextSplitByDelimiter

    Split text based on specified delimiters:

    • Input text field
    • Delimiter options:
      • split_regex
      • split_every
      • split_count
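
    The README names the three options without defining them. The sketch below assumes that split_regex is a regular expression marking the delimiter, split_every groups that many pieces into each output chunk, and split_count caps the number of chunks returned (0 meaning no cap). The function name split_text and the joiner argument are hypothetical.

```python
import re


def split_text(text, split_regex=r"\n", split_every=1, split_count=0, joiner=" "):
    """Split text on a regex, regroup the pieces, and optionally cap the result.

    Hypothetical sketch; the node's real semantics may differ.
    """
    if split_every < 1:
        raise ValueError("split_every must be at least 1")
    # Drop empty pieces and surrounding whitespace.
    pieces = [p.strip() for p in re.split(split_regex, text) if p.strip()]
    # Group every `split_every` consecutive pieces into one chunk.
    chunks = [
        joiner.join(pieces[i:i + split_every])
        for i in range(0, len(pieces), split_every)
    ]
    if split_count > 0:
        chunks = chunks[:split_count]
    return chunks
```

    For instance, splitting "a, b, c, d, e" on the pattern ,\s* with split_every set to 2 groups the five pieces into three chunks, the last of which holds the single leftover piece.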

    Installation

    1. Clone this repository into your ComfyUI's custom_nodes directory:

      cd /path/to/ComfyUI/custom_nodes
      git clone https://github.com/yourusername/GeminiOllama.git
      
    2. Install the required dependencies:

      pip install google-generativeai openai anthropic requests vtracer
      

    Configuration

    API Key Setup

    Edit config.json and fill in the API key for each provider you plan to use:

    {
      "GEMINI_API_KEY": "your_gemini_api_key",
      "OPENAI_API_KEY": "your_openai_api_key",
      "ANTHROPIC_API_KEY": "your_claude_api_key",
      "OLLAMA_URL": "http://localhost:11434",
      "QWEN_API_KEY": "your_qwen_api_key"
    }
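
    A config file of this shape can be read with a few lines of standard-library Python. The sketch below is illustrative only and is not the extension's own loader: the names load_config, read_api_keys, and read_ollama_url are hypothetical, and treating values that still start with "your_" as unset is an assumption based on the placeholder text above.

```python
import json
from pathlib import Path

# Key names as they appear in the sample config.json.
KEY_NAMES = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "QWEN_API_KEY")
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def load_config(path):
    """Return the parsed config, or an empty dict if the file is missing."""
    config_path = Path(path)
    if not config_path.exists():
        return {}
    return json.loads(config_path.read_text(encoding="utf-8"))


def read_api_keys(config):
    """Keep only the keys that have been filled in."""
    keys = {}
    for name in KEY_NAMES:
        value = config.get(name, "")
        # Assumption: untouched placeholders such as "your_gemini_api_key" count as unset.
        if value and not value.startswith("your_"):
            keys[name] = value
    return keys


def read_ollama_url(config):
    """Return the Ollama base URL without a trailing slash."""
    return config.get("OLLAMA_URL", DEFAULT_OLLAMA_URL).rstrip("/")
```

    Filtering out placeholders this way lets a workflow report "no key configured" for a provider instead of sending a placeholder string to the remote API.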
    
    1. Obtain API keys from:

    Usage

    After installation and configuration, new nodes for each API will be available in ComfyUI.

    Input Parameters

    • api_choice: Choose between "Gemini", "OpenAI", "Claude", and "Ollama"
    • prompt: The text prompt for the AI model
    • model_selection: Select the specific model for the chosen API
    • temperature: Control response randomness (OpenAI and Claude)
    • system_message: Set system behavior (OpenAI and Claude)
    • stream: Enable/disable streaming responses
    • image (optional): Input image for vision-based tasks

    Output

    • text: The generated response from the chosen AI model

    Main Functions

    1. get_api_keys(): Retrieves API keys from the config file
    2. get_ollama_url(): Gets the Ollama URL from the config file
    3. generate_content(): Main function to generate content based on the chosen API and parameters
    4. generate_gemini_content(): Handles content generation for Gemini API
    5. generate_openai_content(): Manages content generation for OpenAI API
    6. generate_claude_content(): Handles content generation for Claude API
    7. generate_ollama_content(): Manages content generation for Ollama API
    8. tensor_to_image(): Converts a tensor to a PIL Image for vision-based tasks
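
    The list above describes one entry point that hands off to a per-provider function. That structure can be sketched as a lookup table keyed by api_choice. The code below is a simplified stand-in, not the extension's implementation: make_dispatcher and echo_handler are hypothetical names, and the stub handlers only echo their inputs instead of calling any API.

```python
from typing import Callable, Dict, Optional

# A handler takes (prompt, model_selection, image) and returns the response text.
Handler = Callable[[str, str, Optional[object]], str]


def make_dispatcher(handlers: Dict[str, Handler]):
    """Return a generate() function that routes to the handler for api_choice."""
    def generate(api_choice, prompt, model_selection, image=None):
        try:
            handler = handlers[api_choice]
        except KeyError:
            raise ValueError(
                f"Unsupported api_choice {api_choice!r}; expected one of {sorted(handlers)}"
            ) from None
        return handler(prompt, model_selection, image)
    return generate


def echo_handler(name):
    """Stub standing in for a real provider call; it only echoes its inputs."""
    def handler(prompt, model_selection, image=None):
        return f"[{name}:{model_selection}] {prompt}"
    return handler


generate = make_dispatcher(
    {name: echo_handler(name) for name in ("Gemini", "OpenAI", "Claude", "Ollama")}
)
```

    Keeping the routing in a dictionary means a new provider is added by registering one more handler, and an unknown api_choice fails with a clear error rather than silently falling through.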

    Contributing

    Contributions are welcome! Please feel free to submit a Pull Request.

    License

    This project is licensed under the MIT License - see the LICENSE file for details.