ComfyUI Node: 🦙 Ollama Image Captioner 🦙

Authored by alisson-anjos

Category

Ollama

Inputs

model
  • llava:7b-v1.6-vicuna-q2_K (Q2_K, 3.2GB)
  • llava:7b-v1.6-mistral-q2_K (Q2_K, 3.3GB)
  • llava:7b-v1.6 (Q4_0, 4.7GB)
  • llava:13b-v1.6 (Q4_0, 8.0GB)
  • llava:34b-v1.6 (Q4_0, 20.0GB)
  • llava-llama3:8b (Q4_K_M, 5.5GB)
  • llava-phi3:3.8b (Q4_K_M, 2.9GB)
  • llama3.2-vision:11b (Q4_K_M, 7.9GB)
  • minicpm-v:8b (Q4_0, 5.5GB)
  • moondream:1.8b (Q4, 1.7GB)
  • moondream:1.8b-v2-q6_K (Q6, 2.1GB)
  • moondream:1.8b-v2-fp16 (F16, 3.7GB)
custom_model STRING
api_host STRING
timeout INT
input_dir STRING
output_dir STRING
max_images INT
low_vram BOOLEAN
keep_model_alive INT
top_p FLOAT
temperature FLOAT
caption_type
  • Descriptive
  • Descriptive (Informal)
  • FLUX/SD3+
  • MidJourney
  • Booru-like tag list
caption_length
  • any
  • very short
  • short
  • medium-length
  • long
  • very long
  • 20–260 (exact word counts, in steps of 10)
name STRING
custom_prompt STRING
prefix_caption STRING
suffix_caption STRING
extra_options Extra_Options
structured_output_format STRING
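To make the inputs above concrete, here is a minimal sketch of how a caption request to a locally running Ollama server can be assembled. The field names in the request body (`model`, `prompt`, `images`, `keep_alive`, `options`) come from Ollama's `/api/generate` HTTP API; the mapping from this node's inputs (model, custom_prompt, temperature, top_p, keep_model_alive) is an illustration and may not match the node's internals exactly.

```python
import base64

def build_caption_request(image_path, model="llava:13b-v1.6",
                          custom_prompt="Describe this image.",
                          temperature=0.2, top_p=0.9,
                          keep_model_alive=300):
    """Assemble a request body for Ollama's /api/generate endpoint.

    Illustrative only: shows how this node's inputs plausibly map onto
    the Ollama HTTP API, not the node's actual implementation.
    """
    # Multimodal models (LLaVA, llama3.2-vision, etc.) take base64 images.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "prompt": custom_prompt,
        "images": [image_b64],
        "stream": False,
        # Seconds to keep the model loaded after the call
        # (0 unloads immediately; -1 keeps it resident).
        "keep_alive": keep_model_alive,
        "options": {"temperature": temperature, "top_p": top_p},
    }
```

The resulting dict would be POSTed as JSON to `http://<api_host>/api/generate`, and the model's reply text becomes the node's STRING output.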

Outputs

STRING
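The batch-oriented inputs (input_dir, output_dir, max_images, prefix_caption, suffix_caption) suggest the usual dataset-captioning pattern: one `.txt` caption per image, sharing the image's base name. The sketch below assumes that behavior; `caption_fn` is a hypothetical stand-in for the actual Ollama call, and the exact file-naming and image-filtering rules of the real node may differ.

```python
from pathlib import Path

def write_captions(input_dir, output_dir, caption_fn,
                   max_images=0, prefix_caption="", suffix_caption=""):
    """Caption each image in input_dir and write one .txt file per image.

    caption_fn(path) -> str stands in for the model call. Returns the
    number of images processed. max_images <= 0 means "no limit".
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    images = sorted(p for p in Path(input_dir).iterdir()
                    if p.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp"})
    if max_images > 0:
        images = images[:max_images]
    for img in images:
        # prefix/suffix are concatenated verbatim around the model output.
        caption = f"{prefix_caption}{caption_fn(img)}{suffix_caption}"
        (out / f"{img.stem}.txt").write_text(caption, encoding="utf-8")
    return len(images)
```

This is the layout most LoRA training scripts expect: `image001.png` next to (or mirrored by) `image001.txt`.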

Extension: ComfyUI-Ollama-Describer

A ComfyUI extension that lets you use LLMs served by Ollama, such as Gemma, LLaVA (multimodal), Llama 2, Llama 3, or Mistral.
