ComfyUI Node: Paligemma
Category
VLM Nodes/Paligemma
Inputs
image IMAGE
model_id
- gokaygokay/sd3-long-captioner-v2
- Custom
- google/paligemma-3b-ft-refcoco-seg-896
- google/paligemma-3b-ft-cococap-448
- google/paligemma-3b-ft-rsvqa-hr-224
- google/paligemma-3b-ft-science-qa-448
- google/paligemma-3b-ft-vqav2-448
- google/paligemma-3b-mix-224
- google/paligemma-3b-mix-224-jax
- google/paligemma-3b-mix-224-keras
- google/paligemma-3b-mix-448
- google/paligemma-3b-mix-448-jax
- google/paligemma-3b-mix-448-keras
- google/paligemma-3b-pt-224
- google/paligemma-3b-pt-224-jax
- google/paligemma-3b-pt-224-keras
- google/paligemma-3b-pt-448
- google/paligemma-3b-pt-448-jax
- google/paligemma-3b-pt-448-keras
- google/paligemma-3b-pt-896
- google/paligemma-3b-pt-896-jax
- google/paligemma-3b-pt-896-keras
custom_model_id STRING
task_type
- Captioning
- Segmentation
- Question Answering
prompt STRING
precision
- bfloat16
- float32
device
- cpu
- cuda
- cuda:0
- cuda:1
- cuda:2
- cuda:3
quantization
- None
- 8bit
- 4bit
max_tokens INT
min_tokens INT
temperature FLOAT
fill_mask
- True
- False
mask_color STRING
mask_opacity FLOAT
mask_threshold FLOAT
mask_blur INT
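The inputs above suggest that choosing `Custom` for model_id makes the node read the repository name from custom_model_id, and that precision, device, and quantization are forwarded to the model loader. The following is a minimal sketch of that resolution step under those assumptions. It is not the node's actual code, and the names `LoadSettings` and `resolve_load_settings` are hypothetical.

```python
from dataclasses import dataclass

# Option values copied from the input lists above.
PRECISIONS = ("bfloat16", "float32")
QUANTIZATIONS = ("None", "8bit", "4bit")


@dataclass
class LoadSettings:
    """Hypothetical container for the resolved loading options."""
    repo_id: str
    precision: str
    device: str
    load_in_8bit: bool
    load_in_4bit: bool


def resolve_load_settings(model_id, custom_model_id="", precision="bfloat16",
                          device="cuda", quantization="None"):
    """Turn the node's dropdown values into one settings object.

    Assumption: when model_id is "Custom", custom_model_id holds the
    repository name to load; otherwise custom_model_id is ignored.
    """
    if model_id == "Custom":
        repo_id = custom_model_id.strip()
        if not repo_id:
            raise ValueError("custom_model_id must be set when model_id is 'Custom'")
    else:
        repo_id = model_id

    if precision not in PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")
    if quantization not in QUANTIZATIONS:
        raise ValueError(f"unsupported quantization: {quantization!r}")

    return LoadSettings(
        repo_id=repo_id,
        precision=precision,
        device=device,
        load_in_8bit=(quantization == "8bit"),
        load_in_4bit=(quantization == "4bit"),
    )
```

For example, `resolve_load_settings("google/paligemma-3b-mix-448", quantization="4bit")` yields settings with `load_in_4bit` set, which a loader could then translate into whatever quantization configuration its backend expects.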
Outputs
STRING
MASK
IMAGE
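The mask_threshold, mask_color, and mask_opacity inputs presumably control how the segmentation MASK is drawn onto the IMAGE output. The sketch below shows one plausible reading: pixels whose mask value reaches the threshold are alpha-blended with the mask color. The `#RRGGBB` color format, the blending rule, and the function names are assumptions made for illustration. The node's real behavior may differ, and fill_mask and mask_blur are not modeled here.

```python
def parse_hex_color(color):
    """Parse a '#RRGGBB' string (assumed format) into an (R, G, B) tuple."""
    value = color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a color like '#ff0000', got {color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def overlay_mask(image, mask, mask_color="#ff0000",
                 mask_opacity=0.5, mask_threshold=0.5):
    """Blend mask_color into image wherever mask >= mask_threshold.

    image: list of rows, each row a list of (R, G, B) tuples in 0-255.
    mask:  list of rows of floats in [0, 1], same shape as image.
    Returns a new image; the input is left unchanged.
    """
    color = parse_hex_color(mask_color)
    result = []
    for image_row, mask_row in zip(image, mask):
        out_row = []
        for pixel, mask_value in zip(image_row, mask_row):
            if mask_value >= mask_threshold:
                # Standard alpha blend between the pixel and the mask color.
                blended = tuple(
                    round((1 - mask_opacity) * p + mask_opacity * c)
                    for p, c in zip(pixel, color)
                )
                out_row.append(blended)
            else:
                out_row.append(pixel)
        result.append(out_row)
    return result
```

With mask_opacity set to 1.0 the masked region becomes solid mask_color, and with 0.0 the image is returned unchanged, which matches the intuitive meaning of an opacity control.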
Extension: VLM_nodes
Custom nodes for Vision Language Models (VLM), Large Language Models (LLM), image captioning, automatic prompt generation, creative and consistent prompt suggestion, and keyword extraction.
Authored by gokayfem