ComfyUI Node: Visual Query Template
Category
image
Inputs
images: IMAGE
model: one of
- Salesforce/blip-vqa-base
- Salesforce/blip-vqa-capfilt-large
- dandelin/vilt-b32-finetuned-vqa
- microsoft/git-large-vqav2
question: STRING
Outputs
STRING
Extension: ComfyUI-VisualQueryTemplate
A ComfyUI node that turns images into descriptive text using templated visual question answering (VQA). It runs Hugging Face VQA models through the transformers library.
Authored by celoron
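To illustrate what templated visual question answering could look like, here is a minimal Python sketch. It is not the extension's actual code. The `{question}` placeholder syntax and the helper names `fill_template` and `make_blip_answerer` are assumptions made for this example. The BLIP calls follow the usage published for Salesforce/blip-vqa-base. The other listed models use different model classes and would need their own loader.

```python
import re
from typing import Callable

# Hypothetical placeholder syntax: each {...} span holds one question.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def fill_template(template: str, answer: Callable[[str], str]) -> str:
    """Replace every {question} placeholder with the answer returned for it."""
    return PLACEHOLDER.sub(lambda m: answer(m.group(1).strip()), template)


def make_blip_answerer(image, model_name: str = "Salesforce/blip-vqa-base"):
    """Build a question -> answer function for one PIL image using BLIP VQA."""
    # Imported lazily so fill_template works without transformers installed.
    from transformers import BlipForQuestionAnswering, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForQuestionAnswering.from_pretrained(model_name)

    def answer(question: str) -> str:
        inputs = processor(image, question, return_tensors="pt")
        output_ids = model.generate(**inputs)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return answer


# Example usage (downloads the model on first run; "photo.jpg" is a placeholder path):
# from PIL import Image
# image = Image.open("photo.jpg").convert("RGB")
# prompt = "a {what animal is this?} in front of a {what is in the background?}"
# print(fill_template(prompt, make_blip_answerer(image)))
```

Keeping the template logic separate from the model call means the same template can be reused with any of the listed models, as long as a matching answer function is supplied.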