ComfyUI Node: Florence2Run

Authored by kijai

832 stars

Category

Florence2

Inputs

image IMAGE
florence2_model FL2MODEL
text_input STRING
task
  • region_caption
  • dense_region_caption
  • region_proposal
  • caption
  • detailed_caption
  • more_detailed_caption
  • caption_to_phrase_grounding
  • referring_expression_segmentation
  • ocr
  • ocr_with_region
  • docvqa
fill_mask BOOLEAN
keep_model_loaded BOOLEAN
max_new_tokens INT
num_beams INT
do_sample BOOLEAN
output_mask_select STRING
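
The task names above appear to correspond to the task tokens that Florence-2 expects at the start of its prompt. The sketch below shows one plausible mapping. The token spellings follow the Florence-2 model card, but pairing `region_caption` with `<OD>` and `docvqa` with `<DocVQA>`, the choice of which tasks consume `text_input`, and the helper name `build_prompt` are all assumptions made for illustration, not the node's confirmed implementation.

```python
# Assumed mapping from the node's task names to Florence-2 task tokens.
# The node's real table may differ; treat this as an illustration only.
TASK_PROMPTS = {
    "region_caption": "<OD>",  # assumption: object detection token
    "dense_region_caption": "<DENSE_REGION_CAPTION>",
    "region_proposal": "<REGION_PROPOSAL>",
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
    "caption_to_phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_expression_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
    "ocr": "<OCR>",
    "ocr_with_region": "<OCR_WITH_REGION>",
    "docvqa": "<DocVQA>",  # assumption: needs a model fine-tuned for DocVQA
}

# Assumption: only these tasks make use of text_input.
TASKS_WITH_TEXT = {
    "caption_to_phrase_grounding",
    "referring_expression_segmentation",
    "docvqa",
}


def build_prompt(task: str, text_input: str = "") -> str:
    """Return the prompt string for a task (hypothetical helper)."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    token = TASK_PROMPTS[task]
    if task in TASKS_WITH_TEXT and text_input:
        # Plain concatenation, as in the model card's usage example;
        # the node itself may join the two parts differently.
        return token + text_input
    return token


print(build_prompt("caption"))
print(build_prompt("caption_to_phrase_grounding", "a red car"))
```

Under these assumptions, `text_input` is ignored for tasks such as `caption` or `ocr` and only matters for grounding, segmentation, and question answering.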

Outputs

IMAGE

MASK

STRING
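
For readers who drive ComfyUI through its API-format workflow JSON, a node with the inputs and outputs listed here could be described roughly as follows. This is a minimal sketch: the node IDs (`"1"`, `"2"`, `"3"`), the upstream links, and every input value are placeholders chosen for illustration, not the node's documented defaults, and `make_florence2_node` is a hypothetical helper.

```python
import json

# Input names taken from the Inputs list of this node.
FLORENCE2RUN_INPUTS = (
    "image", "florence2_model", "text_input", "task", "fill_mask",
    "keep_model_loaded", "max_new_tokens", "num_beams", "do_sample",
    "output_mask_select",
)


def make_florence2_node(image_link, model_link, task="caption", text_input=""):
    """Build one API-format node entry (hypothetical helper).

    A link is [source_node_id, output_index], which is how ComfyUI's
    API format refers to another node's output.
    """
    return {
        "class_type": "Florence2Run",
        "inputs": {
            "image": image_link,
            "florence2_model": model_link,
            "text_input": text_input,
            "task": task,
            # The values below are illustrative, not confirmed defaults.
            "fill_mask": True,
            "keep_model_loaded": False,
            "max_new_tokens": 1024,
            "num_beams": 3,
            "do_sample": True,
            "output_mask_select": "",
        },
    }


# "1" and "2" are placeholder IDs for an image-loading node and a
# Florence2 model-loading node defined elsewhere in the workflow.
workflow = {"3": make_florence2_node(["1", 0], ["2", 0], task="ocr")}
print(json.dumps(workflow, indent=2))
```

Downstream nodes would then refer to this node's outputs by index, in the order listed above (IMAGE, MASK, STRING), assuming the listing order matches the output order.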

Extension: ComfyUI-Florence2

Nodes for running the Florence2 vision-language model (VLM) on image tasks: object detection, captioning, segmentation, and OCR.
