ComfyUI Extension: ComfyUI-Hangover-Nodes

Authored by Hangover3832

Created

Updated

40 stars

Nodes: MS kosmos-2 Interrogator, Save Image w/o Metadata, Image Scale Bounding Box. An implementation of Microsoft a/kosmos-2 image to text transformer.

README

Custom nodes for ComfyUI

ComfyUI is the awesome stable diffusion GUI and backend.

Please note that this repository is currently a (learning) work in progress and might change anytime. It has been tested in Windows 10 only so far.

Updates

  • Save Image now has only 1 button to include/exclude metadata. You now can disable saving the image, thus it acts as an image preview only. Note that you can still save the image, by right clicking on it, that will include or exclude the workflow according to the setting. 🆕
  • kosmos-2 interrogator now outputs a mask indicating the recognized elements in the image. 🆕
  • "Image Scale To Bounding Box" now can pad the image to its given box size (examples). 🆕
  • New node: Make inpainting model. 🆕
  • Kosmos-2 update: Ability to strip out the prompt from the response (credit to za-wa-n-go). 🆕
  • Kosmos-2 now supports manually downloadad model folder location. 🆕
  • Scale Image Bouning Box can now be used in baches. 🆕

Nodes overview:

  • Kosmos-2: Grounding Multimodal Large Language Models to the World
  • Stable Diffusion Privacy: Save Image with or without Metadata
  • Scale an Image To A Bounding Box
  • Easily make an inpainting version of any SD1.5 model

Node: Microsoft kosmos-2 for ComfyUI

An implementation of Microsoft kosmos-2 text & image to text transformer .

This node takes a prompt that can influence the output, for example, if you put "Very detailed, an image of", it outputs more details than just "An image of". kosmos-2 is quite impressive, it recognizes famous people and written text in the image:

Alt text kosmos-2 output: An image of Donald Trump giving the peace sign with the words "Make America Great Again" written next to him.

At the first start, the kosmos-2 model files will be downloaded from huggingface. Please be patient. The model file is about 6GB in size. There is a cpu/gpu selector, but be aware that the model will eat up about 6GB of your precious VRAM in gpu mode! Alternatively, the model can be downloadad manually. Place the files in a folder named "kosmos-2-patch14-224" under the ./ComfyUI/models/kosmos2 folder. ./ComfyUI/models/kosmos2/kosmos-2-patch14-224 must contain the following files:

added_tokens.json
config.json
generation_config.json
model.safetensors
preprocessor_config.json
sentencepiece.bpe.model
special_tokens_map.json
tokenizer.json
tokenizer_config.json

The kosmos2 base folder can also be configured in extra_model_paths.yaml

See example outputs and workflows

Also see Moondream, Recognize Anything Model


Node: Save Image w/o Metadata

With this custom save image node, you can preview or save, include or exclude the ComfyUI workflow metadata in the image. It is a derivation of ComfyUI's built-in save image node. Note that you can always right click on the image to save, it will also include the workflow if activated.


Node: Scale Image To Bounding Box

This node scales an input image into a given box size, whereby the aspect ratio keeps retained. The image can also be padded to the full box size with an arbitrary color.

Alt text See example outputs and workflows


Node: Make Inpainting Model for SD1.5

Alt text This node easy creates an inpainting version of any SD1.5 model on the fly. No need to have GB's of inpainting models laying on your drive. This is very useful for any kind of inpainting nodes like detailers. Make sure you have the original SD1.5 models from RunwayML in your models folder:

They are needed for the calculation.

See examples


Installation

Unzip or git clone this repository into ComfyUI/custom_nodes folder and restart ComfyUI.