A ComfyUI extension for generating captions for your images. Runs on your own system, no external services used, no filter.
Uses various VLMs with APIs to generate captions for images. You can give instructions or ask questions in natural language.
Try asking for:
The model is quite capable of analysing NSFW images and returning NSFW replies.
In my experience, it is unlikely to return an NSFW response to an SFW image. This seems to be because (1) the model's output is strongly conditioned on the contents of the image, so it is hard to activate concepts that aren't pictured, and (2) the VLM has had a hefty dose of safety training.
This is probably for the best in general. But you will not have much success asking NSFW questions about SFW images.
git clone https://github.com/ceruleandeep/ComfyUI-ImageCaptioner into your custom_nodes folder.
Change to the custom_nodes\ComfyUI-ImageCaptioner or custom_nodes/ComfyUI-ImageCaptioner folder you just created, e.g.
cd C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-ImageCaptioner
or wherever you have it installed. Then run:
pip install -r requirements.txt
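If ComfyUI runs from its own Python environment (as portable builds typically do), a plain pip call may install into a different interpreter. The sketch below is one way to build the same install command around whichever interpreter runs the script. The helper name pip_install_command and the relative path are illustrative assumptions, not part of this extension.

```python
import sys
from pathlib import Path


def pip_install_command(extension_dir):
    """Build a pip command that installs requirements.txt from extension_dir.

    Hypothetical helper for illustration; not provided by the extension.
    """
    requirements = Path(extension_dir) / "requirements.txt"
    # sys.executable is the interpreter running this script, so the
    # packages are installed into that interpreter's environment.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


# Assumed location relative to the ComfyUI folder; adjust to your install.
command = pip_install_command(Path("custom_nodes") / "ComfyUI-ImageCaptioner")
print(command)
```

To actually run the install, pass the list to subprocess.run(command, check=True) from a script started with ComfyUI's own Python.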
Add the node via image -> ImageCaptioner
Supports tagging and outputting multiple batched inputs.
You need to get a DashScope API key; see the DashScope documentation.