You can use Pic2Story in ComfyUI.
A simple ComfyUI node based on the BLIP method that converts an image to text ("Image to Text").
Original model: link
Model used: link
1.1 In the .\ComfyUI\custom_nodes directory, run the following:
git clone https://github.com/smthemex/ComfyUI_Pic2Story.git
1.2 Load the model by repo_id (downloaded online) or from a local copy (offline)
repo_id: abhijit2111/Pic2Story link
repo_id: google/paligemma2-3b-pt-896 link
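The choice between a Hub repo_id and an offline copy can be sketched as follows. This is a minimal illustration, not the node's actual code: the helper names `resolve_model_source` and `load_blip` and the `local_dir` parameter are hypothetical, and it assumes the checkpoint is BLIP-compatible and loads through the Hugging Face transformers `from_pretrained` method, which accepts either a repo id or a local directory.

```python
import os


def resolve_model_source(repo_id: str, local_dir: str = "") -> str:
    """Prefer an existing local directory (offline); otherwise use the repo_id."""
    if local_dir and os.path.isdir(local_dir):
        return local_dir
    if not repo_id:
        raise ValueError("Provide a repo_id or an existing local model directory")
    return repo_id


def load_blip(source: str):
    """Load a BLIP captioning checkpoint from a repo id or a local path.

    Assumption: the checkpoint uses the BLIP architecture. transformers is
    imported here so the path helper above works without it installed.
    """
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(source)
    model = BlipForConditionalGeneration.from_pretrained(source)
    return processor, model
```

For example, `load_blip(resolve_model_source("abhijit2111/Pic2Story"))` would fetch the model from the Hub, while passing a second argument that points to a downloaded copy would load it from disk.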
A prompt is not required; you can leave it out.
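The optional prompt can be illustrated with a short sketch. It assumes a BLIP-style processor and model from Hugging Face transformers, where passing `text` gives conditional captioning and omitting it gives unconditional captioning. The function names `build_processor_kwargs` and `caption` are hypothetical and are not taken from this node's source.

```python
def build_processor_kwargs(image, prompt=None):
    """Build processor arguments, adding the prompt only if it is non-empty."""
    kwargs = {"images": image, "return_tensors": "pt"}
    if prompt and prompt.strip():
        kwargs["text"] = prompt.strip()
    return kwargs


def caption(image, processor, model, prompt=None, max_new_tokens=50):
    """Generate a caption for a PIL image, with or without a text prompt.

    Assumption: processor and model are a BlipProcessor and a
    BlipForConditionalGeneration instance.
    """
    inputs = processor(**build_processor_kwargs(image, prompt))
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption(image, processor, model)` captions the image with no prompt, while `caption(image, processor, model, prompt="a photo of")` asks the model to continue the given text.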
@misc{https://doi.org/10.48550/arxiv.2201.12086,
doi = {10.48550/ARXIV.2201.12086},
url = {https://arxiv.org/abs/2201.12086},
author = {Li, Junnan and Li, Dongxu and Xiong, Caiming and Hoi, Steven},
keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
title = {BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}