Using image caption models to extract prompts in ComfyUI
An image caption node for ComfyUI. You can load your own image caption model and generate prompts from a given picture.
An Insert Prompt node is also included to help users add their own prompts easily.
- Clone the repository to the `custom_nodes` folder.
- Run ComfyUI.
- Place the folder that contains your model under the `models/image_captioners` folder.
- Click the Refresh button in ComfyUI. If that does not work, restart ComfyUI.
[x] Processor has to be in the folder
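The processor requirement above can be checked before starting ComfyUI. The following is a minimal sketch, assuming the model folder was saved with the Hugging Face `save_pretrained` methods, which typically write `config.json` for the model and `preprocessor_config.json` for the processor. The helper name `missing_files` and the folder name `my_caption_model` are hypothetical and not part of this repository.

```python
from pathlib import Path

# Assumption: file names typically written by Hugging Face `save_pretrained`.
# Adjust the names if your model uses a different layout.
EXPECTED_FILES = ("config.json", "preprocessor_config.json")


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # Hypothetical model folder under models/image_captioners.
    model_dir = Path("models") / "image_captioners" / "my_caption_model"
    missing = missing_files(model_dir)
    if missing:
        print(f"{model_dir} is missing: {', '.join(missing)}")
    else:
        print(f"{model_dir} looks complete")
```

An empty list means both the model and processor configuration files were found in the folder.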
Prompt Generator node in this repository
You can find the models in this link.
To use a pretrained model, follow these steps:
- Place the model folder under the `models/image_captioners` folder.
- Click the Refresh button in ComfyUI.
- Select the model with the `model_name` variable (if you can't see the generator, restart ComfyUI).

female_image_caption_blip | (Training In Process)
female_image_caption_git | (Training In Process)
| Variable Names | Definitions |
| :--------------------: | :--------------------------------------------------------------------------------------------------- |
| model_name | Name of the folder that contains the model. |
| min_new_tokens | The minimum number of tokens to generate, ignoring the number of tokens in the prompt. |
| max_new_tokens | The maximum number of tokens to generate, ignoring the number of tokens in the prompt. |
| num_beams | Number of beams for beam search. 1 means no beam search. |
| penalty_alpha | Balances the model confidence and the degeneration penalty in contrastive search decoding. |
| top_k | The number of highest probability vocabulary tokens to keep for top-k filtering. |
| repetition_penalty | The parameter for repetition penalty. 1.0 means no penalty. |
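To make the table concrete, the sketch below collects the variables into keyword arguments of the kind accepted by the `generate` method in the transformers package. The function name `build_generation_kwargs` and the default values are illustrative assumptions, not the node's actual code or defaults.

```python
def build_generation_kwargs(
    min_new_tokens=20,
    max_new_tokens=50,
    num_beams=1,
    penalty_alpha=0.6,
    top_k=4,
    repetition_penalty=1.0,
):
    """Validate the node variables and return them as a kwargs dict."""
    if min_new_tokens > max_new_tokens:
        raise ValueError("min_new_tokens must not exceed max_new_tokens")
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    if repetition_penalty <= 0:
        raise ValueError("repetition_penalty must be positive")

    return {
        "min_new_tokens": min_new_tokens,
        "max_new_tokens": max_new_tokens,
        "num_beams": num_beams,
        "penalty_alpha": penalty_alpha,
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
    }


kwargs = build_generation_kwargs(min_new_tokens=10, max_new_tokens=40)
# Assumed usage with an already loaded caption model and processed image:
# output_ids = model.generate(**inputs, **kwargs)
print(kwargs)
```

Validating the token bounds up front gives a clearer error than letting a bad combination fail inside the generation call.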
Setting appropriate `min_new_tokens` and `max_new_tokens` values helps to generate more accurate prompts.

| Variable Names | Definitions |
| :------------: | :---------- |
| prompt_string | The prompt to insert. It replaces the `{prompt_string}` part of the `prompt_format` variable. |
| prompt_format | The template for the new prompt, which includes the value of `prompt_string` through the `{prompt_string}` syntax. For example, if `prompt_string` is `hdr` and `prompt_format` is `1girl, solo, {prompt_string}`, the output is `1girl, solo, hdr`. The `{prompt_string}` syntax can be placed anywhere in the string. |
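The substitution described in the table can be sketched in a few lines. This is an illustration of the behavior, not the node's actual implementation; the function name `insert_prompt` is hypothetical.

```python
PLACEHOLDER = "{prompt_string}"


def insert_prompt(prompt_string, prompt_format):
    """Replace every {prompt_string} placeholder in prompt_format."""
    return prompt_format.replace(PLACEHOLDER, prompt_string)


print(insert_prompt("hdr", "1girl, solo, {prompt_string}"))  # 1girl, solo, hdr
print(insert_prompt("hdr", "{prompt_string}, 1girl, solo"))  # hdr, 1girl, solo
```

The sketch uses `str.replace` rather than `str.format` so that other curly braces in a prompt are left untouched.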
`prompt_string` variable to input
Prompt Generator.

If you run into a problem, please create an issue with the `bug` label.

The image caption node is based on the transformers package, so most problems are likely to originate from that package. To work around them, try updating the package:

For a manual installation of ComfyUI:
- Run the `pip install --upgrade transformers` command.

For a portable installation of ComfyUI:
- Go to the `ComfyUI_windows_portable` folder.
- Run the `.\python_embeded\python.exe -s -m pip install --upgrade transformers` command.

Contributions are welcome. If you have an idea and want to implement it yourself, please follow these steps:
If you have an idea but don't know how to implement it, please create an issue with the `enhancement` label.
[x] You can contribute in several ways: to the code or to the README file.
| Reference (Used) Image | Output |
| :------------------------------------: | :------------------------------: |
| | |
| | |