This is a ComfyUI implementation of Qwen2-VL-Instruct. It supports, among other things, text-based queries, video queries, single-image queries, and multi-image queries to generate captions or responses.
auto-gptq module so setup would work without error.
Notes on torch and python versions:
Eva-decord is a fork of decord that supports ffmpeg5 and builds/installs on newer macOS versions. However, eva-decord is no longer available for Python versions above 3.11.
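As a quick way to check the Python constraint above before installing, here is a minimal sketch. The helper name `eva_decord_supported` is hypothetical and not part of this repository; it only encodes the "3.11 or below" rule stated in the note.

```python
import sys


def eva_decord_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.11 or older.

    Hypothetical helper: it only mirrors the version limit described
    in the note above and does not query any package index.
    """
    major, minor = version_info[0], version_info[1]
    return (major, minor) <= (3, 11)


if __name__ == "__main__":
    if eva_decord_supported():
        print("This Python version is within the range noted for eva-decord.")
    else:
        print("Python is newer than 3.11; eva-decord may not be installable.")
```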
For reasons not yet determined (possibly an incompatibility with later PyTorch releases), the following error occurs during inference on PyTorch versions above 2.4:
IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 3)
Setting the attention mode to "eager" solves the problem. If you are using torch 2.4 or below, the original "sdpa" works just fine.
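The rule above can be sketched as a small helper that maps a torch version string to an attention mode. The function name `pick_attention_mode` is an assumption made for illustration and is not part of this repository. The fallback to "eager" for unrecognized version strings is also an assumption.

```python
import re


def pick_attention_mode(torch_version: str) -> str:
    """Return "sdpa" for torch 2.4 or below, "eager" for newer versions.

    Hypothetical helper that only encodes the note above.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        # Assumption: for an unrecognized version string, prefer the
        # mode that avoided the IndexError.
        return "eager"
    major, minor = int(match.group(1)), int(match.group(2))
    return "sdpa" if (major, minor) <= (2, 4) else "eager"


if __name__ == "__main__":
    for version in ("2.3.1", "2.4.0+cu121", "2.5.1"):
        print(version, "->", pick_attention_mode(version))
```

With PyTorch installed, calling `pick_attention_mode(torch.__version__)` gives the value to select for the attention mode.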
If you already have the original repository installed, remove it first.
Either install from the ComfyUI Manager using "Install via Git URL" and enter the URL of this repository, or download or git clone this repository into the ComfyUI/custom_nodes/ directory, then install the Python packages in your ComfyUI Python environment:
cd ComfyUI/custom_nodes/
git clone [email protected]:edwios/ComfyUI_Qwen2-VL-Instruct.git
cd ComfyUI_Qwen2-VL-Instruct
pip install -r requirements.txt
Restart ComfyUI once the installation has completed successfully.
Install from ComfyUI Manager (search for Qwen2)
Download or git clone this repository into the ComfyUI\custom_nodes\ directory and run:
pip install -r requirements.txt
All the models will be downloaded automatically when running the workflow if they are not found in the ComfyUI\models\prompt_generator\ directory.
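To see in advance whether a model is already on disk, and so would not be downloaded again, a check along these lines may help. This is a sketch under assumptions: the helper `model_is_present` is hypothetical, and it assumes each model lives in its own subfolder of models/prompt_generator, which may not match how the node actually stores files.

```python
from pathlib import Path


def model_is_present(comfyui_root: str, model_name: str) -> bool:
    """Return True if a non-empty folder for model_name already exists.

    Assumption: each model is stored in its own subfolder under
    <comfyui_root>/models/prompt_generator/.
    """
    model_dir = Path(comfyui_root) / "models" / "prompt_generator" / model_name
    # An empty or missing folder is treated as "not downloaded yet".
    return model_dir.is_dir() and any(model_dir.iterdir())


if __name__ == "__main__":
    # "ComfyUI" and the model folder name are placeholders for illustration.
    print(model_is_present("ComfyUI", "Qwen2-VL-7B-Instruct"))
```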