# ComfyUI-DreamOmni2-GGUF

A ComfyUI custom node for DreamOmni2 GGUF multimodal models — powered directly by llama-cpp-python, no external executables required.
⚠️ Work in Progress — use at your own risk. <sub>(Or better: fork and help improve it.)</sub>
## ✨ Features

- Run DreamOmni2 GGUF models natively inside ComfyUI.
- Full image + text multimodal support through the `llama-cpp-python` backend.
- Accepts up to four image inputs simultaneously.
- Outputs either:
  - conditioning embeddings (for generation workflows), or
  - text descriptions (for captioning or analysis).
- Built-in seeded cache system for deterministic results across sessions (see the sketch below).
- No dependency on external binaries or CLI tools.
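The cache keys results on everything that affects a generation. A minimal sketch of how such a key can be derived (illustrative only; `make_cache_key` is a hypothetical helper, not this repo's actual API, and the node's internals may differ):

```python
import hashlib

def make_cache_key(seed: int, prompt: str, image_bytes_list: list[bytes]) -> str:
    """Hypothetical helper: derive a deterministic cache key from the seed,
    the prompt, and the raw bytes of every connected image."""
    h = hashlib.sha256()
    h.update(str(seed).encode())
    h.update(prompt.encode())
    for data in image_bytes_list:
        h.update(hashlib.sha256(data).digest())  # fold in a digest per image
    return h.hexdigest()

# Identical inputs always map to the same key, so a cached output can be
# reused across sessions when "Use Cache" is enabled.
print(make_cache_key(42, "Describe the scene.", [b"fake-image-bytes"]))
```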
## 🧩 Prerequisites

Requires Python ≥ 3.12 and a working ComfyUI ≥ 0.3.66.

Install dependencies (CPU or CUDA builds supported):
```bash
pip install -r requirements.txt
```

**requirements.txt**

```text
torch>=2.2.0
numpy
Pillow
scikit-build-core
llama-cpp-python>=0.3.16
```
For CUDA acceleration, install the matching wheel from llama-cpp-python (CUDA builds).

Example (Windows, CUDA 12.x):

```bash
pip install llama_cpp_python-0.3.16+cu124-cp312-cp312-win_amd64.whl
```
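To verify the install picked up GPU support, a quick check from the same Python environment (uses only functions llama-cpp-python exposes at the package level):

```python
# Sanity check: confirm the installed version and GPU offload support.
import llama_cpp

print(llama_cpp.__version__)                   # expect >= 0.3.16
print(llama_cpp.llama_supports_gpu_offload())  # True when built with CUDA
```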
## 🧠 Installation

1. Clone this repository into your ComfyUI `custom_nodes` directory:

   ```bash
   git clone https://github.com/rafacost/rafacost-comfy.git
   ```

2. Download the DreamOmni2 GGUF and mmproj models from Hugging Face.

3. Place them in:

   ```text
   ComfyUI/models/unet/
   ├── DreamOmni2-Vlm-Model-7.6B-Q5_K.gguf
   ├── DreamOmni2-Vlm-Model-7.6B-f16-mmproj.gguf
   ├── flux1-kontext-dev-Q5_K.gguf
   ComfyUI/models/loras/
   ├── DreamOmni2-7.6B-Edit-Lora.safetensors
   ├── DreamOmni2-7.6B-Gen-Lora.safetensors
   ```
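ComfyUI discovers these files through its `folder_paths` registry. To confirm they are visible, this snippet can be run from ComfyUI's Python environment (`folder_paths` is ComfyUI's own module, not part of this node; it assumes ComfyUI's root is on the import path):

```python
# List what ComfyUI sees in the relevant model folders.
import folder_paths

print(folder_paths.get_filename_list("unet"))   # DreamOmni2 / flux1-kontext *.gguf files
print(folder_paths.get_filename_list("loras"))  # DreamOmni2 *.safetensors LoRAs
```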
## ⚙️ Usage

1. Launch ComfyUI.

2. Add the DreamOmni2-VLM node under rafacostComfy / VLM.

3. Configure the node:
   - Model – select your DreamOmni2 `.gguf` model.
   - MMProj Path – select your vision projection `.gguf`.
   - Images – connect up to four Load Image nodes.
   - Prompt – enter a description or instruction.
   - Seed – for deterministic results (cache linked).
   - Use Cache – toggle to reuse previous generations.
   - As Conditioning – output embeddings instead of raw text.

4. Optionally connect a DreamOmni2 Output Node to visualize the generated text inside the workflow.
## 🧪 Example

Use the sample workflow provided in `workflows/` or connect images manually.
Outputs appear both in the ComfyUI graph and in the terminal console.
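For a feel of what the node does under the hood, here is a minimal standalone sketch using llama-cpp-python's multimodal chat API (paths are placeholders; parameters such as `n_ctx` and `n_gpu_layers` are illustrative, and the node's real implementation may differ):

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen25VLChatHandler

# Vision projector (mmproj) + main model, both GGUF files from the unet folder.
chat_handler = Qwen25VLChatHandler(
    clip_model_path="ComfyUI/models/unet/DreamOmni2-Vlm-Model-7.6B-f16-mmproj.gguf"
)
llm = Llama(
    model_path="ComfyUI/models/unet/DreamOmni2-Vlm-Model-7.6B-Q5_K.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,       # illustrative context size
    n_gpu_layers=-1,  # offload all layers to GPU when available
)

result = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/image.png"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }],
    seed=42,  # fixed seed for reproducible output
)
print(result["choices"][0]["message"]["content"])
```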
## ⚙️ Troubleshooting

| Issue | Cause | Fix |
| --- | --- | --- |
| `ImportError: cannot import Qwen25VLChatHandler` | Missing or outdated `llama-cpp-python` | `pip install --upgrade "llama-cpp-python>=0.3.16"` |
| Model loads but crashes | Out of memory | Try a lower quant (e.g. Q4_K) |
| No output text | Prompt too short / token limit | Increase `max_tokens` |
| Different outputs for the same seed | Cache disabled or model reset | Enable Use Cache |
| "NoneType object" errors | Missing mmproj | Verify both GGUF files are present |
| Low image quality | Low base-model GGUF quant | Try a higher quant for the base model (e.g. flux_kontext Q8_0) |
| Low prompt adherence | Low DreamOmni2 GGUF quant | Try a higher quant for DreamOmni2 (e.g. DreamOmni2-GGUF Q8_0) |
| Slow generation | CPU fallback | Compile llama-cpp-python with CUDA for GPU acceleration (see below) |
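To build llama-cpp-python against CUDA instead of using a prebuilt wheel (commands from the llama-cpp-python install docs; adjust for your shell and CUDA toolkit):

```bash
# Linux / macOS
CMAKE_ARGS="-DGGML_CUDA=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python

# Windows (PowerShell)
$env:CMAKE_ARGS = "-DGGML_CUDA=on"; pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```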
## 🗒️ Notes

- Requires ComfyUI ≥ 0.3.66.
- Tested on Python 3.12 and Windows 11.
- GPU recommended for large GGUFs.
- Original model licensing applies — this node only provides an interface.
## 📜 License

Refer to the original model and llama.cpp licenses.
This node does not modify or supersede any upstream licensing.