ComfyUI Extension: ComfyUI-WorkflowGenerator
ComfyUI-WorkflowGenerator custom nodes for generating ComfyUI workflows from natural language
ComfyUI Workflow Generator
Generate ComfyUI workflows from natural language descriptions using Large Language Models (LLMs).
This custom node enables users to describe a workflow in natural language (e.g., "Create a text-to-image workflow using SDXL") and automatically constructs the corresponding node graph. It leverages Large Language Models (LLMs) to bridge the gap between intent and execution.
Implementation Note
This project is an independent implementation of the ComfyGPT research, designed to bring that architecture directly into ComfyUI as a native node suite.
My goal was to preserve the core functionality of the original research, specifically the multi-stage pipeline of generation, validation, and construction, while optimizing it for the local ComfyUI environment. This implementation focuses on modularity, allowing users to inspect and intervene at each stage of the generation process.
Current Limitations: The system's knowledge is bounded by its training data. While it excels at standard patterns and known nodes, it cannot inherently "know" about custom nodes released after its training cutoff without additional context. It acts as a powerful accelerator for workflow creation, but supervision is recommended.
Based on ComfyGPT Research
Title: "ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation"
- Original Repository: https://github.com/comfygpt/comfygpt
- Project Website: https://comfygpt.github.io/
How It Works
This implementation uses a specialized LLM fine-tuned on workflow data to execute a three-stage pipeline:
- Generator: Interprets the natural language prompt to generate a logical graph structure (JSON).
- Validator: Verifies node names against the local installation or semantic embeddings to ensure compatibility.
- Builder: Compiles the validated structure into an executable ComfyUI workflow format.
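The three stages above can be sketched as plain functions. This is only an illustration of the data flow, not the project's code: the edge-list diagram format, the function names, and the alias table are all hypothetical, and `generate` returns a fixed toy diagram where the real node would call the fine-tuned LLM.

```python
import json

def generate(instruction: str) -> list[dict]:
    """Stand-in for the LLM stage: returns a fixed toy diagram (hypothetical format)."""
    return [
        {"src": "Load Checkpoint", "dst": "KSampler"},
        {"src": "KSampler", "dst": "VAEDecode"},
    ]

def validate(diagram: list[dict], installed: set[str], aliases: dict[str, str]) -> list[dict]:
    """Replace node names that are not installed with a known alias, if one exists."""
    def fix(name: str) -> str:
        return name if name in installed else aliases.get(name, name)
    return [{"src": fix(e["src"]), "dst": fix(e["dst"])} for e in diagram]

def build(diagram: list[dict]) -> str:
    """Assign integer ids to nodes and emit a JSON document with nodes and links."""
    ids: dict[str, int] = {}
    for edge in diagram:
        for name in (edge["src"], edge["dst"]):
            ids.setdefault(name, len(ids) + 1)
    nodes = [{"id": node_id, "type": name} for name, node_id in ids.items()]
    links = [[ids[e["src"]], ids[e["dst"]]] for e in diagram]
    return json.dumps({"nodes": nodes, "links": links}, indent=2)

def run_pipeline(instruction, installed, aliases, use_validator=True) -> str:
    """Run generate -> (optional) validate -> build."""
    diagram = generate(instruction)
    if use_validator:
        diagram = validate(diagram, installed, aliases)
    return build(diagram)
```

Keeping each stage as a separate function mirrors the modular design described earlier: the intermediate diagram can be inspected or edited before it is built.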
Model Sources
The models used in this implementation are based on the original ComfyGPT research:
- WorkflowGenerator Model: Original fine-tuned model from xiatianzs/resources (ComfyGPT research team)
  - Base model: Qwen2.5-14B, fine-tuned for workflow generation
  - Training context window: 8,192 tokens (cutoff length used during training; see training config)
  - Model architecture context window: 131,072 tokens (128K), the maximum supported by the model architecture (see model config)
  - Quantized to GGUF format (q8_0) for efficient inference
- Embedding Model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (original SentenceTransformer model)
  - Used for semantic search in NodeValidator
- NodeValidator Model: Base Qwen2.5-7B-Instruct model (not fine-tuned)
  - Context window: 32,768 tokens
  - Used for LLM refinement (optional)
  - Quantized to GGUF format (q8_0) for efficient inference
Model Repositories:
- Original Models: xiatianzs/resources - Original fine-tuned models from ComfyGPT research team (HuggingFace format)
- Pre-quantized GGUF Models: DanielPFlorian/comfyui-workflowgenerator-models - Ready-to-use quantized GGUF models for this implementation
Installation
Prerequisites
- ComfyUI installed and running
- Python 3.10 or higher (see note below)
- CUDA-capable GPU (recommended) or CPU
- Git installed (for cloning the repository)
Python Requirements:
- Standard ComfyUI Installation: Python 3.10 or higher must be installed separately. ComfyUI requires Python to run, so if ComfyUI is working, Python is already installed.
- Portable ComfyUI Installation: Python is included (embedded in the python_embeded folder), so no separate Python installation is needed. The custom node will use ComfyUI's embedded Python.
Note: If Git is not installed, download it from git-scm.com. You can verify Git installation by running git --version in your terminal.
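The Python and Git prerequisites can be checked with a few lines of standard-library Python. This is a minimal sketch; the function name `check_prerequisites` is made up for illustration and is not part of this extension.

```python
import shutil
import sys

def check_prerequisites(min_version=(3, 10)) -> dict:
    """Report whether the running interpreter and Git meet the stated prerequisites."""
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "python_ok": sys.version_info[:2] >= min_version,
        # shutil.which returns None when the executable is not on PATH.
        "git_found": shutil.which("git") is not None,
    }

if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

For a portable installation, run the script with the embedded interpreter (python_embeded\python.exe) so that the version reported is the one ComfyUI actually uses.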
Model Recommendation
Model Format Comparison:
- GGUF models: Use significantly less VRAM with similar generation quality compared to HuggingFace models. Quantization (q8_0, q4_0) provides a good balance between quality and memory usage.
- HuggingFace models: Use more VRAM but offer full precision. Both formats are fully supported and produce similar quality results.
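To put rough numbers on the comparison above, the size of the weights alone can be estimated from the parameter count and the bits stored per weight. In the GGUF block formats, q8_0 stores 32 weights in 34 bytes (8.5 bits per weight) and q4_0 stores 32 weights in 18 bytes (4.5 bits per weight). The sketch below assumes every tensor uses the same format and ignores the KV cache and activations, so real VRAM usage will be higher; the parameter counts passed in are nominal values read off the model names.

```python
# Bits stored per weight, including the per-block fp16 scale for the quantized formats.
BITS_PER_WEIGHT = {"fp16": 16.0, "q8_0": 8.5, "q4_0": 4.5}

def estimate_weight_gib(n_params: float, fmt: str) -> float:
    """Approximate size of the model weights in GiB for a given storage format."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 2**30

if __name__ == "__main__":
    for label, n_params in (("14B generator", 14e9), ("7B validator", 7e9)):
        for fmt in BITS_PER_WEIGHT:
            print(f"{label} {fmt}: {estimate_weight_gib(n_params, fmt):.1f} GiB")
```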
Step 1: Clone the Repository
Navigate to your ComfyUI custom_nodes directory and clone this repository:
cd ComfyUI/custom_nodes
git clone https://github.com/danielpflorian/ComfyUI-WorkflowGenerator.git
Step 2: Install Dependencies
Install the required Python dependencies by opening your terminal inside the ComfyUI-WorkflowGenerator folder:
cd ComfyUI/custom_nodes/ComfyUI-WorkflowGenerator
pip install -r requirements.txt
For Portable ComfyUI Installations:
For portable installations, use the embedded Python from the portable ComfyUI root directory:
cd <portable_comfyui_root>
python_embeded\python.exe -s -m pip install -r ComfyUI\custom_nodes\ComfyUI-WorkflowGenerator\requirements.txt
Step 3: Install llama-cpp-python (Required for GGUF Models)
Important: llama-cpp-python is required for GGUF models. It must be installed separately based on your system configuration.
Note: The quick install commands below may not work for all Windows/CUDA configurations. If they fail or if CUDA support isn't detected, local compilation is often necessary. See the Wiki - llama-cpp-python Installation for detailed instructions on compiling from source.
Quick install (try this first):
- CPU only: pip install llama-cpp-python
- CUDA (NVIDIA): pip install llama-cpp-python[cuda]
- Metal (macOS): pip install llama-cpp-python[metal]
For Portable ComfyUI Installations:
For portable installations, use the embedded Python from the portable ComfyUI root directory:
cd <portable_comfyui_root>
python_embeded\python.exe -s -m pip install llama-cpp-python[cuda]
Note: If you plan to use HuggingFace models instead, you can skip this step.
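After installing, it can help to confirm that llama-cpp-python is importable and whether the build reports GPU offload support. The sketch below is a hedged check, not part of this extension: it looks up `llama_supports_gpu_offload` defensively because that low-level binding may be absent in some versions, and it reports an import failure (for example, a missing shared library) instead of raising.

```python
import importlib.util

def llama_cpp_status() -> dict:
    """Report whether llama_cpp is installed and, if possible, whether it can offload to GPU."""
    status = {"installed": False, "version": None, "gpu_offload": None, "error": None}
    if importlib.util.find_spec("llama_cpp") is None:
        return status
    status["installed"] = True
    try:
        import llama_cpp
    except Exception as exc:  # e.g. the compiled library failed to load
        status["error"] = repr(exc)
        return status
    status["version"] = getattr(llama_cpp, "__version__", None)
    # Assumption: this probe exists in the installed version; fall back to None if not.
    probe = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if callable(probe):
        status["gpu_offload"] = bool(probe())
    return status

if __name__ == "__main__":
    print(llama_cpp_status())
```

If `gpu_offload` comes back `False` on a machine with an NVIDIA GPU, the installed wheel was most likely built for CPU only, which is the situation the note above about local compilation refers to.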
Step 4: Copy Models
Copy your GGUF models and tokenizers to ComfyUI/models/LLM/:
ComfyUI/models/LLM/
├── workflow-generator-q8_0.gguf # WorkflowGenerator model
├── workflow-generator/ # WorkflowGenerator tokenizer
├── Qwen2.5-7B-Instruct-q8_0.gguf # NodeValidator model (optional)
├── Qwen2.5-7B-Instruct/ # NodeValidator tokenizer (optional)
└── paraphrase-multilingual-MiniLM-L12-v2/ # Embedding model
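A short script can confirm that the layout above is in place before restarting ComfyUI. This is a minimal sketch: the file and folder names are copied from the tree above, while `check_model_dir` is a hypothetical helper, not something the extension provides.

```python
from pathlib import Path

# Names taken from the directory tree above.
REQUIRED = [
    "workflow-generator-q8_0.gguf",
    "workflow-generator",
    "paraphrase-multilingual-MiniLM-L12-v2",
]
OPTIONAL = ["Qwen2.5-7B-Instruct-q8_0.gguf", "Qwen2.5-7B-Instruct"]

def check_model_dir(llm_dir) -> dict:
    """List which expected files or folders are missing from the LLM model directory."""
    llm_dir = Path(llm_dir)
    return {
        "missing_required": [n for n in REQUIRED if not (llm_dir / n).exists()],
        "missing_optional": [n for n in OPTIONAL if not (llm_dir / n).exists()],
    }

if __name__ == "__main__":
    print(check_model_dir("ComfyUI/models/LLM"))
```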
Step 5: Restart ComfyUI
Restart ComfyUI to load the custom nodes. The nodes will appear in the WorkflowGenerator category.
Quick Start
The easiest way to get started is using the Workflow Generator Pipeline node, which processes your instruction through all three stages sequentially:
- Install the custom node (see Installation above)
- Install llama-cpp-python (required for GGUF models - see Wiki - llama-cpp-python Installation)
- Run the "Update Node Catalog" node first - This scans and catalogs all available ComfyUI nodes (native and custom) and is required before generating workflows
- Add the "Workflow Generator Pipeline" node to your ComfyUI workflow
- Enter your instruction, for example: "Create a text-to-image workflow"
- Configure model path (see Model Format Comparison above for guidance)
- Execute the node to generate a complete workflow
- Use the generated workflow - it will appear as a workflow JSON that you can load or save
Expected Results:
- The Pipeline node automatically generates the workflow diagram, validates node names, and converts it to executable ComfyUI workflow JSON
- You can save the workflow to a file for later use. By default it is saved in ComfyUI/output.
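Since supervision of generated workflows is recommended, a quick summary of a saved file can be a useful first check. The sketch below assumes the saved file follows ComfyUI's usual workflow layout, with top-level "nodes" (each carrying a "type") and "links" lists; that the builder emits exactly this layout is an assumption, and the file name in the usage line is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

def summarize_workflow(path) -> dict:
    """Count nodes, links, and node types in a saved workflow JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    nodes = [n for n in data.get("nodes", []) if isinstance(n, dict)]
    return {
        "node_count": len(nodes),
        "link_count": len(data.get("links", [])),
        "types": dict(Counter(n.get("type", "unknown") for n in nodes)),
    }

if __name__ == "__main__":
    example = Path("ComfyUI/output/generated_workflow.json")  # hypothetical file name
    if example.exists():
        print(summarize_workflow(example))
```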
Note: For detailed documentation on individual nodes and advanced usage, see the Wiki.
Available Nodes
This section provides a high-level overview of each node.
Workflow Generator Pipeline
- Purpose: One-click solution for complete workflow generation
- What it does: Processes your instruction through all three stages sequentially: generates the workflow diagram, validates and corrects node names, then builds the final ComfyUI workflow JSON
- Best for: Quick workflow creation without intermediate steps
- Key inputs: Instruction, model path, configuration options
- Key outputs: Complete workflow JSON, optional file save
For more control, you can use individual nodes to inspect and modify intermediate results:
Important: Before using individual nodes or the pipeline, run the "Update Node Catalog" node first to scan and catalog all available ComfyUI nodes. This is required for proper node validation and workflow building.
Note: WorkflowGenerator, NodeValidator, and WorkflowBuilder are designed to be used in sequence. NodeValidator is optional: you can connect WorkflowGenerator directly to WorkflowBuilder if you want to skip validation.
1. WorkflowGenerator
- Purpose: Generates workflow diagrams from natural language
- Key inputs: Instruction, model selection
- Key outputs: Workflow diagram (JSON string)
- Usage: First step in the pipeline
2. NodeValidator
- Purpose: Validates and corrects node names in workflow diagrams
- Key inputs: Workflow diagram, optional instruction for context
- Key outputs: Refined diagram with corrected node names
- Usage: Optional second step (can be skipped)
- Two modes:
- Semantic search only (faster, deterministic)
- LLM refinement (slower, more accurate)
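The semantic-search mode can be pictured as a nearest-neighbour lookup over embedded node names. To stay dependency-free, the sketch below swaps the sentence-transformer embedding for a toy one (character-trigram counts) and compares vectors with cosine similarity; the threshold value and all function names are illustrative assumptions, not the NodeValidator's actual settings.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: character-trigram counts (NOT the sentence-transformer model)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def nearest_node(name: str, catalog: list[str], threshold: float = 0.3):
    """Return the exact match, else the most similar catalog name above the threshold, else None."""
    if name in catalog:
        return name
    if not catalog:
        return None
    query = embed(name)
    best = max(catalog, key=lambda candidate: cosine(query, embed(candidate)))
    return best if cosine(query, embed(best)) >= threshold else None

if __name__ == "__main__":
    catalog = ["LoadImage", "SaveImage", "KSampler", "VAEDecode"]
    print(nearest_node("Load Image", catalog))
```

A lookup like this is deterministic for a fixed catalog, which is the trade-off named above: it is fast and repeatable, but it only sees names, so the optional LLM refinement can use the instruction for extra context.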
3. WorkflowBuilder
- Purpose: Converts workflow diagrams into executable ComfyUI workflow JSON
- Key inputs: Refined diagram (from NodeValidator) or raw diagram (from WorkflowGenerator)
- Key outputs: Workflow JSON, optional file save
- Usage: Final step in the pipeline
UpdateNodeCatalog
- Purpose: Scans and catalogs all available ComfyUI nodes (native and custom)
- When to run: After installing new custom nodes or updating ComfyUI
- Key inputs: Catalog directory path (optional)
- Key outputs: Updated catalog files
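As an illustration of what cataloging could involve, the sketch below walks a name-to-class mapping and records each node's category, input types, and output types. It relies on the ComfyUI node conventions (`INPUT_TYPES()` classmethod, `RETURN_TYPES`, `CATEGORY`) and would be fed `nodes.NODE_CLASS_MAPPINGS` inside ComfyUI; here a fake class keeps the example self-contained. The catalog layout and the name `build_catalog` are assumptions, not the extension's actual file format.

```python
import json

class FakeKSampler:
    """Stand-in that follows the ComfyUI node class conventions."""
    CATEGORY = "sampling"
    RETURN_TYPES = ("LATENT",)

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "model": ("MODEL",),
            "latent_image": ("LATENT",),
            "steps": ("INT", {"default": 20}),
        }}

def build_catalog(class_mappings: dict) -> dict:
    """Record category, input types, and output types for every node class."""
    catalog = {}
    for name, cls in sorted(class_mappings.items()):
        spec = cls.INPUT_TYPES() if hasattr(cls, "INPUT_TYPES") else {}
        inputs = {}
        for group in ("required", "optional"):
            for input_name, decl in spec.get(group, {}).items():
                type_name = decl[0]
                # Combo inputs declare a list of choices instead of a type string.
                inputs[input_name] = type_name if isinstance(type_name, str) else "COMBO"
        catalog[name] = {
            "category": getattr(cls, "CATEGORY", ""),
            "inputs": inputs,
            "outputs": list(getattr(cls, "RETURN_TYPES", ())),
        }
    return catalog

if __name__ == "__main__":
    print(json.dumps(build_catalog({"KSampler": FakeKSampler}), indent=2))
```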
Troubleshooting
Quick fixes:
- Models not found: Verify models are in ComfyUI/models/LLM/
- OOM errors: auto_gpu_layers is enabled by default to prevent this. If issues persist, verify your VRAM is sufficient for the model size.
- Slow performance: Use GGUF models (lower VRAM usage), disable LLM refinement
- Invalid nodes: Run Update Node Catalog node
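This README does not describe how auto_gpu_layers chooses its value, so the following is only one plausible heuristic for reasoning about OOM errors: assume the weights are spread evenly over the layers, keep some VRAM in reserve, and offload as many layers as fit. The function name, the even-split assumption, and the reserve size are all illustrative; the resulting number corresponds to the kind of value passed as `n_gpu_layers` in llama-cpp-python.

```python
GIB = 2**30

def layers_that_fit(model_bytes: int, n_layers: int, free_vram_bytes: int,
                    reserve_bytes: int = 1 * GIB) -> int:
    """Estimate how many layers fit in VRAM, assuming weights are split evenly across layers."""
    usable = free_vram_bytes - reserve_bytes
    if usable <= 0 or model_bytes <= 0:
        return 0
    # Integer arithmetic avoids floating-point rounding at exact boundaries.
    return min(n_layers, usable * n_layers // model_bytes)

if __name__ == "__main__":
    # Example: a 14 GiB model with 48 layers on a GPU with 8 GiB free.
    print(layers_that_fit(14 * GIB, 48, 8 * GIB))
```

If the estimate is well below the layer count, the remaining layers run on the CPU, which is slower but avoids running out of VRAM.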
Architectural Insights & Future Vision
While this implementation successfully brings the ComfyGPT architecture to ComfyUI, the rapid evolution of the generative AI landscape suggests that the next generation of workflow generation tools will need to evolve beyond static fine-tuning.
The Problem with Static Models
The ComfyUI custom node ecosystem changes daily. New nodes and unforeseen architectures are released constantly. A model fine-tuned on a dataset from last month (like the current WorkflowGenerator) is inherently "frozen" in time. It cannot reliably infer the correct connections for a node it has never seen during training.
Ideas on Future Workflow Generators
To achieve true state-of-the-art performance that keeps pace with the community, future architectures should likely move toward:
- Retrieval-Augmented Generation (RAG) for Nodes: Instead of baking node knowledge into the model weights, an agent could query a dynamic, vector-embedded database that includes both the user's currently installed nodes and an updated database of repositories. This allows the system to discover and use completely new nodes or updated versions directly from the internet, enabling it to "read" the documentation of a node released today and use it immediately.
- Input/Output Type Awareness: The current approach uses semantic search to correct node names (e.g., matching "Load Image" to "Image Loader"). However, a robust system needs to understand I/O Schema.
  - Current: "Does this node name look right?"
  - Future: "Does the LATENT output of Node A actually fit into the LATENT input of Node B?"
  - An agent needs to reason about data types (Float, Image, Conditioning, Model) to ensure generated workflows are not just linguistically plausible, but executably valid.
- Small Graph-Reasoning Models (SLMs): We may not need massive 70B+ parameter models for this task. Specialized, smaller models trained specifically on Graph Theory and Directed Acyclic Graphs (DAGs) could offer superior reasoning capabilities for connecting logical blocks, while relying on the RAG system for the specific node vocabulary.
- Case-Based Reasoning & Workflow Retrieval: Thousands of high-quality workflows exist online, often accompanied by images and descriptions. A future system should not just generate from scratch but actively retrieve and adapt existing workflows. By indexing these community-created workflows as "knowledge patterns," the agent can identify how experts typically solve specific sub-problems (e.g. "How is ControlNet usually connected to Wan Video Generation Nodes?") and apply those proven sub-graphs to the current task.
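The I/O type-awareness idea in the list above can be made concrete with a small link checker. This is a sketch of the future direction, not something the current implementation does: the catalog holds deliberately simplified schemas for a few well-known nodes, and the link tuple layout (source node, output index, destination node, input name) is an assumption chosen for readability.

```python
# Simplified, partial schemas for illustration only.
CATALOG = {
    "EmptyLatentImage": {"inputs": {}, "outputs": ["LATENT"]},
    "KSampler": {"inputs": {"model": "MODEL", "latent_image": "LATENT"}, "outputs": ["LATENT"]},
    "VAEDecode": {"inputs": {"samples": "LATENT", "vae": "VAE"}, "outputs": ["IMAGE"]},
    "SaveImage": {"inputs": {"images": "IMAGE"}, "outputs": []},
}

def check_links(links, catalog) -> list[str]:
    """Return a message for every link whose output type does not match the input type."""
    errors = []
    for src, out_index, dst, input_name in links:
        if src not in catalog or dst not in catalog:
            errors.append(f"unknown node in link {src} -> {dst}")
            continue
        outputs = catalog[src]["outputs"]
        inputs = catalog[dst]["inputs"]
        if out_index >= len(outputs):
            errors.append(f"{src} has no output #{out_index}")
        elif input_name not in inputs:
            errors.append(f"{dst} has no input '{input_name}'")
        elif outputs[out_index] != inputs[input_name]:
            errors.append(
                f"{src}[{out_index}] is {outputs[out_index]}, "
                f"but {dst}.{input_name} expects {inputs[input_name]}"
            )
    return errors

if __name__ == "__main__":
    # A LATENT output wired into an IMAGE input should be reported.
    print(check_links([("KSampler", 0, "SaveImage", "images")], CATALOG))
```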
This project serves as a foundational step, implementing the current research (ComfyGPT). However, the future lies in agents that can actively explore, reason about, and validate the live ComfyUI environment they inhabit.
License
GNU General Public License v3
Acknowledgments
- ComfyUI: Built on top of ComfyUI
- ComfyGPT Research Team: This project is based on the ComfyGPT research paper and original repository by Oucheng Huang, Yuhang Ma, Zeng Zhao, Mingrui Wu, Jiayi Ji, Rongsheng Zhang, Zhipeng Hu, Xiaoshuai Sun, and Rongrong Ji
- llama-cpp-python: Uses llama-cpp-python for GGUF model support