ComfyUI Extension: ComfyUI-ELLA

Authored by TencentQQGYLab

Created 2 years ago

Updated about a year ago

384 stars

ComfyUI implementation for ELLA.

ComfyUI-ELLA

ComfyUI implementation for ELLA.

:star2: Changelog

[2024.4.30] Add a new node, ELLA Text Encode, that automatically concatenates the ELLA and CLIP conditioning.
[2024.4.24] Upgraded the ELLA Apply method for better compatibility with the ComfyUI ecosystem. See the method described in ComfyUI_ELLA PR #25.
- DEPRECATED: Applying ELLA without sigmas is deprecated and will be removed in a future version.
[2024.4.22] Fix unstable image quality with multi-batch generation. Add CLIP concat (LoRA trigger words are now supported).
[2024.4.19] Documenting nodes.
[2024.4.19] Initial repo.

:pushpin: Notice

SIGMAS from the BasicScheduler node, or TIMESTEPS from the Set ELLA Timesteps node, must match the KSampler settings. This is because ELLA introduces the Timestep-Aware Semantic Connector (TSC), which dynamically adapts semantic features over the sampling time steps.
If you need to concat CLIP CONDITIONING to make LoRA trigger words effective, the ELLA output CONDITIONING must always be linked to the conditioning_to input of the Conditioning (Concat) node.
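
The first point can be checked mechanically on a workflow exported in ComfyUI's API format. The following is a minimal, unofficial sketch, not part of this extension. It assumes the API-format layout (a mapping of node id to `class_type` and `inputs`) and the `steps`, `scheduler`, and `denoise` input names used by the stock `BasicScheduler` and `KSampler` nodes. The helper name `find_schedule_mismatches` is made up for illustration.

```python
# Hypothetical helper: compare schedule settings between the node that
# produces SIGMAS for ELLA and the sampler node in an API-format workflow.
SCHEDULE_KEYS = ("steps", "scheduler", "denoise")


def find_schedule_mismatches(workflow, source_type="BasicScheduler",
                             sampler_type="KSampler"):
    """Return (source_id, sampler_id, key, source_value, sampler_value) tuples."""
    def nodes_of(class_type):
        return [(node_id, node.get("inputs", {}))
                for node_id, node in workflow.items()
                if node.get("class_type") == class_type]

    mismatches = []
    for src_id, src_inputs in nodes_of(source_type):
        for smp_id, smp_inputs in nodes_of(sampler_type):
            for key in SCHEDULE_KEYS:
                a, b = src_inputs.get(key), smp_inputs.get(key)
                # Values given as [node_id, slot] are links, not literals; skip them.
                if isinstance(a, list) or isinstance(b, list):
                    continue
                if a != b:
                    mismatches.append((src_id, smp_id, key, a, b))
    return mismatches


# Made-up two-node workflow: the step counts disagree (25 vs 20).
example = {
    "1": {"class_type": "BasicScheduler",
          "inputs": {"scheduler": "normal", "steps": 25, "denoise": 1.0}},
    "2": {"class_type": "KSampler",
          "inputs": {"scheduler": "normal", "steps": 20, "denoise": 1.0}},
}
print(find_schedule_mismatches(example))  # [('1', '2', 'steps', 25, 20)]
```

An empty list means every literal setting matches. Settings wired in from other nodes are skipped, so those still need a manual check.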

:books: Example workflows

The examples directory contains workflow examples. You can load these images directly into ComfyUI as workflows.

workflow_example

All legacy workflows remain compatible, but they are deprecated and will be removed in a future version.

workflow_example_legacy

:tada: It works with controlnet!

workflow_controlnet

:tada: It works with LoRA trigger words by concatenating CLIP CONDITIONING!

:warning: NOTE again that the ELLA CONDITIONING must always be linked to the conditioning_to input of the Conditioning (Concat) node.

workflow_lora

The ELLA Text Encode node can simplify the workflow.

With the upgrade (2024.4.24), some interesting workflows become possible, such as using ELLA only for the positive conditioning, as shown below:

workflow_lora_positive_ella_only

However, there is no guarantee that positive-only will bring better results.

Workflow with AYS.

workflow_ella_ays

AYS produces more visual detail and better text alignment; refer to the paper.

| w/ AYS | w/o AYS |
| :---: | :---: |
| | |

EMMA support is a work in progress.

:green_book: Install

Download or git clone this repository into the ComfyUI/custom_nodes/ directory. ComfyUI-ELLA requires the latest version of ComfyUI. If something doesn't work, be sure to upgrade ComfyUI.

cd ComfyUI/custom_nodes
git clone https://github.com/TencentQQGYLab/ComfyUI-ELLA

Next, install the dependencies.

cd ComfyUI-ELLA
pip install -r requirements.txt

:orange_book: Models

These models must be placed in the corresponding directories under models.

You can also use a custom location by adding ella and ella_encoder entries to the extra_model_paths.yaml file.

ComfyUI/models/ella, create it if not present.
- Place ELLA Models here
ComfyUI/models/ella_encoder, create it if not present.
- Place the FLAN-T5 XL Text Encoder here. It should be a folder in the transformers layout, containing config.json.

In summary, you should have the following model directory structure:

ComfyUI/models/ella/
└── ella-sd1.5-tsc-t5xl.safetensors

ComfyUI/models/ella_encoder/
└── models--google--flan-t5-xl--text_encoder
    ├── config.json
    ├── model.safetensors
    ├── special_tokens_map.json
    ├── spiece.model
    ├── tokenizer_config.json
    └── tokenizer.json
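
To confirm the layout before starting ComfyUI, a small standalone script can help. The sketch below is not part of this extension, and the function name `check_ella_models` is hypothetical. It treats every file in the listing above as expected, which is an assumption: the README only states that the encoder folder must contain config.json.

```python
from pathlib import Path

# File names taken from the directory listing above.
ENCODER_FILES = ("config.json", "model.safetensors", "special_tokens_map.json",
                 "spiece.model", "tokenizer_config.json", "tokenizer.json")


def check_ella_models(models_dir):
    """Return a list of problems; an empty list means the layout looks right."""
    models_dir = Path(models_dir)
    problems = []

    # The ELLA checkpoint: any .safetensors file directly under models/ella.
    ella_dir = models_dir / "ella"
    if not ella_dir.is_dir() or not any(ella_dir.glob("*.safetensors")):
        problems.append(f"no .safetensors file in {ella_dir}")

    # The text encoder: any subfolder of models/ella_encoder with a config.json.
    encoder_root = models_dir / "ella_encoder"
    candidates = []
    if encoder_root.is_dir():
        candidates = sorted(p.parent for p in encoder_root.glob("*/config.json"))
    if not candidates:
        problems.append(f"no encoder folder with config.json in {encoder_root}")

    # Report every listed file that is absent from a candidate folder.
    for folder in candidates:
        for name in ENCODER_FILES:
            if not (folder / name).is_file():
                problems.append(f"missing {folder / name}")
    return problems
```

Calling `check_ella_models("ComfyUI/models")` from the directory that contains ComfyUI returns an empty list when both directories match the structure above. Models placed in a custom location via extra_model_paths.yaml are not covered by this check.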

:book: Nodes reference

Nodes reference

:mag: Common problems

XXX not implemented for 'Half'. See issue #12
AYS + Ella getting dark image generations. See issue #39
- Check if add_noise of SamplerCustom node is enabled.
- Lower the cfg of SamplerCustom node.
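
The two checks for dark images can also be run against a workflow exported in API format. This is a rough sketch under stated assumptions: the `SamplerCustom` node exposes `add_noise` as a boolean and `cfg` as a number, the helper name `dark_image_hints` is invented here, and the default `cfg_threshold` of 7.0 is an arbitrary illustration value, not a recommendation from the issue.

```python
def dark_image_hints(workflow, cfg_threshold=7.0):
    """Return hints for SamplerCustom nodes that match the checklist above."""
    hints = []
    for node_id, node in workflow.items():
        if node.get("class_type") != "SamplerCustom":
            continue
        inputs = node.get("inputs", {})
        # First check: add_noise should be enabled.
        if inputs.get("add_noise") is False:
            hints.append(f"node {node_id}: add_noise is disabled")
        # Second check: flag a cfg above the (arbitrary) threshold.
        cfg = inputs.get("cfg")
        if isinstance(cfg, (int, float)) and cfg > cfg_threshold:
            hints.append(f"node {node_id}: cfg {cfg} is above {cfg_threshold}")
    return hints


# Made-up single-node workflow that triggers both hints.
sample = {"5": {"class_type": "SamplerCustom",
                "inputs": {"add_noise": False, "cfg": 8.0}}}
for hint in dark_image_hints(sample):
    print(hint)
```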

:memo: TODO

[ ] Support prompt weighting

:hugs: Contributors (direct & indirect)

<table> <tr> <td align="center"><a href="https://github.com/JettHu"><img src="https://avatars.githubusercontent.com/u/35261585?s=460&v=4" width="32px;" alt=""/> JettHu</a></td> <td align="center"><a href="https://github.com/budui"><img src="https://avatars.githubusercontent.com/u/16448529?s=460&v=4" width="32px;" alt=""/> budui</a></td> <td align="center"><a href="https://github.com/kijai"><img src="https://avatars.githubusercontent.com/u/40791699?s=460&v=4" width="32px;" alt=""/> kijai</a></td> <td align="center"><a href="https://github.com/huagetai"><img src="https://avatars.githubusercontent.com/u/1137341?s=460&v=4" width="32px;" alt=""/> huagetai</a></td> </tr> </table>

:yum: Thanks

ComfyUI: https://github.com/comfyanonymous/ComfyUI
Diffusers (borrowed timestep modules): https://github.com/huggingface/diffusers

:wink: Citation

@misc{hu2024ella,
      title={ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment}, 
      author={Xiwei Hu and Rui Wang and Yixiao Fang and Bin Fu and Pei Cheng and Gang Yu},
      year={2024},
      eprint={2403.05135},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}