ComfyUI Extension: ComfyUI_StoryDiffusion

Authored by smthemex

384 stars

Use StoryDiffusion in ComfyUI.

README

<h1> ComfyUI_StoryDiffusion</h1>

Use StoryDiffusion and other methods to generate stories in ComfyUI.

Updates:

  • 2025/04/14
  • Use UNO to place two characters in the same frame in the FLUX workflow; see the figure for a prompt example.
  • Fixed the dual-character prompt error in MS-Diffusion. With MS-Diffusion, character prompts should be written as [A] a (man)... ,[B] a (woman)... . Scene prompts need no change: dual-character mode is still enabled when [A] ... [B] ... appear in the same sentence.
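
The prompt convention above can be sketched in a few lines. This is only an illustration, not the extension's own parser: the helper names `character_prompts` and `is_dual_scene` are hypothetical, and putting one character per line is an assumption.

```python
import re

# Hypothetical helpers illustrating the prompt convention described above;
# the extension's real parsing logic may differ.
ROLE_TAG = re.compile(r"\[([A-Z])\]")


def character_prompts(descriptions: dict[str, str]) -> str:
    """Build the character prompt block, one '[tag] description' line per character (assumed layout)."""
    return "\n".join(f"[{tag}] {text}" for tag, text in descriptions.items())


def is_dual_scene(scene_line: str) -> bool:
    """Return True when a single scene line mentions both [A] and [B]."""
    tags = set(ROLE_TAG.findall(scene_line))
    return {"A", "B"} <= tags


roles = character_prompts({
    "A": "a (man) wearing a black suit",
    "B": "a (woman) wearing a red dress",
})
scenes = [
    "[A] reading a book in a cafe",
    "[A] and [B] walking in the park",
]
print(roles)
for line in scenes:
    print(line, "-> dual" if is_dual_scene(line) else "-> single")
```

Only scene lines that contain both tags are treated as dual-character scenes in this sketch; every other line is a single-character scene.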

Previous

  • Added UNO support. Only the single-file FLUX model (27G) and UNO's LoRA are needed. Please enable FP8 quantization and test with the storydiffusion_workflow.json workflow. Also fixed a bug caused by overly long tokens.
  • Added svdq v0.2 support for InfiniteYou; it works well once your svdq is updated to v0.2 (download wheel).
  • 1. Modified the model loading process and updated to version V2; if you prefer the old one, you can download version V1.0. 2. Please use storydiffusion_workflow.json, which integrates the main workflows. 3. Removed some outdated features.

1 Installation

In the ./ComfyUI/custom_nodes directory, run the following:

git clone https://github.com/smthemex/ComfyUI_StoryDiffusion.git

2 Requirements

pip install -r requirements.txt
  • If you use story (PhotoMaker V2), pulid-flux, kolor, story-maker, or infiniteyou, you also need the insightface library:
pip install insightface
  • If any other module is missing, install it with pip.
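
To see which modules are missing before starting ComfyUI, a small check like the one below may help. It is a sketch, not part of the extension: `missing_modules` is a hypothetical helper, the list of names is illustrative, and the pip package name can differ from the import name.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Illustrative list; extend it with whatever your chosen mode imports.
wanted = ["insightface"]
for name in missing_modules(wanted):
    print(f"missing: {name}  ->  try: pip install {name}")
```

Run it with the same Python interpreter that ComfyUI uses, otherwise the result describes a different environment.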

3 Models

3.1 story_diffusion mode (story only)

  • 3.1.1 Any SDXL checkpoint (single-file model)
├── ComfyUI/models/checkpoints/
|             ├── juggernautXL_v8Rundiffusion.safetensors
├── ComfyUI/models/photomaker/
|             ├── photomaker-v1.bin or photomaker-v2.bin
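
The layout above can be verified with a short script. This is a minimal sketch under stated assumptions: `check_story_layout` and `has_any` are hypothetical helpers, and `ComfyUI/models` is assumed to be the models directory of your install.

```python
from pathlib import Path


def has_any(folder: Path, candidates: list[str]) -> bool:
    """Return True if at least one of the candidate files exists in folder."""
    return any((folder / name).is_file() for name in candidates)


def check_story_layout(models_dir: Path) -> list[str]:
    """Return human-readable problems found in the story_diffusion model layout."""
    problems = []
    checkpoints = models_dir / "checkpoints"
    # Any SDXL checkpoint is acceptable, so only check that a .safetensors file is present.
    if not (checkpoints.is_dir() and any(checkpoints.glob("*.safetensors"))):
        problems.append(f"no .safetensors checkpoint in {checkpoints}")
    if not has_any(models_dir / "photomaker", ["photomaker-v1.bin", "photomaker-v2.bin"]):
        problems.append("photomaker-v1.bin or photomaker-v2.bin not found")
    return problems


# Assumed location; adjust to your own ComfyUI install.
for problem in check_story_layout(Path("ComfyUI/models")):
    print(problem)
```

The script cannot tell whether a checkpoint is actually an SDXL model; it only confirms that the expected files are in place.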

3.2 MS-Diffusion mode (two characters in one image)

├── ComfyUI/models/
|             ├── photomaker/ms_adapter.bin
|             ├── clip_vision/clip_vision_g.safetensors(2.35G) or CLIP-ViT-bigG-14-laion2B-39B-b160k.safetensors(3.43G)
  • 3.2.2 If you use ControlNet with MS-Diffusion, you need the matching ControlNet model (for control_img image preprocessing, please use other nodes):
├── ComfyUI/models/controlnet/   
|     ├──xinsir/controlnet-openpose-sdxl-1.0    
|     ├──... other similar models

3.3 Kolors face mode (IP is no longer supported; the error with newer versions has been fixed)

├── ComfyUI/models
|             ├── /photomaker/ipa-faceid-plus.bin
|             ├── clip/chatglm3-8bit.safetensors
|             ├── clip_vision/clip-vit-large-patch14.safetensors  # Kolors-IP-Adapter-Plus and Kolors-IP-Adapter-FaceID-Plus use the same checkpoint.
  • File structure of the Kolors repo
├── any path/Kwai-Kolors/Kolors
|      ├──model_index.json
|      ├──vae
|          ├── config.json
|          ├── diffusion_pytorch_model.safetensors (rename from diffusion_pytorch_model.fp16.safetensors )
|      ├──unet
|          ├── config.json
|          ├── diffusion_pytorch_model.safetensors (rename from diffusion_pytorch_model.fp16.safetensors )
|      ├──tokenizer
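|          ├── tokenization_chatglm.py # new version, fixes the error with newer diffusers versions
|          ├── ... # all files
|       ├── text_encoder
|          ├── modeling_chatglm.py # new version, fixes the error with newer diffusers versions
|          ├── tokenization_chatglm.py # new version, fixes the error with newer diffusers versions
|          ├── ... # all files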
|       ├── scheduler
|          ├── scheduler_config.json
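
The rename noted for vae/ and unet/ in the tree above can be scripted. The following is a minimal sketch, assuming the fp16 files were downloaded into those two folders; `rename_fp16_weights` is a hypothetical helper and the path in the usage comment is a placeholder.

```python
from pathlib import Path


def rename_fp16_weights(kolors_repo: Path) -> list[Path]:
    """Rename the fp16 weight file in vae/ and unet/ to the name listed in the tree above.

    Returns the new paths of the files that were renamed. An existing target
    file is never overwritten.
    """
    renamed = []
    for sub in ("vae", "unet"):
        src = kolors_repo / sub / "diffusion_pytorch_model.fp16.safetensors"
        dst = kolors_repo / sub / "diffusion_pytorch_model.safetensors"
        if src.is_file() and not dst.exists():
            src.rename(dst)
            renamed.append(dst)
    return renamed


# Usage (placeholder path, replace with your own):
# rename_fp16_weights(Path("path/to/Kwai-Kolors/Kolors"))
```

Calling it a second time is harmless, since folders that already contain the renamed file are skipped.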

3.4 flux_pulid mode

  • torch must be > 0.24.0; optimum-quanto must be >= 0.2.4
pip install -U optimum-quanto 
├── ComfyUI/models/
|             ├── photomaker/pulid_flux_v0.9.0.safetensors
|             ├── clip_vision/EVA02_CLIP_L_336_psz14_s6B.pt
|             ├── diffusion_models/flux1-dev-fp8.safetensors
├── ComfyUI/models/clip/
|             ├── t5xxl_fp8_e4m3fn.safetensors
|             ├── clip_l.safetensors
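
The optimum-quanto requirement stated for this mode can be checked from Python. This is a rough sketch: `version_tuple` and `meets_minimum` are hypothetical helpers, and the comparison looks only at the leading numeric parts of the version, so pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text: str) -> tuple[int, ...]:
    """Turn '0.2.4' or '2.4.0+cu121' into a tuple of leading integers for comparison."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(package: str, minimum: str) -> bool:
    """Return True if the package is installed and its version is at least `minimum`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


print("optimum-quanto >= 0.2.4:", meets_minimum("optimum-quanto", "0.2.4"))
```

If the check prints False, the pip install -U optimum-quanto command above should bring the package up to date.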

3.5 storymake mode
Download mask.bin, buffalo_l, and RMBG-1.4 (all three can be downloaded automatically).

├── ComfyUI/models/
|         ├── photomaker/mask.bin
|         ├── clip_vision/clip_vision_H.safetensors  # 2.4G, based on laion/CLIP-ViT-H-14-laion2B-s32B-b79K
├── ComfyUI/models/buffalo_l/
|         ├── 1k3d68.onnx
|         ├── ...

3.6 InfiniteYou mode

  • 3.6.1 flux transformer repo or kj fp8
├── any_path/FLUX.1-dev/transformer
|          ├── config.json
|          ├──diffusion_pytorch_model-00001-of-00003.safetensors
|          ├──diffusion_pytorch_model-00002-of-00003.safetensors
|          ├──diffusion_pytorch_model-00003-of-00003.safetensors
|          ├── diffusion_pytorch_model.safetensors.index.json

or

├── ComfyUI/models/
|             ├── diffusion_models/flux1-dev-fp8.safetensors #
  • 3.6.2 InfiniteYou ControlNet from here (required, in repo format); you can use sim_stage1 or aes_stage2
├── any_path/sim_stage1/
|         ├── image_proj_model.bin
|         ├── InfuseNetModel/
|             ├── diffusion_pytorch_model-00001-of-00002.safetensors
|             ├── diffusion_pytorch_model-00002-of-00002.safetensors
|             ├── diffusion_pytorch_model.safetensors.index.json
|             ├── config.json

or

├── any_path/aes_stage2/
|         ├── ...
  • 3.6.3 LoRA (optional) from here
  • 3.6.4 insightface
├── ComfyUI/models/antelopev2/   
|     ├──1k3d68.onnx  
|     ├──...
  • 3.6.5 recognition_arcface_ir_se50.pth from here is downloaded automatically; with the embedded-Python ComfyUI build it is stored in the "Lib\site-packages\facexlib\weights" directory
  • 3.6.6 If you use GGUF quantization (optional), download the GGUF file from here and enter its local path in the 'select_method' field of the 'easyfunction_lite' node
├── ComfyUI/models/gguf
|         ├── flux1-dev-Q8_0.gguf  #flux1-dev-Q6_K.gguf
  • 3.6.7 If you use svdquant (optional), download the svdquant repo from here and enter its local path in the 'select_method' field of the 'easyfunction_lite' node
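
The sharded repos in 3.6.1 and 3.6.2 include a diffusion_pytorch_model.safetensors.index.json file. As a minimal sketch, assuming that index follows the usual Hugging Face layout with a "weight_map" object that maps tensor names to shard file names, the following checks that every referenced shard was downloaded. The function name `missing_shards` is hypothetical.

```python
import json
from pathlib import Path

INDEX_NAME = "diffusion_pytorch_model.safetensors.index.json"


def missing_shards(repo_dir: Path, index_name: str = INDEX_NAME) -> list[str]:
    """Return shard file names listed in the index's weight_map that are absent from repo_dir."""
    index = json.loads((repo_dir / index_name).read_text(encoding="utf-8"))
    shards = sorted(set(index["weight_map"].values()))
    return [name for name in shards if not (repo_dir / name).is_file()]


# Usage (placeholder paths, replace with your own):
# print(missing_shards(Path("any_path/FLUX.1-dev/transformer")))
# print(missing_shards(Path("any_path/sim_stage1/InfuseNetModel")))
```

An empty list means all shards named in the index are present; it does not verify that the files are complete or uncorrupted.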

3.7 UNO mode
Download the LoRA dit_lora.safetensors; use fp8 if VRAM is below 24G.

├── ComfyUI/models/
|             ├── diffusion_models/flux1-dev.safetensors  #
|             ├── loras/dit_lora.safetensors # 

4 Example

4.1 story-diffusion

  • txt2img example <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/storytxt2img.png" width="50%">
  • img2img example <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/storytxt2imgv1orv2.png" width="50%">

4.2 ms-diffusion

  • txt2img, two characters in one image <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/msdiffusion_txt2img_2role1img.png" width="50%">
  • img2img, two characters in one image <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/msdiffusion_img2img_2role1img.png" width="50%">

4.3 story-maker or story-and-maker

  • story-and-maker <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/storyandmaker_img2img.png" width="50%">
<img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/storyandmaker_txt2img.png" width="50%">
  • story-maker <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/maker_image2image.png" width="50%">

4.4 consistory

  • Supports only one character; use example.json

4.5 kolor-face

  • img2img kolor face <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/kolor_face.png" width="50%">

4.6 pulid-flux

  • Note: the repo mode shown in the example image has been removed; use the workflow in example.json <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/Flux_PulID.png" width="50%">

4.7 infiniteyou

  • repo nf4: note that the nodes have changed; follow the workflow in example.json <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/infiniteyou.png" width="50%">
  • gguf <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/infinite_gguf.png" width="50%">
  • svdq: works normally after upgrading to v0.2 <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/infinite_svdqv2.png" width="50%">

4.8 UNO
<img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/UNO_N.png" width="50%">

  • dual: example with two characters in one image <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/example_uno_dual.png" width="50%">

4.9 ComfyUI classic (classic ComfyUI mode; it can connect to any ComfyUI-compatible workflow and mainly makes the multi-character CLIP easier to use)

  • any model: SD1.5, SDXL, SD3.5, FLUX... <img src="https://github.com/smthemex/ComfyUI_StoryDiffusion/blob/main/images/comfyui_classic.png" width="50%">

5 Citation

StoryDiffusion

@article{zhou2024storydiffusion,
  title={StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation},
  author={Zhou, Yupeng and Zhou, Daquan and Cheng, Ming-Ming and Feng, Jiashi and Hou, Qibin},
  journal={arXiv preprint arXiv:2405.01434},
  year={2024}
}

IP-Adapter

@article{ye2023ip-adapter,
  title={IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models},
  author={Ye, Hu and Zhang, Jun and Liu, Sibo and Han, Xiao and Yang, Wei},
  journal={arXiv preprint arXiv:2308.06721},
  year={2023}
}

MS-Diffusion

@misc{wang2024msdiffusion,
  title={MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance}, 
  author={X. Wang and Siming Fu and Qihan Huang and Wanggui He and Hao Jiang},
  year={2024},
  eprint={2406.07209},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

photomaker

@inproceedings{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

kolors

@article{kolors,
  title={Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis},
  author={Kolors Team},
  journal={arXiv preprint},
  year={2024}
}

PuLID

@article{guo2024pulid,
  title={PuLID: Pure and Lightning ID Customization via Contrastive Alignment},
  author={Guo, Zinan and Wu, Yanze and Chen, Zhuowei and Chen, Lang and He, Qian},
  journal={arXiv preprint arXiv:2404.16022},
  year={2024}
}

Consistory

@article{tewel2024training,
  title={Training-free consistent text-to-image generation},
  author={Tewel, Yoad and Kaduri, Omri and Gal, Rinon and Kasten, Yoni and Wolf, Lior and Chechik, Gal and Atzmon, Yuval},
  journal={ACM Transactions on Graphics (TOG)},
  volume={43},
  number={4},
  pages={1--18},
  year={2024},
  publisher={ACM New York, NY, USA}
}

infiniteyou

@article{jiang2025infiniteyou,
  title={{InfiniteYou}: Flexible Photo Recrafting While Preserving Your Identity},
  author={Jiang, Liming and Yan, Qing and Jia, Yumin and Liu, Zichuan and Kang, Hao and Lu, Xin},
  journal={arXiv preprint},
  volume={arXiv:2503.16418},
  year={2025}
}

svdquant

@inproceedings{
  li2024svdquant,
  title={SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models},
  author={Li*, Muyang and Lin*, Yujun and Zhang*, Zhekai and Cai, Tianle and Li, Xiuyu and Guo, Junxian and Xie, Enze and Meng, Chenlin and Zhu, Jun-Yan and Han, Song},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}

GGUF FLUX LICENSE

UNO

@article{wu2025less,
  title={Less-to-More Generalization: Unlocking More Controllability by In-Context Generation},
  author={Wu, Shaojin and Huang, Mengqi and Wu, Wenxu and Cheng, Yufeng and Ding, Fei and He, Qian},
  journal={arXiv preprint arXiv:2504.02160},
  year={2025}
}