ComfyUI Extension: Eden.art LoRa Trainer

Maintained by Eden.art, this is a very fast, well-tuned trainer for SDXL and SD15.

Trainer

This trainer was developed by the Eden team; you can try a hosted version of it in our app. It is a highly optimized trainer that can be used both for full finetuning and for training LoRa modules on top of Stable Diffusion, and it uses a single training script and loss module that works for both SDv15 and SDXL!

The outputs of this trainer are fully compatible with ComfyUI and AUTOMATIC1111; see the documentation here. A full guide on training can be found in our docs.

<p align="center">
  <strong>Training images:</strong><br>
  <img src="assets/xander_training_images.jpg" alt="Training images" style="width:80%;"/>
</p>
<p align="center">
  <strong>Generated images with the trained LoRa:</strong><br>
  <img src="assets/xander_generated_images.jpg" alt="Generated images with the trained LoRa" style="width:80%;"/>
</p>

The trainer can be run in 4 different ways:

Using it in ComfyUI:

• Example workflows showing how to run the trainer and do inference with it can be found in /ComfyUI_workflows.
• Importantly, this trainer uses a ChatGPT call to clean up the auto-generated prompts and inject the trainable token. This only works if the root of the repo contains a .env file with your OpenAI key, as shown below. Everything will still work without this, but results will be better if you set it up, especially for the 'face' and 'object' modes.
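
The .env file should contain a single line:

```
OPENAI_API_KEY=your_key_string
```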

    The trainer supports 3 default modes:

• style: used for learning the aesthetic style of a collection of images.
• face: used for learning a specific face (human, character, ...).
• object: used for learning a specific object or thing featured in the training images.
<p align="center">
  <strong>Style training example:</strong><br>
  <img src="assets/style_training_example.jpg" alt="Style training example" style="width:80%;"/>
</p>

Setup

Install all dependencies with:

pip install -r requirements.txt

Then, to start a training job, simply run:

python main.py train_configs/training_args.json

Adjust the arguments inside training_args.json to set up a custom training job.
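
If you want to script config variations, here is a minimal sketch that loads the shipped config, writes a modified copy, and launches a job. The key names inside training_args.json are not documented here, so the tweak shown is purely illustrative; print the keys to see what is actually available:

```python
import json
import subprocess

# Load the stock config shipped with the repo.
with open("train_configs/training_args.json") as f:
    cfg = json.load(f)
print(sorted(cfg))  # inspect which training arguments are available

# Illustrative tweak -- "mode" is a hypothetical key name; pick a real
# key from the printout above (the trainer supports the modes
# "style", "face" and "object").
# cfg["mode"] = "face"

# Write the variant and launch the job, equivalent to:
#   python main.py train_configs/my_training_args.json
with open("train_configs/my_training_args.json", "w") as f:
    json.dump(cfg, f, indent=2)
subprocess.run(["python", "main.py", "train_configs/my_training_args.json"], check=True)
```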


You can also run this through Replicate using cog (a Docker-based packaging tool):

1. Install Replicate's cog:

sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

2. Build the image with cog build
3. Start a training run with sh cog_test_train.sh
4. You can also enter the container with cog run /bin/bash

Full unet finetuning

When running this trainer in native Python, you can also perform full unet finetuning with something like (adjust to your needs):

python main.py train_configs/full_finetuning_example.json
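
To see exactly which arguments distinguish full finetuning from a LoRa run, one option is to diff the two shipped configs. A minimal sketch, assuming both files are flat JSON objects:

```python
import json

# Load the LoRa config and the full-finetuning example config.
with open("train_configs/training_args.json") as f:
    lora_cfg = json.load(f)
with open("train_configs/full_finetuning_example.json") as f:
    ft_cfg = json.load(f)

# Print every argument whose value differs between the two configs.
for key in sorted(set(lora_cfg) | set(ft_cfg)):
    if lora_cfg.get(key) != ft_cfg.get(key):
        print(f"{key}: {lora_cfg.get(key)!r} -> {ft_cfg.get(key)!r}")
```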

TODOs

Bugs:

• Pure textual inversion for SD15 does not seem to work well (but it works amazingly well for SDXL); if anyone can figure this one out, I'd be forever grateful!
• Figure out why training is 3x slower through the ComfyUI node than when running main.py directly as a Python job.
• Fix aspect_ratio bucketing in the dataloader (see https://github.com/kohya-ss/sd-scripts).

Bigger improvements:

• Integrate Flux / SD3.
• Add multi-concept training (multiple concepts represented by multiple tokens, trained into a single LoRa).
• Add stronger token regularization (e.g. the CelebBasis spanning basis).
• Implement Perfusion ideas (key locking with a superclass): https://research.nvidia.com/labs/par/Perfusion/
• Implement prompt-aligned ideas: https://prompt-aligned.github.io/