ComfyUI Extension: ComfyUI_Wan2_1_lora_trainer
A ComfyUI interface adaptation of the musubi-tuner library to train Wan 2.1 LoRAs.
Important note: the sprites used for the training tests are the property of real artists (under copyright), so this is not only about copying existing work; it can be a tool for artists too.
LORA STRENGTH FIXED!
Tested and working with this package:
- ComfyUI 0.3.39
- Python 3.12.10
- Torch 2.7.0 with CUDA 12.8
Tested and confirmed working configurations:
- Regular run with SDPA attention.
- Torch compile (extra nodes) with SDPA attention.
Configurations that do NOT produce a valid LoRA:
- Regular run with Sage attention.
- Torch compile with Sage attention.
Using SDPA attention is strongly recommended to get a valid LoRA. Why? SDPA is the only mode producing a valid LoRA for the moment. Setting Sage as the attention mode just makes the LoRA generate noise, so I need to run more tests to see whether there is an issue or simply an incompatibility when training with Sage (I don't know if Sage only works for generation and not for training; this must be investigated), or whether it is related to tweaking other parameters such as CPU threads (the default is 1 thread and 2 workers). I will update this if I find new information, but if you have any relevant data, don't hesitate to let me know.
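For reference, "SDPA" is PyTorch's built-in scaled dot product attention. A minimal sanity check that the kernel runs on your install (tensor shapes are illustrative, not taken from the trainer):

import torch
import torch.nn.functional as F

# Dummy query/key/value: batch 1, 8 heads, 16 tokens, head dim 64
q = k = v = torch.randn(1, 8, 16, 64)

# This is the operation "SDPA attention" refers to
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])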
Regular run: if you use the regular .bat, you must bypass the compiler and memory settings; this is enough for 1.3B models (attention mode set to SDPA, default parameters already configured for immediate results).
Update version 1.02:
- Raised the max_train_epochs limit to 512.
Update version 1.01:
- Set default learning parameters (a good starting point to see results quickly by just loading the nodes/workflow).
- Added a CPUs-per-process argument.
- Added a max_train_epochs argument to avoid the internal step limit of 2048.
- Fixed scale weight norms.
- Updated the example workflow.
- Updated pictures.
- To update, delete your custom node folder and clone again.
Features:
- ComfyUI LoRA trainer.
- Adaptation of the musubi-tuner library by kohya-ss, from https://github.com/kohya-ss/musubi-tuner.
- Training runs in its own subprocess.
- Code modifications for full compatibility with the latest version of ComfyUI.
- Added 6 nodes for training.
- Added an example workflow in the custom node folder.
- Made for the ComfyUI Windows portable package.
- Automated process for creating the TOML file, caching latents, caching texts and running the training (the nodes are triggered in this order to do the complete process in one shot); see the sketch after this list.
- You can skip the latent and text caches if you need to restart the training (provided the data has not changed: VAE, CLIP Vision, text models).
- For more info about setting up the parameters correctly, check the Wan doc at https://github.com/kohya-ss/musubi-tuner/blob/main/docs/wan.md
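To give an idea of the dataset TOML that gets generated, here is a minimal hand-written sketch. The keys follow musubi-tuner's dataset config documentation; all paths and values are placeholders, not the extension's actual output:

# Minimal sketch of a musubi-tuner style dataset config, written by hand.
# Keys per the musubi-tuner dataset docs; paths/values are placeholders.
config = """\
[general]
resolution = [960, 544]
caption_extension = ".txt"
batch_size = 1
enable_bucket = true

[[datasets]]
image_directory = "C:/training/my_lora/images"
cache_directory = "C:/training/my_lora/cache"
num_repeats = 1
"""

with open("dataset_config.toml", "w", encoding="utf-8") as f:
    f.write(config)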
avr_loss in the last update is looking great (128 epochs, 30 images, 5000+ steps, network dropout 0.0, other settings left at their defaults):
Deconstructed character in pieces:
Result (text-to-video generation):
Also playing with image sequences:
Result with a strength of 0.5:
About max_train_epochs: the effective number of training steps follows from several arguments, such as gradient accumulation, the number of images, etc. Set max_train_epochs between 16 and 512 depending on how many images you want to train on. To make sure a small package of 30 images trains for more than 5000 steps, set it to 128. Keep network dropout in mind to avoid overfitting, and also dim and alpha. Everything is relative, but you will surely find your own settings depending on your purpose. For the moment max_train_epochs is capped at 512, but if a bigger maximum is needed I can update it.
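A back-of-the-envelope sketch of the epochs-to-steps relation, assuming batch size 1 (the exact count also depends on repeats, batch size and bucketing):

# Rough relation between epochs and optimizer steps, assuming batch_size = 1.
num_images = 30
num_repeats = 1      # placeholder; raising this multiplies steps per epoch
batch_size = 1
max_train_epochs = 128

steps_per_epoch = (num_images * num_repeats) // batch_size
total_steps = steps_per_epoch * max_train_epochs
print(total_steps)   # 3840 with these placeholder values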
INSTRUCTIONS:
- Clone this repository into your custom_nodes folder.
- Install requirements.txt from custom_nodes\ComfyUI_Wan2_1_lora_trainer:
..\..\..\python -m pip install -r requirements.txt
- Run ComfyUI.
- Enjoy training.
INSTRUCTIONS TO USE TORCH NODES AND BLOCK SWAP (only compatible with SDPA attention): if you want to use Torch compile, or if your installation does not detect the CL tools from Visual Studio, create a custom .bat that adds a call pointing to your Build Tools (Visual Studio Build Tools must be installed). Example:
@echo off
REM Load the Visual Studio Build Tools environment for the Wan training subprocess (optional; needed for the musubi compile settings and memory nodes)
call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64
REM Start ComfyUI
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
pause
NOTE: this Windows call is needed because the trainer runs in a new subprocess that inherits the ComfyUI environment, but it needs its own Visual Studio environment to work.
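To illustrate the mechanism (this is not the extension's actual code): a child process started from Python inherits the parent's environment by default, so the variables loaded by vcvarsall.bat before ComfyUI starts are visible to the training subprocess:

import os
import subprocess

# The child inherits the parent's environment, including everything
# vcvarsall.bat exported before ComfyUI was launched.
env = os.environ.copy()

# Hypothetical training command; the real node builds its own.
subprocess.run(
    ["python", "train.py", "--config", "dataset_config.toml"],
    env=env,   # passing env explicitly; omitting it has the same effect
    check=True,
)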
CLIP VISION: CLIP Vision is only set up for I2V models; for training T2V models, set clip to None.
Then connect the compiler and memory nodes (choose your desired attention mode):
If you don't have any of these modules, you can disconnect the musubi compile settings node.
Image data input is not exclusive to videos! You can train with images alone, as in the following example (point the path to your images and text captions):
Point the cache path to an empty folder (use a different folder for each LoRA so you don't mix cache data; cleaner and probably faster).
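A small illustrative helper (not part of the extension) to confirm every image has a same-named .txt caption before you cache:

from pathlib import Path

image_dir = Path("C:/training/my_lora/images")  # placeholder path
exts = {".png", ".jpg", ".jpeg", ".webp"}

# Each image should sit next to a caption file with the same stem.
for img in sorted(image_dir.iterdir()):
    if img.suffix.lower() in exts and not img.with_suffix(".txt").exists():
        print(f"Missing caption: {img.name}")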
Performance test with 312 images and default settings (SDPA):
And the results:
https://github.com/user-attachments/assets/41b439ee-6e90-49ac-83dd-d1ba21fd1d63
For any concerns, don't hesitate to contact me.