ComfyUI-DAViD
Custom nodes for DAViD (Data-efficient and Accurate Vision Models from Synthetic Data) models in ComfyUI. These nodes enable depth estimation, surface normal estimation, and soft foreground segmentation for human-centric images.
PLEASE NOTE: As confirmed by the research team, these models were not trained to be temporally stable across video frames, so expect flicker when processing video frame by frame.
🌟 Features
- Multi-Task Processing: Get depth, normal, and foreground masks in a single node
- High Quality: State-of-the-art results for human subjects
- GPU Accelerated: Full ONNX Runtime GPU support
- Batch Processing: Process multiple images efficiently
- Flexible Outputs: Choose between raw outputs or visualization-ready formats
📦 Installation
Method 1: ComfyUI Manager (NOT ADDED YET - USE METHOD 2 FOR NOW)
- Install ComfyUI Manager if you haven't already
- Search for "DAViD" in the ComfyUI Manager
- Click Install
Method 2: Manual Installation
- Navigate to your ComfyUI custom nodes directory:

cd ComfyUI/custom_nodes/

- Clone this repository:

git clone https://github.com/your-username/ComfyUI-DAViD.git

- Install dependencies (before doing this, try launching ComfyUI and checking the terminal output for missing dependencies; install only the ones that are actually missing):

cd ComfyUI-DAViD
pip install -r requirements.txt
📥 Model Download
The DAViD models need to be downloaded separately:
- Create the models directory:

mkdir -p ComfyUI-DAViD/models/david

- Download the multi-task model:

# Multi-task model (recommended - all three tasks in one model)
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/multi-task-model-vitl16_384.onnx -O ComfyUI-DAViD/models/david/multitask-vitl16_384.onnx

- (Optional) Download individual task models:

# Depth estimation models
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/depth-model-vitb16_384.onnx -O ComfyUI-DAViD/models/david/depth-vitb16_384.onnx
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/depth-model-vitl16_384.onnx -O ComfyUI-DAViD/models/david/depth-vitl16_384.onnx
# Surface normal models
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/normal-model-vitb16_384.onnx -O ComfyUI-DAViD/models/david/normal-vitb16_384.onnx
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/normal-model-vitl16_384.onnx -O ComfyUI-DAViD/models/david/normal-vitl16_384.onnx
# Foreground segmentation models
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/foreground-segmentation-model-vitb16_384.onnx -O ComfyUI-DAViD/models/david/foreground-vitb16_384.onnx
wget https://facesyntheticspubwedata.z6.web.core.windows.net/iccv-2025/models/foreground-segmentation-model-vitl16_384.onnx -O ComfyUI-DAViD/models/david/foreground-vitl16_384.onnx
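If wget is not available (for example on Windows), the same files can be fetched with a short Python snippet. This is a minimal sketch for the multi-task model only; the URL and destination path are the same ones used above.

import urllib.request
from pathlib import Path

url = ("https://facesyntheticspubwedata.z6.web.core.windows.net/"
       "iccv-2025/models/multi-task-model-vitl16_384.onnx")
dest = Path("ComfyUI-DAViD/models/david/multitask-vitl16_384.onnx")
dest.parent.mkdir(parents=True, exist_ok=True)  # create models/david if missing
urllib.request.urlretrieve(url, dest)           # download to the expected filename
print(f"Saved {dest} ({dest.stat().st_size / 1e6:.1f} MB)")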
🚀 Usage
DAViD Multi-Task Node
The main node that performs all three tasks in a single inference:
Inputs:
- image: Input image (RGB)
- model_name: Model to use (default: multitask-vitl16_384.onnx)
- inverse_depth: Invert depth values (closer = higher)
- binarize_foreground: Convert soft mask to binary
- foreground_threshold: Threshold for binarization (0.0-1.0); see the sketch after this list
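For intuition, binarize_foreground combined with foreground_threshold behaves like a plain NumPy threshold. The sketch below is illustrative, not the node's exact code:

import numpy as np

def binarize(mask: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    # Turn a soft [0, 1] foreground mask into a hard 0/1 mask
    return (mask >= threshold).astype(np.float32)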
Outputs:
- depth_map: Colored depth visualization (TURBO colormap)
- normal_map: Surface normal map (RGB visualization)
- foreground_rgb: Foreground mask as RGB image
- foreground_mask: Raw foreground mask (single channel)
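Outside ComfyUI, the model can also be driven directly with ONNX Runtime. The sketch below makes several assumptions that this README does not confirm: the 384x384 input size inferred from the filename, /255 normalization, NCHW layout, and the output order. Inspect session.get_inputs() and session.get_outputs() before relying on any of them; person.jpg is a placeholder.

import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "ComfyUI-DAViD/models/david/multitask-vitl16_384.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print([i.name for i in session.get_inputs()])   # check the real input names
print([o.name for o in session.get_outputs()])  # and output names

img = cv2.cvtColor(cv2.imread("person.jpg"), cv2.COLOR_BGR2RGB)
x = cv2.resize(img, (384, 384)).astype(np.float32) / 255.0  # assumed preprocessing
x = x.transpose(2, 0, 1)[None]                              # HWC -> NCHW

outputs = session.run(None, {session.get_inputs()[0].name: x})
depth = outputs[0].squeeze()                                # assumed output order

# TURBO visualization, mirroring the node's depth_map output style
d = ((depth - depth.min()) / (np.ptp(depth) + 1e-8) * 255).astype(np.uint8)
cv2.imwrite("depth_turbo.png", cv2.applyColorMap(d, cv2.COLORMAP_TURBO))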
Example Workflows
Basic Human Processing
Load Image → DAViD Multi-Task → Save Image (depth)
                              → Save Image (normal)
                              → Save Image (foreground)
Background Replacement
Load Image → DAViD Multi-Task → (foreground_mask) → Image Composite (with new background) → Save Image
Depth-based Effects
Load Image → DAViD Multi-Task → (depth_map) → Depth Blur → Save Image
🎯 Use Cases
- Portrait Enhancement: Extract clean foreground masks for background replacement (see the sketch after this list)
- 3D Effects: Use depth maps for bokeh, fog, or depth-of-field effects
- Relighting: Apply new lighting using surface normals
- Virtual Production: Green screen alternative using AI segmentation
- AR/VR: Depth and normal data for 3D reconstruction
- Style Transfer: Use masks to apply effects selectively
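As a concrete example of the background-replacement use case, alpha compositing with the soft foreground mask is just a per-pixel weighted blend. A minimal NumPy/OpenCV sketch, with placeholder file names:

import cv2
import numpy as np

fg = cv2.imread("person.jpg").astype(np.float32)           # original image
bg = cv2.imread("new_background.jpg").astype(np.float32)   # replacement background
mask = cv2.imread("foreground_mask.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
bg = cv2.resize(bg, (fg.shape[1], fg.shape[0]))            # match sizes
alpha = mask[..., None]                                    # HxW -> HxWx1 for broadcasting
out = fg * alpha + bg * (1.0 - alpha)                      # soft per-pixel blend
cv2.imwrite("composited.png", out.astype(np.uint8))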
🛠️ Troubleshooting
ONNX Runtime Issues
If you encounter ONNX Runtime errors:
pip uninstall onnxruntime onnxruntime-gpu
pip install onnxruntime-gpu==1.16.3
CUDA/GPU Issues
Ensure CUDA is properly installed and visible from your Python environment:
import torch
print(torch.cuda.is_available()) # Should return True
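It is also worth confirming that ONNX Runtime itself can see the GPU; if CUDAExecutionProvider is missing from the list, inference will fall back to CPU:

import onnxruntime as ort
print(ort.get_available_providers())  # should include "CUDAExecutionProvider"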
Model Not Found
Ensure models are in the correct directory:
ComfyUI-DAViD/models/david/multitask-vitl16_384.onnx
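A quick check from Python (the path below is the default location used by these nodes):

from pathlib import Path

p = Path("ComfyUI-DAViD/models/david/multitask-vitl16_384.onnx")
print(p.exists(), p.resolve())  # should print True and the absolute model path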
🙏 Acknowledgments
- Original DAViD paper and models by Microsoft Research
- ComfyUI framework by comfyanonymous
📄 License
This custom node implementation is licensed under MIT License. The DAViD models are licensed under their respective licenses (see original repository).
📝 Citation
If you use these nodes in your research, please cite:
@misc{saleh2025david,
title={{DAViD}: Data-efficient and Accurate Vision Models from Synthetic Data},
author={Fatemeh Saleh and others},
year={2025},
eprint={2507.15365},
archivePrefix={arXiv},
primaryClass={cs.CV}
}