ComfyUI Extension: ComfyUI-DepthAnythingV3

Authored by PozzettiAndrea

ComfyUI nodes for Depth Anything V3, the latest depth estimation model from ByteDance. Currently supports monocular depth, camera pose estimation, and 3D point clouds/Gaussians. Models auto-download from HuggingFace (depth-anything org). This is a first draft; let me know if you have any feedback! :)

ComfyUI Depth Anything V3

Custom nodes for Depth Anything V3 integration with ComfyUI.

Example workflows:

  • Simple workflow: simple
  • Advanced workflow: advanced
  • Single image to 3D: 3d
  • Multiple images to 3D: 3d_multiview
  • Single image to mesh: bas_relief_wf
  • Smooth video depth via the multi-view attention node: video

Demo Videos

The multi-view node exposes the cross-view attention feature of the main model series, giving more consistent depth across the frames of a video.

https://github.com/user-attachments/assets/058bd968-aae3-4759-887c-4c98132559f0

You can reconstruct 3D point clouds!

https://github.com/user-attachments/assets/8ef6e74b-c7c7-41e7-b1c6-de44733e6c61

Even from multiple views, with the option to either align them (via ICP) or place them using the predicted camera poses. The point cloud also carries a per-point field indicating which view each point came from.

https://github.com/user-attachments/assets/6892313d-bcd8-44ec-9038-7d4d8915f59e

Description

Depth Anything V3 is the latest depth estimation model that predicts spatially consistent geometry from visual inputs.

Published: November 14, 2025
Paper: Depth Anything 3: Recovering the Visual Space from Any Views

Nodes

Model Loading

Download And Load Depth Anything V3 Model - Downloads and loads a model from HuggingFace

  • Inputs: model (variant), precision (auto/fp16/fp32/bf16)
  • Output: da3_model
  • Models auto-download to ComfyUI/models/depthanything3/
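
As a rough sketch, the auto-download step amounts to something like the following, assuming the huggingface_hub package; the repo naming and folder layout here are assumptions for illustration, not the node's confirmed logic:

```python
# Hypothetical sketch of the auto-download step; repo id and target
# folder are illustrative assumptions, not the node's actual code.
import os
from huggingface_hub import snapshot_download

def ensure_model(variant: str = "DA3-Small") -> str:
    """Download a Depth Anything V3 variant into the ComfyUI models folder."""
    target = os.path.join("ComfyUI", "models", "depthanything3", variant)
    if not os.path.isdir(target):
        snapshot_download(
            repo_id=f"depth-anything/{variant}",  # assumed repo naming
            local_dir=target,
        )
    return target
```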

DA3 Enable Tiled Processing - Configure model for high-resolution images

  • Inputs: da3_model, tile_size, overlap
  • Output: da3_model (with tiled config)
  • Use for 4K+ images to avoid OOM errors
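
Conceptually, tiled processing splits the image into overlapping tiles, infers depth per tile, and blends the overlaps so no single forward pass has to hold the full-resolution image. A minimal numpy sketch, assuming a plain averaged blend (the node's actual blending may differ):

```python
import numpy as np

def tile_coords(length: int, tile: int, overlap: int):
    """Return start offsets so consecutive tiles overlap by `overlap` pixels."""
    step = tile - overlap
    starts = list(range(0, max(length - tile, 0) + 1, step))
    if starts[-1] + tile < length:          # add a final tile to cover the remainder
        starts.append(length - tile)
    return starts

def run_tiled(image: np.ndarray, infer, tile: int = 1024, overlap: int = 128):
    """Run `infer` ((h, w, 3) -> (h, w) depth) per tile, averaging overlaps."""
    h, w = image.shape[:2]
    acc = np.zeros((h, w), np.float32)
    weight = np.zeros((h, w), np.float32)
    for y in tile_coords(h, tile, overlap):
        for x in tile_coords(w, tile, overlap):
            acc[y:y+tile, x:x+tile] += infer(image[y:y+tile, x:x+tile])
            weight[y:y+tile, x:x+tile] += 1.0
    return acc / np.maximum(weight, 1.0)
```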

Inference Nodes

Depth Anything V3 - Unified depth estimation with multiple normalization modes

  • Inputs: da3_model, images, normalization_mode, camera_params (optional), resize_method, invert_depth, keep_model_size
  • Outputs: depth, confidence, ray_origin, ray_direction, extrinsics, intrinsics, sky_mask
  • Normalization Modes:
    • Standard: Original V3 min-max normalization (0-1 range, includes sky in normalization)
    • V2-Style (default): Disparity-based with content-aware contrast - best for ControlNet workflows
      • Sky appears BLACK (like Depth Anything V2)
      • Content-only normalization excludes sky from range calculation
      • Percentile-based contrast enhancement (1st-99th percentile)
      • Enhanced depth gradations via contrast boost
      • Subtle edge anti-aliasing for natural transitions
      • Contribution by Ltamann (TBG)
    • Raw: No normalization, outputs metric depth values for 3D reconstruction/point clouds

Note: The unified node always outputs all available data. Connect only the outputs you need - unused outputs are simply ignored.
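
For intuition, here is a minimal numpy sketch of the V2-Style mode described above, assuming raw depth input where larger values are farther; the node's contrast boost and edge anti-aliasing steps are omitted:

```python
import numpy as np

def v2_style(depth: np.ndarray, sky=None) -> np.ndarray:
    """Disparity-based normalization with content-only percentile contrast.

    depth: (H, W) raw depth, larger = farther. sky: optional (H, W) bool mask.
    """
    disp = 1.0 / np.maximum(depth, 1e-6)          # near = large disparity
    content = disp[~sky] if sky is not None else disp
    lo, hi = np.percentile(content, [1, 99])      # 1st-99th percentile range
    out = np.clip((disp - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    if sky is not None:
        out[sky] = 0.0                            # sky rendered black, as in V2
    return out
```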

Depth Anything V3 (Multi-View) - Process multiple images with cross-view attention

  • Inputs: da3_model, images (batch), normalization_mode, resize_method, invert_depth
  • Outputs: depth, confidence, ray_origin, ray_direction, extrinsics, intrinsics, sky_mask
  • Normalization modes: Standard, V2-Style (default), Raw
  • Use for video frames or multiple angles of same scene
  • Normalizes depth across all views together for temporal/spatial consistency
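
The cross-frame consistency comes from sharing one normalization range over the whole batch rather than computing it per frame; schematically (a sketch, not the node's exact code):

```python
import numpy as np

def normalize_batch(depths: np.ndarray) -> np.ndarray:
    """Min-max normalize an (N, H, W) stack with one shared range,
    so relative depth stays comparable across frames."""
    lo, hi = depths.min(), depths.max()
    return (depths - lo) / max(hi - lo, 1e-6)
```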

3D Processing

DA3 to Point Cloud - Convert depth to point cloud using geometric unprojection

  • Inputs: depth_raw, confidence, sky_mask (optional but recommended), intrinsics (optional), source_image (optional)
  • Output: pointcloud
  • Sky filtering: When sky_mask is connected, sky pixels are automatically excluded from the point cloud
  • Input validation: Automatically detects and rejects normalized depth inputs (requires Raw mode)
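
A minimal sketch of the geometric unprojection under a pinhole camera model, assuming raw metric depth and a 3x3 intrinsics matrix; the node's confidence handling is omitted here:

```python
import numpy as np

def unproject(depth: np.ndarray, K: np.ndarray, sky=None) -> np.ndarray:
    """Lift an (H, W) raw depth map to an (M, 3) point cloud in camera space."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    keep = depth.reshape(-1) > 0
    if sky is not None:
        keep &= ~sky.reshape(-1)                     # drop sky pixels
    return pts[keep]
```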

DA3 to Point Cloud (Multi-View) - Convert multi-view depth maps to a combined point cloud, aligning views with ICP or placing them with the predicted camera poses

DA3 Save Point Cloud - Export to PLY format
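
For reference, ASCII PLY is simple enough to write by hand; a sketch of an exporter for an (M, 3) point array with optional per-point colors (illustrative, not the node's actual exporter):

```python
import numpy as np

def save_ply(path: str, pts: np.ndarray, colors=None) -> None:
    """Write points ((M, 3) float) and optional colors ((M, 3) uint8) as ASCII PLY."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(pts)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        if colors is not None:
            f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for i, p in enumerate(pts):
            line = f"{p[0]} {p[1]} {p[2]}"
            if colors is not None:
                c = colors[i]
                line += f" {c[0]} {c[1]} {c[2]}"
            f.write(line + "\n")
```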

DA3 to 3D Gaussians - Extract 3D Gaussian splats (placeholder)

DA3 Save 3D Gaussians - Export Gaussians to PLY

Camera Utilities

DA3 Create Camera Parameters - Create camera conditioning input

DA3 Parse Camera Pose - Parse camera parameters from JSON

Model Variants

| Model | Size | Features |
|-------|------|----------|
| DA3-Small | 80M | Fast, good quality |
| DA3-Base | 220M | Balanced quality and speed |
| DA3-Large | 350M | High quality, balanced |
| DA3-Giant | 1.15B | Best quality, slower |
| DA3Mono-Large | 350M | Optimized for monocular depth |
| DA3Metric-Large | 350M | Metric depth estimation |
| DA3Nested-Giant-Large | 1.4B | Combined model with metric scaling |

Model Capabilities

Different models support different features:

| Feature | Small/Base/Large/Giant | Mono-Large | Metric-Large | Nested |
|---------|------------------------|------------|--------------|--------|
| Sky Segmentation | ❌ | ✅ | ✅ | ✅ |
| Camera Conditioning | ✅ | ❌ | ❌ | ✅ |
| Multi-View Attention | ✅ | ⚠️ | ⚠️ | ✅ |
| 3D Gaussians | ✅* | ❌ | ❌ | ✅* |
| Ray Maps | ✅ | ❌ | ❌ | ✅ |

  • ✅ = Fully supported
  • ❌ = Not available (returns zeros/ignored)
  • ⚠️ = Works but no cross-view attention benefit (images processed independently)
  • ✅* = Requires fine-tuned model weights (placeholder in current release)

Choose your model based on needs:

  • Need sky masks? → Use Mono/Metric/Nested (required for V2-Style normalization)
  • Need camera conditioning? → Use Main series or Nested
  • Processing video/multi-view? → Use Main series or Nested for consistency
  • Single images only? → Any model works

Workflow Tips

For ControlNet Depth Workflows

  1. Use Mono or Metric models (they provide sky segmentation)
  2. Set normalization_mode to V2-Style (default)
  3. Connect the depth output to your ControlNet node
  4. Enjoy clean depth maps with proper sky handling!

For 3D Reconstruction (Point Clouds)

  1. Use any model (Mono/Metric recommended for sky filtering)
  2. Set normalization_mode to Raw
  3. Connect depth → depth_raw, confidence → confidence, and sky_mask → sky_mask on DA3 to Point Cloud
  4. Sky pixels will be automatically excluded if sky_mask is connected
  5. Important: Point cloud nodes validate input and will raise an error if normalized depth is detected (prevents incorrect 3D output)
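
The check in step 5 can be as simple as a range heuristic; a sketch under that assumption (the node's actual validation may be stricter):

```python
import numpy as np

def assert_raw_depth(depth: np.ndarray) -> None:
    """Heuristic: reject depth that looks normalized (all values inside [0, 1])."""
    if depth.min() >= 0.0 and depth.max() <= 1.0:
        raise ValueError(
            "Depth appears normalized; set normalization_mode to Raw "
            "before building a point cloud."
        )
```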

Community

Questions or feature requests? Open a Discussion on GitHub.

Join the Comfy3D Discord for help, updates, and chat about 3D workflows in ComfyUI.

Credits

License

Model architecture files are based on Depth Anything 3 (Apache 2.0 or CC BY-NC 4.0, depending on the model).

Note: Some models (Giant, Nested) use CC BY-NC 4.0 license (non-commercial use only).