ComfyUI nodes for Depth Anything V3, the latest depth estimation model from ByteDance. For now, monocular depth, camera pose estimation, and 3D point clouds/Gaussians are supported. Models download automatically from HuggingFace (depth-anything org). This is a first draft, so let me know if you have any feedback! :)

ComfyUI-DepthAnythingV3


Create camera parameters for conditioning DA3 depth estimation.

Provides a known camera pose to improve depth estimation accuracy.

Parameters:
- cam_x/y/z: Camera position in world space
- rot_x/y/z: Camera rotation (Euler angles in degrees)
- focal_length: If > 0, uses this value. Otherwise uses fov_degrees.
- fov_degrees: Field of view in degrees (used if focal_length is 0)

Output:
- CAMERA_PARAMS: Dictionary with extrinsics (4x4) and intrinsics (3x3) matrices
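
The focal_length / fov_degrees fallback and the two matrices can be sketched roughly as follows. This is an illustration only, not the node's code: the function names are hypothetical, and it assumes a horizontal field of view, a principal point at the image center, a rotation order of R = Rz * Ry * Rx, and a camera-to-world extrinsics layout. The node may use different conventions.

```python
import math

def make_intrinsics(width, height, focal_length=0.0, fov_degrees=60.0):
    """Build a 3x3 pinhole intrinsics matrix as nested lists.

    Assumptions: fov_degrees is the horizontal FOV and the principal
    point is the image center.
    """
    if focal_length > 0:
        f = focal_length  # explicit focal length (in pixels) wins
    else:
        f = (width / 2.0) / math.tan(math.radians(fov_degrees) / 2.0)
    return [[f, 0.0, width / 2.0],
            [0.0, f, height / 2.0],
            [0.0, 0.0, 1.0]]

def euler_to_rotation(rot_x, rot_y, rot_z):
    """Rotation matrix R = Rz * Ry * Rx from Euler angles in degrees (assumed order)."""
    ax, ay, az = (math.radians(a) for a in (rot_x, rot_y, rot_z))
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def make_extrinsics(cam_x, cam_y, cam_z, rot_x, rot_y, rot_z):
    """4x4 matrix [R | t; 0 0 0 1], assumed here to be camera-to-world."""
    r = euler_to_rotation(rot_x, rot_y, rot_z)
    t = (cam_x, cam_y, cam_z)
    return [r[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

camera_params = {
    "intrinsics": make_intrinsics(640, 480, fov_degrees=90.0),
    "extrinsics": make_extrinsics(0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
}
```

With a 640-pixel-wide image and a 90 degree FOV, the fallback gives a focal length of about 320 pixels, since tan(45 degrees) = 1.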


DA3_CreateCameraParams


Enable tiled processing for memory-efficient inference on high-resolution images.

This node configures the model to process images in tiles with overlapping regions,
then blends the results for seamless output.

Parameters:
- tile_size: Size of each tile (should be multiple of 14 for patch alignment)
- overlap: Overlap between adjacent tiles for smooth blending

Use this when:
- Processing 4K+ resolution images
- GPU memory is limited
- Getting out-of-memory errors

Note: Tiled processing may produce slightly different results at tile boundaries,
but the overlap and blending minimize artifacts.
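
To make the tile_size / overlap interaction concrete, here is a minimal sketch of how tile positions and blending weights could be computed along one image axis. The helper names are hypothetical, the linear feathering is an assumed blending scheme, and the hard error on non-multiples of 14 is stricter than the "should be" recommendation above.

```python
def tile_starts(length, tile_size, overlap):
    """Start offsets of tiles covering `length` pixels along one axis."""
    if tile_size % 14 != 0:
        raise ValueError("tile_size should be a multiple of 14 (patch size)")
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    if length <= tile_size:
        return [0]  # a single tile covers the whole axis
    stride = tile_size - overlap
    starts = list(range(0, length - tile_size, stride))
    starts.append(length - tile_size)  # snap the last tile to the image edge
    return starts

def blend_weights(tile_size, overlap):
    """1D feathering weights: ramp up near each tile edge, flat (1.0) in the middle."""
    weights = []
    for i in range(tile_size):
        edge = min(i + 1, tile_size - i)  # 1-based distance to the nearest edge
        weights.append(min(1.0, edge / (overlap + 1)))
    return weights

xs = tile_starts(2000, 518, 64)
ys = tile_starts(400, 518, 64)
```

A blended result would then be the weighted sum of all tile outputs divided by the sum of weights at each pixel, so pixels in overlap regions transition smoothly from one tile to the next.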


DA3_EnableTiledProcessing


Filter 3D Gaussians from a PLY file and save filtered result.

Connect 'gaussian_ply_path' from DepthAnything_V3 node.

Filtering options:
- filter_sky: Remove Gaussians in sky regions (requires sky_mask from DepthAnything_V3)
- depth_prune_percent: Fraction of Gaussians to keep, closest first by depth (0.9 = keep the closest 90%)
- opacity_threshold: Remove low-opacity Gaussians

Output: Path to filtered PLY file (compatible with SuperSplat, gsplat.js, 3DGS viewers)
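
The three filters can be illustrated on plain Python records. This is only a sketch: the real node reads and writes PLY files, and the dictionary keys (`depth`, `opacity`, `is_sky`) and the order in which the filters run are assumptions made for the example.

```python
def filter_gaussians(gaussians, depth_prune_percent=0.9,
                     opacity_threshold=0.0, filter_sky=False):
    """Return the Gaussians that survive sky, opacity, and depth pruning.

    `gaussians` is a list of dicts with hypothetical keys
    "depth", "opacity", and optionally "is_sky".
    """
    kept = [
        g for g in gaussians
        if g["opacity"] >= opacity_threshold
        and not (filter_sky and g.get("is_sky", False))
    ]
    # Keep only the closest fraction by depth (0.9 keeps the nearest 90%).
    kept.sort(key=lambda g: g["depth"])
    n_keep = round(len(kept) * depth_prune_percent)
    return kept[:n_keep]

scene = [{"depth": float(d), "opacity": 0.5} for d in range(1, 11)]
scene.append({"depth": 500.0, "opacity": 0.9, "is_sky": True})
filtered = filter_gaussians(scene, depth_prune_percent=0.9, filter_sky=True)
```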


DA3_FilterGaussians


Fuse multi-view depth maps into a single world-space point cloud.

Uses predicted camera poses (extrinsics) to transform each view's depth
into a common world coordinate system, then combines all points.

Inputs:
- depths: Batch of depth maps [N, H, W, 3] from Multi-View 3D node
- images: Original images [N, H, W, 3] for RGB colors
- extrinsics: Camera poses JSON from Multi-View node
- intrinsics: Camera intrinsics JSON from Multi-View node
- confidence: Optional confidence maps to filter low-confidence points
- sky_mask: Optional sky segmentation to exclude sky pixels from point cloud
- use_icp: Refine alignment with ICP (slower but potentially more accurate)

Output: Single combined POINTCLOUD in world space.

Note: Requires Main series or Nested model (with camera pose prediction).
Mono/Metric models don't predict camera poses.
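
The core of the fusion step, moving each view's camera-space points into the shared world frame and concatenating them, might look like the sketch below. It assumes the extrinsics are 4x4 world-to-camera matrices (X_c = R * X_w + t); if the node stores camera-to-world matrices instead, the transform is applied directly rather than inverted. Function names are hypothetical, and ICP refinement is not shown.

```python
def camera_to_world(point, w2c):
    """Invert a world-to-camera transform for one point: X_w = R^T (X_c - t)."""
    d = [point[i] - w2c[i][3] for i in range(3)]
    return tuple(sum(w2c[k][i] * d[k] for k in range(3)) for i in range(3))

def fuse_views(views):
    """Merge per-view camera-space points into one world-space list.

    `views` is a list of (points_cam, w2c) pairs. Each output entry is
    (x, y, z, view_id) so the source view can be recovered later.
    """
    fused = []
    for view_id, (points_cam, w2c) in enumerate(views):
        for p in points_cam:
            fused.append(camera_to_world(p, w2c) + (view_id,))
    return fused

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Second camera sits at world x = 1, so its world-to-camera translation is -1.
shifted = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
cloud = fuse_views([([(0.0, 0.0, 2.0)], identity),
                    ([(0.0, 0.0, 2.0)], shifted)])
```

Both cameras see a point 2 units straight ahead, but after fusion the second point lands 1 unit to the side of the first, which is what makes the combined cloud geometrically consistent.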


DA3_MultiViewPointCloud


Parse camera pose from DA3 JSON output.

Extracts camera position and rotation from extrinsics matrix,
and focal lengths from intrinsics matrix.

Inputs:
- extrinsics_json: JSON string from DA3 output
- intrinsics_json: JSON string from DA3 output
- batch_index: Which image's parameters to extract (default 0)

Outputs:
- cam_x/y/z: Camera position in world space
- rot_x/y/z: Camera rotation (Euler angles in degrees)
- fx/fy: Focal lengths
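
As a rough sketch of the extraction, assuming each JSON string holds a list of matrices (one 4x4 extrinsics and one 3x3 intrinsics matrix per image) and that the extrinsics are world-to-camera: the camera position is then C = -R^T t, and the Euler angles are read from the camera-to-world rotation using an R = Rz * Ry * Rx convention. The JSON layout, the matrix convention, and the angle order are all assumptions here, not confirmed details of the node.

```python
import json
import math

def parse_camera_pose(extrinsics_json, intrinsics_json, batch_index=0):
    """Return position, Euler angles (degrees), and focal lengths for one image."""
    E = json.loads(extrinsics_json)[batch_index]
    K = json.loads(intrinsics_json)[batch_index]
    R = [row[:3] for row in E[:3]]
    t = [row[3] for row in E[:3]]
    # Camera center in world space: C = -R^T t (world-to-camera assumed).
    pos = [-sum(R[k][i] * t[k] for k in range(3)) for i in range(3)]
    # Camera-to-world rotation is the transpose of R.
    Rc = [[R[j][i] for j in range(3)] for i in range(3)]
    sin_y = max(-1.0, min(1.0, -Rc[2][0]))  # clamp against rounding error
    return {
        "cam_x": pos[0], "cam_y": pos[1], "cam_z": pos[2],
        "rot_x": math.degrees(math.atan2(Rc[2][1], Rc[2][2])),
        "rot_y": math.degrees(math.asin(sin_y)),
        "rot_z": math.degrees(math.atan2(Rc[1][0], Rc[0][0])),
        "fx": K[0][0], "fy": K[1][1],
    }

ext = json.dumps([[[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]])
intr = json.dumps([[[500.0, 0, 320.0], [0, 510.0, 240.0], [0, 0, 1]]])
pose = parse_camera_pose(ext, intr, batch_index=0)
```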


DA3_ParseCameraPose


Preview point cloud PLY files in 3D using VTK.js (scientific visualization).

Inputs:
- file_path: Path to PLY file (typically from DA3 Save Point Cloud node)
- color_mode:
  - RGB: Show original texture colors from PLY file
  - View ID: Color points by source view (requires view_id in PLY)

Features:
- VTK.js rendering engine
- Trackball camera controls
- Axis orientation widget
- Adjustable point size
- Toggle between RGB and view-based coloring
- Max 2M points

Controls:
- Left Mouse: Rotate view
- Right Mouse: Pan camera
- Mouse Wheel: Zoom in/out
- Slider: Adjust point size


DA3_PreviewPointCloud


Save point cloud to PLY file.

Always saves:
- Original RGB colors (if available)
- view_id as custom property (if available from multi-view fusion)
- Confidence values (if available)

Use DA3 Preview Point Cloud to visualize with different color modes.

Output directory: ComfyUI/output/
Returns file path for use with ComfyUI 3D viewer.
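
For reference, a PLY file carrying colors, a custom view_id property, and confidence values could be laid out as in the sketch below. It produces an ASCII PLY as a string for readability; whether the node writes ASCII or binary PLY, and the exact property names and order it uses, are assumptions. `format_ply` is a hypothetical helper.

```python
def format_ply(points, colors=None, view_ids=None, confidence=None):
    """Return ASCII PLY text for a point cloud with optional per-point extras."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
    ]
    if colors is not None:
        header += ["property uchar red", "property uchar green", "property uchar blue"]
    if view_ids is not None:
        header.append("property int view_id")  # custom property
    if confidence is not None:
        header.append("property float confidence")  # custom property
    header.append("end_header")

    rows = []
    for i, (x, y, z) in enumerate(points):
        fields = [f"{x:.6f}", f"{y:.6f}", f"{z:.6f}"]
        if colors is not None:
            fields += [str(int(c)) for c in colors[i]]
        if view_ids is not None:
            fields.append(str(int(view_ids[i])))
        if confidence is not None:
            fields.append(f"{confidence[i]:.6f}")
        rows.append(" ".join(fields))
    return "\n".join(header + rows) + "\n"

ply_text = format_ply([(0.0, 0.0, 1.0)], colors=[(255, 0, 0)], view_ids=[2])
```

Viewers that do not know a custom property such as view_id generally ignore it, which is why the same file can serve both standard viewers and the view-based coloring mode.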


DA3_SavePointCloud


Convert DA3 depth map to textured 3D mesh (GLB format).

Uses grid-based triangulation to create a clean mesh from the depth map.
Automatically filters invalid regions (sky, low confidence, depth discontinuities).

Inputs:
- depth_raw: Metric depth map (from DepthAnything_V3 with normalization_mode="Raw")
- confidence: Confidence map
- intrinsics: Camera intrinsics (REQUIRED - connect from DepthAnything_V3)
- sky_mask: Sky segmentation (recommended - excludes sky from mesh)
- source_image: Source image for mesh texture

Parameters:
- confidence_threshold: Filter vertices below this confidence
- depth_edge_threshold: Skip triangles across large depth jumps (prevents artifacts)
- downsample: Reduce mesh density (higher = fewer triangles, faster)
- filename_prefix: Output filename prefix

Output: GLB file path
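
Grid-based triangulation with depth-edge filtering can be sketched as follows: every 2x2 block of neighboring pixels becomes two triangles, unless a corner is invalid or the depth jump inside the block exceeds the threshold. The function name is hypothetical, and treating depth_edge_threshold as an absolute depth difference is an assumption (the node may use a relative measure).

```python
def grid_triangles(depth, valid, depth_edge_threshold):
    """Return triangles as tuples of flat vertex indices (row * width + col).

    depth: 2D list of depth values; valid: 2D list of booleans
    (False for sky or low-confidence pixels).
    """
    h, w = len(depth), len(depth[0])
    triangles = []
    for v in range(h - 1):
        for u in range(w - 1):
            corners = [(v, u), (v, u + 1), (v + 1, u), (v + 1, u + 1)]
            if not all(valid[r][c] for r, c in corners):
                continue  # skip cells touching filtered pixels
            ds = [depth[r][c] for r, c in corners]
            if max(ds) - min(ds) > depth_edge_threshold:
                continue  # skip cells that straddle a depth discontinuity
            i00, i01, i10, i11 = (r * w + c for r, c in corners)
            triangles.append((i00, i10, i01))
            triangles.append((i01, i10, i11))
    return triangles

depth = [[1.0, 1.0, 5.0],
         [1.0, 1.0, 5.0]]
valid = [[True, True, True],
         [True, True, True]]
tris = grid_triangles(depth, valid, depth_edge_threshold=0.5)
```

Here the left cell is flat and yields two triangles, while the right cell spans a jump from depth 1 to depth 5 and is dropped, which avoids the stretched "rubber sheet" faces that otherwise connect foreground to background.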


DA3_ToMesh


Convert DA3 depth map to 3D point cloud using proper camera geometry.
Uses geometric unprojection: P = K^(-1) * [u, v, 1]^T * depth

Inputs:
- depth_raw: Metric depth map (from DepthAnything_V3 with normalization_mode="Raw")
- confidence: Confidence map
- intrinsics: (Optional) Camera intrinsics JSON from DepthAnything_V3
  ⚠️ If not provided, uses estimated intrinsics (may cause warping)
- sky_mask: (Optional but RECOMMENDED) Sky segmentation - excludes sky from point cloud
- source_image: (Optional) Source image for point colors

Parameters:
- confidence_threshold: Filter points below this confidence (0-1)
- downsample: Take every Nth pixel (5 = 1/25th of points, faster)

Output POINTCLOUD contains:
- points: Nx3 array of 3D coordinates
- colors: Nx3 array of RGB colors (if source_image provided)
- confidence: Nx1 array of confidence values
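
For a pinhole camera with focal lengths fx, fy and principal point (cx, cy), the unprojection formula above reduces to x = (u - cx) * d / fx, y = (v - cy) * d / fy, z = d. A minimal sketch in plain Python, with a hypothetical function name and nested lists standing in for image tensors:

```python
def depth_to_points(depth, fx, fy, cx, cy,
                    confidence=None, confidence_threshold=0.0, downsample=1):
    """Unproject a 2D depth map into a list of (x, y, z) camera-space points."""
    points = []
    for v in range(0, len(depth), downsample):        # every Nth row
        for u in range(0, len(depth[0]), downsample):  # every Nth column
            d = depth[v][u]
            if d <= 0:
                continue  # no valid depth at this pixel
            if confidence is not None and confidence[v][u] < confidence_threshold:
                continue  # drop low-confidence pixels
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points

depth = [[2.0] * 10 for _ in range(10)]
all_points = depth_to_points(depth, fx=10.0, fy=10.0, cx=5.0, cy=5.0)
sparse = depth_to_points(depth, fx=10.0, fy=10.0, cx=5.0, cy=5.0, downsample=5)
```

Because downsample skips pixels along both axes, a value of 5 keeps 1/25th of the points: the 10x10 map above goes from 100 points to 4.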


DA3_ToPointCloud


Unified Depth Anything V3 node - all outputs, multiple normalization modes.

**Normalization Modes:**
- Standard: Original V3 min-max normalization (0-1 range, includes sky)
- V2-Style: Disparity-based with content-aware contrast (default, best for ControlNet)
  - Sky appears BLACK (like V2)
  - Content-only normalization with percentile-based contrast
  - Enhanced depth gradations via contrast boost
  - Subtle edge anti-aliasing for natural transitions
  - Contribution by Ltamann (TBG)
- Raw: No normalization, outputs metric depth (for 3D reconstruction/point clouds)

**Outputs (always available):**
- depth: Depth map (normalized or raw, depending on mode)
- confidence: Confidence map (normalized 0-1)
- ray_origin: Ray origin maps (for 3D, normalized for visualization)
- ray_direction: Ray direction maps (for 3D, normalized for visualization)
- extrinsics: Camera extrinsics (predicted camera pose)
- intrinsics: Camera intrinsics (predicted camera parameters)
- sky_mask: Sky segmentation (1=sky, 0=non-sky, Mono/Metric/Nested models only)
- gaussian_ply_path: Path to raw 3D Gaussians PLY (Giant model only, empty string if not supported)

**Optional Inputs:**
- camera_params: Connect DA3_CreateCameraParams for camera-conditioned estimation
- resize_method: How to handle patch size alignment (resize/crop/pad)
- invert_depth: Toggle output convention. OFF (default): close=bright. ON: far=bright.
- keep_model_size: Keep model's native output size instead of resizing back

**Note:** Ray maps and camera parameters are only available for Main series models.
The sky mask is only available for Mono/Metric/Nested models.

Connect only the outputs you need - unused outputs are simply ignored.
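
The difference between the Standard and V2-Style normalization modes described above can be illustrated with a simplified sketch. It works on flat lists of depth values, uses assumed 1st/99th percentiles, and leaves out the contrast boost, edge anti-aliasing, and the invert_depth convention; the function names are hypothetical and this is not the contributed implementation.

```python
def normalize_standard(depth):
    """Min-max normalization of all values (sky included) to the 0-1 range."""
    lo, hi = min(depth), max(depth)
    if hi == lo:
        return [0.0] * len(depth)
    return [(d - lo) / (hi - lo) for d in depth]

def normalize_v2_style(depth, sky, lo_pct=1.0, hi_pct=99.0):
    """Disparity-based normalization over non-sky content; sky maps to 0 (black)."""
    disparity = sorted(1.0 / d for d, s in zip(depth, sky) if not s and d > 0)
    if not disparity:
        return [0.0] * len(depth)

    def percentile(p):
        # Nearest-rank percentile on the sorted content disparities.
        return disparity[round((len(disparity) - 1) * p / 100.0)]

    lo, hi = percentile(lo_pct), percentile(hi_pct)
    out = []
    for d, s in zip(depth, sky):
        if s or d <= 0 or hi == lo:
            out.append(0.0)
        else:
            out.append(min(1.0, max(0.0, (1.0 / d - lo) / (hi - lo))))
    return out

depth = [1.0, 2.0, 4.0, 100.0]
sky = [False, False, False, True]
standard = normalize_standard(depth)
v2 = normalize_v2_style(depth, sky)
```

In the Standard result the very distant sky pixel stretches the range, so the three content pixels are squeezed near one end. In the V2-Style result the sky is excluded and forced to 0, and the content pixels spread across the full range with the closest one brightest.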


DepthAnything_V3


Multi-view Depth Anything V3 - processes multiple images TOGETHER with cross-view attention.

Key difference from standard nodes:
- Standard: Processes images one-by-one (sequential, independent)
- Multi-view: Processes all images together (cross-attention, geometrically consistent)

Use this for:
- Video frames (temporal consistency)
- Multiple angles of same scene (SfM/reconstruction)
- Stereo pairs (left/right cameras)

**Normalization Modes:**
- Standard: Original V3 min-max normalization (0-1 range)
- V2-Style: Disparity-based with content-aware contrast (default, best for ControlNet)
  - Sky appears BLACK, content-only normalization
  - Contribution by Ltamann (TBG)
- Raw: No normalization, outputs metric depth (for 3D reconstruction)

**Optional Inputs:**
- resize_method: How to handle patch size alignment (resize/crop/pad)
- invert_depth: Toggle output convention. OFF (default): close=bright. ON: far=bright.
- keep_model_size: Keep model's native output size instead of resizing back (intrinsics stay accurate)

Input: Batch of images [N, H, W, 3]
Outputs (all normalized across views together for consistency):
- depth: Batch of consistent depth maps [N, H, W, 3]
- confidence: Confidence maps [N, H, W, 3]
- ray_origin: Ray origin maps (for 3D, normalized for visualization)
- ray_direction: Ray direction maps (for 3D, normalized for visualization)
- extrinsics: Predicted camera poses for each view (JSON)
- intrinsics: Camera intrinsics for each view (JSON) - auto-scaled if resized
- sky_mask: Sky segmentation [N, H, W] (Mono/Metric/Nested only)
- resized_rgb_image: RGB images matching depth output dimensions
- gaussian_ply_path: Path to raw 3D Gaussians PLY (Giant model only, empty string if not supported)

Note: All images must have the same resolution.
A higher N uses more VRAM but gives better cross-view consistency.


DepthAnythingV3_MultiView


Models download automatically from HuggingFace to `ComfyUI/models/depthanything3`.

Supports all DA3 variants including Small, Base, Large, Giant, Mono, Metric, and Nested models.