ComfyUI Extension: ComfyUI Batch BBox Detector

Authored by rslosh

Created about a month ago

Updated 25 days ago

1 stars

Process batches of images (e.g., video frames) with ultralytics bbox detection in ComfyUI. This custom node package extends the standard BboxDetectorCombined to handle batches efficiently, making it ideal for processing 300+ video frames.

Custom Nodes (0)

README

ComfyUI Batch BBox Detector

Process batches of images (e.g., video frames) with ultralytics bbox detection in ComfyUI. This custom node package extends the standard BboxDetectorCombined to handle batches efficiently, making it ideal for processing 300+ video frames.

Features

Batch Processing: Handle multiple images in a single operation
Memory Efficient: Chunked processing prevents out-of-memory errors on large batches
Three Implementations: Choose the best approach for your workflow
Error Handling: Robust processing that doesn't crash on individual frame failures
ComfyUI Integration: Seamless integration with existing ImpactPack nodes

Installation

Method 1: ComfyUI Manager (Recommended when available)

Open ComfyUI Manager
Search for "Batch BBox Detector"
Click Install

Method 2: Manual Installation

Navigate to your ComfyUI custom_nodes directory:
```
cd ComfyUI/custom_nodes/
```

Clone or copy this repository:

git clone https://github.com/yourusername/comfyui_bbox_batch.git

Or manually create the directory structure:

mkdir comfyui_bbox_batch
cd comfyui_bbox_batch

Copy the following files:
- __init__.py
- nodes/ directory (with all its files)
- README.md
Restart ComfyUI

Node Types

1. BBox Detector (Batch)

Best for: Small to medium batches (< 100 frames)

Standard batch processor that processes each frame sequentially and returns stacked results.

Inputs:

bbox_detector (BBOX_DETECTOR): The detector model
images (IMAGE): Batch of images [B, H, W, C]
threshold (FLOAT): Detection confidence threshold (0.0-1.0, default: 0.5)
dilation (INT): Mask expansion amount (-512 to 512, default: 4)
return_type (OPTIONAL): "mask_only" or "image_with_boxes"

Outputs:

images (IMAGE): Processed images [B, H, W, C]
masks (MASK): Detection masks [B, H, W]

2. BBox Detector (Batch ForEach)

Best for: List-based workflows, when memory efficiency is critical

Uses ComfyUI's INPUT_IS_LIST pattern for more memory-efficient processing in certain workflows.

Inputs: Same as standard batch processor

Outputs: Lists of images and masks

3. BBox Detector (Batch Chunked) ⭐ RECOMMENDED

Best for: Large batches (300+ frames), video processing

Processes batches in chunks with memory management between chunks. This is the recommended node for video frame processing.

Inputs:

Same as standard batch processor, plus:
chunk_size (INT): Number of frames per chunk (1-128, default: 32)

Outputs: Same as standard batch processor

Memory Management:

Automatically clears CUDA cache between chunks
Progress tracking for large batches
Prevents OOM errors on long videos

Usage Examples

Basic Video Frame Processing

Workflow:
Video Loader → Video to Image Batch → BBox Detector (Batch Chunked) → Output

Recommended Settings for 300+ frames:

threshold: 0.5 (adjust based on detection quality needs)
dilation: 4 (increase for larger mask coverage)
chunk_size: 16-32 (lower for less VRAM, higher for faster processing)

Workflow with Mask Processing

Workflow:
Video Loader → Video to Image Batch → BBox Detector (Batch Chunked) → MaskToImage → Save Image
                                    ↓
                                  Save Mask

Integration with ImpactPack

Workflow:
Load Checkpoint → UltralyticsDetectorProvider → BBox Detector (Batch Chunked)
                                              ↓
Image Batch ──────────────────────────────────┘

Performance Tips

Memory Optimization

Adjust Chunk Size:
- 8GB VRAM: chunk_size = 8-16
- 12GB VRAM: chunk_size = 16-32
- 24GB+ VRAM: chunk_size = 32-64
Image Resolution:
- Lower resolution reduces memory usage
- Consider downscaling before detection if appropriate
Enable Mixed Precision:
- If your detector supports it, enable FP16/mixed precision

Speed Optimization

Use Appropriate Node:
- Small batches (< 50): Use standard BboxDetectorBatch
- Large batches (> 100): Use BboxDetectorBatchChunked
Batch Processing:
- Larger chunk sizes = faster processing (if memory allows)
- Balance between speed and stability
GPU Utilization:
- Ensure CUDA is available and properly configured
- Monitor GPU usage with nvidia-smi

Parameters Explained

threshold (Float, 0.0-1.0)

Controls detection confidence. Higher values = more confident detections but may miss objects.

0.3-0.4: Detect more objects, more false positives
0.5 (default): Balanced detection
0.6-0.8: Fewer detections, higher confidence

dilation (Int, -512 to 512)

Expands or contracts the detection mask.

Negative values: Contract mask (useful for precise boundaries)
0: No change
Positive values: Expand mask (useful for including surrounding context)
4 (default): Slight expansion for better coverage

chunk_size (Int, 1-128)

Number of frames to process before clearing memory cache.

Lower (8-16): Better memory efficiency, slower processing
Higher (32-64): Faster processing, more memory usage
32 (default): Good balance for most systems

Troubleshooting

Out of Memory Errors

Problem: RuntimeError: CUDA out of memory

Solutions:

Reduce chunk_size to 8 or 16
Lower input image resolution
Close other GPU-intensive applications
Use BboxDetectorBatchChunked instead of standard batch

Empty Masks

Problem: All masks are black/empty

Solutions:

Lower threshold (try 0.3-0.4)
Verify detector model is loaded correctly
Check if objects are present in images
Ensure image format is correct (RGB, proper range)

Slow Processing

Problem: Processing takes too long

Solutions:

Increase chunk_size (if memory allows)
Use BboxDetectorBatch for small batches
Reduce image resolution
Verify GPU is being utilized (check with nvidia-smi)

Detection Quality Issues

Problem: Poor detection results

Solutions:

Adjust threshold value
Try different bbox detector models
Ensure images are properly preprocessed
Check image quality and resolution

Technical Details

Tensor Formats

IMAGE tensors: [B, H, W, C] - Batch, Height, Width, Channels
MASK tensors: [B, H, W] - Batch, Height, Width
All tensors use the same device as input images

Error Handling

Individual frame failures are caught and logged
Failed frames return empty masks and original images
Processing continues even if some frames fail
Error messages indicate which frame failed and why

Memory Management

CUDA cache is cleared between chunks (chunked processor)
Tensors are kept on the same device throughout processing
Intermediate results are properly cleaned up

Compatibility

ComfyUI: Latest version
Dependencies:
- PyTorch
- NumPy
- Ultralytics (for bbox detection models)
- ImpactPack (for BBOX_DETECTOR type)

Development

Project Structure

comfyui_bbox_batch/
├── __init__.py                            # Package initialization with node mappings
├── nodes/
│   ├── __init__.py                        # Nodes package initialization
│   ├── bbox_batch_detector.py             # Standard batch processor
│   ├── bbox_batch_detector_foreach.py     # ForEach list-based processor
│   └── bbox_batch_detector_chunked.py     # Chunked processor for large batches
└── README.md                              # This file

Node Architecture

Each node class follows ComfyUI's node pattern:

INPUT_TYPES: Defines input parameters
RETURN_TYPES: Defines output types
FUNCTION: Name of the processing method
CATEGORY: Node category in UI

License

MIT License - See LICENSE file for details

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

Support

Issues: Report bugs on GitHub Issues
Discussions: Ask questions in GitHub Discussions
ComfyUI Discord: Get help in the custom nodes channel

Changelog

Version 1.0.0 (Current)

Initial release
Three node implementations (Standard, ForEach, Chunked)
Support for batches of 300+ frames
Memory-efficient chunked processing
Comprehensive error handling

Credits

Inspired by BboxDetectorCombined from ImpactPack
Built for the ComfyUI community
Ultralytics for bbox detection capabilities

Acknowledgments

Thanks to the ComfyUI and ImpactPack communities for their excellent work and support.