ComfyUI Extension: Stable Virtual Camera

Authored by Pablerdo

1 stars

Generative View Synthesis with Diffusion Models

Custom Nodes (0)

    README

    <div align="center"> <h1>Stable Virtual Camera: Generative View Synthesis with Diffusion Models</h1>

    <a href="https://stable-virtual-camera.github.io"><img src="https://img.shields.io/badge/%F0%9F%8F%A0%20Project%20Page-gray.svg"></a> <a href="http://arxiv.org/abs/2503.14489"><img src="https://img.shields.io/badge/%F0%9F%93%84%20arXiv-2503.14489-B31B1B.svg"></a> <a href="https://stability.ai/news/introducing-stable-virtual-camera-multi-view-video-generation-with-3d-camera-control"><img src="https://img.shields.io/badge/%F0%9F%93%83%20Blog-Stability%20AI-orange.svg"></a> <a href="https://huggingface.co/stabilityai/stable-virtual-camera"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model_Card-Huggingface-orange"></a> <a href="https://huggingface.co/spaces/stabilityai/stable-virtual-camera"><img src="https://img.shields.io/badge/%F0%9F%9A%80%20Gradio%20Demo-Huggingface-orange"></a> <a href="https://www.youtube.com/channel/UCLLlVDcS7nNenT_zzO3OPxQ"><img src="https://img.shields.io/badge/%F0%9F%8E%AC%20Video-YouTube-orange"></a>

    Jensen (Jinghao) Zhou*, Hang Gao* <br> Vikram Voleti, Aaryaman Vasishta, Chun-Han Yao, Mark Boss <br> Philip Torr, Christian Rupprecht, Varun Jampani <br> <sub>Stability AI, University of Oxford, UC Berkeley</sub>

    </div> <p align="center"> <img src="assets/spiral.gif" width="100%" alt="Teaser" style="border-radius:10px;"/> </p> <p align="center" border-radius="10px"> <img src="assets/benchmark.png" width="100%" alt="teaser_page1"/> </p>

    Overview

    Stable Virtual Camera (Seva) is a 1.3B-parameter generalist diffusion model for Novel View Synthesis (NVS). Given any number of input views and target cameras, it generates 3D-consistent novel views of a scene.

    :tada: News

    • March 2025 - Stable Virtual Camera is out everywhere.

    :wrench: Installation

    git clone --recursive https://github.com/Stability-AI/stable-virtual-camera
    cd stable-virtual-camera
    pip install -e .
    

    Please note that you will need python>=3.10 and torch>=2.6.0.
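
The version requirements above can be checked before installing. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_requirements` are hypothetical, and the parser is deliberately simple (it keeps only the leading numeric components and ignores local or pre-release suffixes).

```python
import sys
from importlib import metadata

# Minimums taken from the note above.
MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 6, 0)


def parse_version(text):
    """Turn a string such as '2.6.0+cu124' into (2, 6, 0).

    Hypothetical helper: keeps leading numeric components only.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '2.6' compares equal to '2.6.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python %d.%d is older than 3.10" % sys.version_info[:2])
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("torch is not installed")
    else:
        if parse_version(torch_version) < MIN_TORCH:
            problems.append("torch %s is older than 2.6.0" % torch_version)
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

The script reads the installed torch version from package metadata rather than importing torch, so it runs quickly even in an environment where torch is missing.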

    Check INSTALL.md for other dependencies if you want to use our demos or develop from this repo. Windows users should use WSL, since flash attention isn't yet supported on native Windows.

    :open_book: Usage

    You need to authenticate with Hugging Face to download our model weights. Once authentication is set up, our code downloads the weights automatically on first run. You can authenticate by running:

    # This will prompt you to enter your Hugging Face credentials.
    huggingface-cli login
    

    Once authenticated, go to our <a href="https://huggingface.co/stabilityai/stable-virtual-camera">model card</a> and enter your information to request access.
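
To confirm that a login is in place before the first run, a small standard-library check can help. This is a sketch under assumptions, not the repository's own logic: it assumes the default `huggingface_hub` conventions (the `HF_TOKEN` environment variable, or a token file at `$HF_HOME/token` with `HF_HOME` defaulting to `~/.cache/huggingface`), ignores other overrides such as `XDG_CACHE_HOME`, and the function name `has_hf_token` is hypothetical. It only reports whether a token is present; it does not verify that the token is valid or that access to the model has been granted.

```python
import os
from pathlib import Path


def has_hf_token(env=None):
    """Return True if a Hugging Face token appears to be configured.

    Assumes default huggingface_hub locations; see the note above.
    """
    env = os.environ if env is None else env

    # 1. A token passed directly through the environment.
    if env.get("HF_TOKEN", "").strip():
        return True

    # 2. A token file written by `huggingface-cli login`.
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    token_path = Path(env.get("HF_TOKEN_PATH") or Path(hf_home) / "token")
    if token_path.is_file():
        return bool(token_path.read_text().strip())
    return False


if __name__ == "__main__":
    if not has_hf_token():
        print("No Hugging Face token found; run `huggingface-cli login` first.")
```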

    We provide two demos for you to interact with Stable Virtual Camera.

    :rocket: Gradio demo

    The Gradio demo is a GUI that requires no expert knowledge and is suitable for general users. Simply run:

    python demo_gr.py
    

    For a more detailed guide, follow GR_USAGE.md.

    :computer: CLI demo

    The CLI demo lets you pass in more options and control the model in a fine-grained way, which suits power users and academic researchers. A typical command looks like:

    python demo.py --data_path <data_path> [additional arguments]
    

    For a more detailed guide, follow CLI_USAGE.md.
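
When several scenes need processing, the command above can be wrapped in a short driver. The sketch below is illustrative only: it assumes one scene per subdirectory of a root folder (a hypothetical layout; CLI_USAGE.md describes what `--data_path` actually expects), uses only the `--data_path` flag shown above, and the function names are made up for this example. It should be run from the repository root so that `demo.py` resolves.

```python
import subprocess
import sys
from pathlib import Path


def build_demo_command(data_path, extra_args=()):
    """Return the argv list for one demo.py run on a single scene."""
    return [sys.executable, "demo.py", "--data_path", str(data_path), *extra_args]


def run_on_scenes(scene_root, extra_args=(), dry_run=True):
    """Run (or, with dry_run=True, just print) one command per scene folder.

    Assumes each subdirectory of scene_root is one scene (hypothetical layout).
    """
    commands = [
        build_demo_command(scene_dir, extra_args)
        for scene_dir in sorted(Path(scene_root).iterdir())
        if scene_dir.is_dir()
    ]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the batch at the first failing scene.
            subprocess.run(command, check=True)
    return commands
```

Calling `run_on_scenes("path/to/scenes")` prints the commands without executing them; pass `dry_run=False` to launch the runs one after another.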

    If you are interested in benchmarking NVS models from the command line, check benchmark, which contains details about the scenes, splits, and input/target views reported in the <a href="http://arxiv.org/abs/2503.14489">paper</a>.

    :question: Q&A

    • Training script? See issues https://github.com/Stability-AI/stable-virtual-camera/issues/27 and https://github.com/Stability-AI/stable-virtual-camera/issues/42.
    • License for the output? See issue https://github.com/Stability-AI/stable-virtual-camera/issues/26. The output follows the same non-commercial license.

    :books: Citing

    If you find this repository useful, please consider giving it a star :star: and citing our work.

    @article{zhou2025stable,
        title={Stable Virtual Camera: Generative View Synthesis with Diffusion Models},
        author={Jensen (Jinghao) Zhou and Hang Gao and Vikram Voleti and Aaryaman Vasishta and Chun-Han Yao and Mark Boss and
        Philip Torr and Christian Rupprecht and Varun Jampani
        },
        journal={arXiv preprint arXiv:2503.14489},
        year={2025}
    }