ComfyUI Extension: ComfyUI_AudioTools

Authored by billwuhao

11 stars

A ComfyUI node containing multiple audio processing tools.

    README

    ComfyUI Nodes for Audio Processing

    Audio is the bridge connecting text, video, and images. Videos without audio or text are bland. This project currently includes the following main nodes:

    • Automatically add subtitles to videos
    • Crop audio to an arbitrary time range
    • Adjust audio volume, speed, pitch, echo, etc.
    • Remove silent parts from audio
    • Record audio
    • Embed audio watermarks
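The time-based cropping listed above comes down to converting a start and end time into sample indices. The following is a minimal sketch of that idea only, not the node's actual implementation; the function name `crop_samples` and its parameters are hypothetical, and the audio is assumed to be a plain sequence of mono samples.

```python
def crop_samples(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds).

    Hypothetical helper for illustration. Times outside the clip are
    clamped to its boundaries instead of raising an error.
    """
    if end_s < start_s:
        raise ValueError("end_s must not be earlier than start_s")
    total = len(samples)
    # Convert seconds to sample indices, then clamp into [0, total].
    start = min(max(int(round(start_s * sample_rate)), 0), total)
    end = min(max(int(round(end_s * sample_rate)), 0), total)
    return samples[start:end]


if __name__ == "__main__":
    clip = list(range(8))  # one second of fake audio at 8 samples per second
    print(crop_samples(clip, 8, 0.25, 0.75))  # [2, 3, 4, 5]
```

Clamping keeps the helper forgiving when the requested end time runs past the end of the clip; a stricter version could raise an error instead.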

    Examples:

    1. Add subtitles to a video:

    2. Combine with ComfyUI_EraX-WoW-Turbo for automatic speech recognition, then add subtitles to the video:

    3. Combine with ComfyUI_EraX-WoW-Turbo, ComfyUI_gemmax, ComfyUI_SparkTTS, and ComfyUI-LatentSyncWrapper for automatic speech recognition, translation, voice cloning, lip sync, and subtitle addition (see workflow-examples for the detailed example workflow):

    4. Crop audio to an arbitrary time range:

    5. Adjust audio volume, speed, pitch, echo, etc.:

    6. Silence removal and recording:

    7. Audio watermark embedding (with embedding disabled, an existing watermark is detected automatically):

    • To use this node, download all SilentCipher models and place them in the ComfyUI\models\TTS\SilentCipher\44_1_khz\73999_iteration directory.
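To confirm the models ended up in the right place, a small check like the one below may help. It is only a sketch: the sub-path comes from the note above, while the ComfyUI root location and the helper names (`silentcipher_model_dir`, `list_model_files`) are assumptions made for illustration. It does not know which files SilentCipher expects, so it only lists what is there.

```python
from pathlib import Path

# Sub-path taken from the note above; joined with a ComfyUI root supplied by the caller.
MODEL_SUBDIR = Path("models") / "TTS" / "SilentCipher" / "44_1_khz" / "73999_iteration"


def silentcipher_model_dir(comfyui_root):
    """Build the expected model directory under a given ComfyUI root."""
    return Path(comfyui_root) / MODEL_SUBDIR


def list_model_files(comfyui_root):
    """Return sorted file names in the model directory, or [] if it is missing."""
    model_dir = silentcipher_model_dir(comfyui_root)
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    root = Path("ComfyUI")  # assumed location of the ComfyUI folder; adjust as needed
    files = list_model_files(root)
    if files:
        print("Found model files:", ", ".join(files))
    else:
        print("No model files found in", silentcipher_model_dir(root))
```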

    📣 Updates

    [2025-03-28]⚒️: Added watermark embedding node.

    [2025-03-26]⚒️: Released version v1.0.0.

    Installation

    Install sox and add it to the system PATH.
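A quick way to verify the sox step is to ask Python whether the executable is visible on PATH. This is a minimal sketch using the standard library's `shutil.which`; the helper name `find_sox` is made up for illustration.

```python
import shutil


def find_sox():
    """Return the full path of the sox executable, or None if it is not on PATH."""
    return shutil.which("sox")


if __name__ == "__main__":
    sox_path = find_sox()
    if sox_path is None:
        print("sox was not found on PATH; install it, then reopen the terminal.")
    else:
        print(f"sox found at: {sox_path}")
```

Run it with the same Python that ComfyUI uses, since that process needs to be able to see sox.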

    cd ComfyUI/custom_nodes
    git clone https://github.com/billwuhao/ComfyUI_AudioTools.git
    cd ComfyUI_AudioTools
    pip install -r requirements.txt
    
    # For the portable build, use its embedded Python instead (adjust the path to python_embeded for your install)
    ./python_embeded/python.exe -m pip install -r requirements.txt