ComfyUI Extension: ComfyUI_AudioTools

Authored by billwuhao

Created

Updated

46 stars

A ComfyUI node containing multiple audio processing tools.

Custom Nodes (0)

    README

    δΈ­ζ–‡|English

    ComfyUI Nodes for Audio Processing and Related Tasks

    πŸ“£ Updates

    [2025-06-03]βš’οΈ: v1.2.0. Added Music/Vocal Separation, Vocal Extraction, and Audio Merging nodes. Download models TIGER-speech, TIGER-DnR, and place the entire folders into the models\TTS directory.

    [2025-05-27]βš’οΈ: Added Audio Denoising and Enhancement node. Download model last_best_checkpoint.pt and place it into the models\TTS\MossFormer2_SE_48K directory.

    [2025-05-23]βš’οΈ: Fixed logic issues with the Pause node. Now, the Pause node will pause on the first execution when connected in series or parallel. On subsequent executions, it will automatically pass if the preceding nodes have not changed.

    [2025-04-28]βš’οΈ: Audio Loading, with custom loading paths including subdirectories.

    [2025-04-26]βš’οΈ: Pause workflow anywhere.

    [2025-04-25]βš’οΈ: String Editing.

    [2025-03-28]βš’οΈ: Added Watermark Embedding node.

    [2025-03-26]βš’οΈ: Released version v1.0.0.

    πŸ“– Introduction

    Audio acts as a bridge connecting text, video, and images. A video without audio or text is tasteless. This project currently includes the following main nodes:

    • Music/Vocal Separation, Vocal Extraction, Audio Merging, Audio Concatenation
    • Audio Denoising and Enhancement
    • Pause workflow anywhere
    • Audio Loading, with custom loading paths including subdirectories
      • Please rename the extra_help_file.yaml.example file to extra_help_file.yaml, uncomment # , and add custom loading directories like audios_dir: D:\AIGC\ComfyUI-Data\audios_input. For Linux, use /.
    • String Editing.
    • Automatic Video Subtitling
    • Audio Trimming at Arbitrary Time Markers
    • Audio Volume, Speed, Pitch, Echo Processing, etc.
    • Remove Silent Parts from Audio
    • Audio Recording
    • Audio Watermark Embedding

    Examples:

    • Music/Vocal Separation:

    • Vocal Separation and Extraction:

    • Merge Audio:

    • Denoising and Enhancement:

    • Audio Loading:

    • String Editing.

    • Add Subtitles to Video:

    • Trim Audio at Arbitrary Time Markers:

    • Audio Volume, Speed, Pitch, Echo Processing, etc.:

    • Audio Recording and Remove Silent Parts:

    • Audio Watermark Embedding (Embedding disabled, if watermark exists, it will be automatically detected):

    • To use this node, download all models from SilentCipher and place them into the ComfyUI\models\TTS\SilentCipher\44_1_khz\73999_iteration directory.

    Installation

    Install sox and add it to your system's PATH.

    cd ComfyUI/custom_nodes
    git clone https://github.com/billwuhao/ComfyUI_AudioTools.git
    cd ComfyUI_AudioTools
    pip install -r requirements.txt
    
    # python_embeded
    ./python_embeded/python.exe -m pip install -r requirements.txt
    

    Acknowledgments