ComfyUI Extension: HunyuanVideo-Foley Audio Generator
ComfyUI wrapper nodes for HunyuanVideo-Foley: Generate audio from video + text prompts
ComfyUI-HunyuanVideoFoley
A ComfyUI custom node integrating the inference pipeline of HunyuanVideo-Foley, enabling audio generation from video + text prompts, plus audio-video merging utilities.
Features
- đĩ Auto model download on first use (from Hugging Face)
- đŦ Video-aware audio generation guided by text prompts
- đ§ Audio-video merging to produce a final video with an audio track
Nodes
1) HunyuanVideo-Foley Generate Audio
Generate an AUDIO tensor from a VIDEO input and a text prompt.
Inputs:
- `video`: VIDEO (LoadVideo / CreateVideo output)
- `prompt`: text prompt describing the desired audio
- `model`: model setup (auto-detected under `models/hunyuan_foley`)
- `guidance_scale`: float, default 4.5
- `num_inference_steps`: int, default 50
- `device`: auto / cpu / cuda / mps
- `gpu_id`: CUDA device id

Output:
- `audio`: `{ "waveform": [B=1, C=1, T], "sample_rate": int }`
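The AUDIO output follows ComfyUI's standard audio convention: a dict holding a `[batch, channels, samples]` waveform plus a sample rate. A minimal sketch of consuming that dict downstream (in ComfyUI the waveform is a torch tensor; a plain nested list stands in here so the example stays dependency-free):

```python
# Sketch of the AUDIO dict contract: waveform has shape [B=1, C=1, T].
def audio_duration_seconds(audio: dict) -> float:
    """Return the clip length in seconds implied by an AUDIO dict."""
    waveform = audio["waveform"]       # [batch][channel][sample]
    num_samples = len(waveform[0][0])  # T
    return num_samples / audio["sample_rate"]

# A 48 kHz, 2-second silent clip as an illustrative stand-in.
audio = {"waveform": [[[0.0] * 96000]], "sample_rate": 48000}
print(audio_duration_seconds(audio))  # 2.0
```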
2) Video Audio Merger
Merge AUDIO and VIDEO into a single video file with an audio track.
Inputs:
- `video`: VIDEO input
- `audio`: AUDIO input
- `audio_sync_mode`: `stretch` | `loop` | `truncate` | `pad_silence`
- `video_codec`: `copy` | `libx264` | `libx265`
- `audio_codec`: `aac` | `mp3` | `copy`
- `quality`: `high` | `medium` | `low`

Output:
- `merged_video`: VIDEO
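The sync modes decide how a length mismatch between the audio and video tracks is resolved. The sketch below builds an ffmpeg command line for some of these modes; the flag choices are illustrative assumptions, not the node's actual implementation (`stretch` is omitted since it needs a computed `atempo` filter):

```python
def build_merge_cmd(video_path, audio_path, out_path,
                    audio_sync_mode="pad_silence",
                    video_codec="copy", audio_codec="aac"):
    """Assemble an ffmpeg command merging an audio track onto a video (sketch)."""
    cmd = ["ffmpeg", "-y", "-i", video_path, "-i", audio_path]
    if audio_sync_mode == "loop":
        # Re-read the audio input indefinitely, then cut at video length.
        cmd[4:4] = ["-stream_loop", "-1"]
        cmd += ["-shortest"]
    elif audio_sync_mode == "truncate":
        cmd += ["-shortest"]                 # stop at the shorter stream
    elif audio_sync_mode == "pad_silence":
        cmd += ["-af", "apad", "-shortest"]  # pad audio with silence, cut at video end
    cmd += ["-map", "0:v:0", "-map", "1:a:0",
            "-c:v", video_codec, "-c:a", audio_codec, out_path]
    return cmd

print(build_merge_cmd("in.mp4", "foley.wav", "out.mp4"))
```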
Quick Start
LoadVideo → HunyuanVideoFoleyGenerateAudio → VideoAudioMerger → SaveVideo
                        ↑                            ↑
                   text prompt                  audio input
- Load a video (LoadVideo)
- Generate audio via "HunyuanVideo-Foley Generate Audio"
- Merge audio + video via "Video Audio Merger"
- Save or preview the final video
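The same chain can be expressed as an API-format workflow for ComfyUI's `/prompt` endpoint. The class and input names below are inferred from this README and are assumptions; verify them against the actual node definitions before use:

```python
import json

# Hypothetical API-format workflow wiring the four nodes together.
# Each link is ["source_node_id", output_index].
workflow = {
    "1": {"class_type": "LoadVideo",
          "inputs": {"file": "input.mp4"}},
    "2": {"class_type": "HunyuanVideoFoleyGenerateAudio",
          "inputs": {"video": ["1", 0],
                     "prompt": "footsteps on gravel, light wind"}},
    "3": {"class_type": "VideoAudioMerger",
          "inputs": {"video": ["1", 0], "audio": ["2", 0],
                     "audio_sync_mode": "pad_silence"}},
    "4": {"class_type": "SaveVideo",
          "inputs": {"video": ["3", 0]}},
}
print(json.dumps(workflow, indent=2))
```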
Dependencies
# Use your ComfyUI Python environment
D:\AI\ComfyUI-aki-v1.3\.ext\python.exe -m pip install ffmpeg-python
Notes
- On first run, models will be downloaded automatically (internet required)
- FFmpeg is required for audio/video processing
- CUDA is recommended for best performance
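Since FFmpeg must be reachable on the PATH for the merge step, a quick preflight check can save a failed run (illustrative helper, not part of the node pack):

```python
import shutil
import subprocess

def ffmpeg_available() -> bool:
    """Return True if an ffmpeg binary is reachable on PATH."""
    return shutil.which("ffmpeg") is not None

if ffmpeg_available():
    # Print the first line of `ffmpeg -version` as a sanity check.
    out = subprocess.run(["ffmpeg", "-version"],
                         capture_output=True, text=True)
    print(out.stdout.splitlines()[0])
else:
    print("ffmpeg not found; install it and add it to PATH")
```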