ComfyUI Extension: KittenTTS Node for Voice Generation
Ultra-lightweight text-to-speech model with just 15 million parameters
Custom Nodes (0)
README
KittenTTS Node for Voice Generation
This Python class defines a custom node KittenTTS
intended for generating audio from text using a selection of predefined voices. It is categorized under "utils"
and integrates with a node-based workflow system.
Description
- The
KittenTTS
node accepts a text string and a voice selection from a fixed list of available voice models. - It calls the
generate_audio
function (imported from the local.generate_voice
module) to produce synthesized speech audio. - The node outputs the generated audio data, making it usable in downstream audio processing or playback nodes.
Input Parameters
-
text (
STRING
):
The input text that will be converted to speech. -
voice (
STRING
):
The voice model to use for synthesis. Allowed values are:expr-voice-2-m
expr-voice-2-f
expr-voice-3-m
expr-voice-3-f
expr-voice-4-m
expr-voice-4-f
expr-voice-5-m
expr-voice-5-f
Return Value
- Returns a tuple containing one item of type
AUDIO
— the synthesized speech audio.
Integration Details
INPUT_TYPES
class method defines the input interface, listing required parameters and valid voice options.RETURN_TYPES
specifies the output type as"AUDIO"
.FUNCTION
names the methodapply_generate_voice
to be called when executing this node.CATEGORY
groups this node under"utils"
.NODE_CLASS_MAPPINGS
links the identifier"KittenTTS"
to theKittenTTS
class.NODE_DISPLAY_NAME_MAPPINGS
sets the display name as"Generate voice"
for UI presentation.
Usage Example
kitten_tts_node = KittenTTS()
audio_output, = kitten_tts_node.apply_generate_voice(
text="Hello, how are you today?",
voice="expr-voice-3-f"
)
# `audio_output` contains the generated audio data, ready for playback or saving.