ComfyUI Extension: KittenTTS Node for Voice Generation
Ultra-lightweight text-to-speech model with just 15 million parameters
KittenTTS Node for Voice Generation
The `KittenTTS` Python class defines a custom ComfyUI node that generates audio from text using one of a fixed set of predefined voices. The node is categorized under "utils".
Description
- The `KittenTTS` node accepts a text string and a voice selection from a fixed list of available voice models.
- It calls the `generate_audio` function (imported from the local `.generate_voice` module) to produce synthesized speech audio.
- The node outputs the generated audio data, making it usable in downstream audio processing or playback nodes.
Input Parameters
- text (`STRING`): The input text that will be converted to speech.
- voice (`STRING`): The voice model to use for synthesis. Allowed values are:
  - `expr-voice-2-m`
  - `expr-voice-2-f`
  - `expr-voice-3-m`
  - `expr-voice-3-f`
  - `expr-voice-4-m`
  - `expr-voice-4-f`
  - `expr-voice-5-m`
  - `expr-voice-5-f`
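Because the voice must be one of the eight names listed above, a caller may want to check a value before passing it to the node. The following is a minimal sketch, not part of the extension: `AVAILABLE_VOICES` and `validate_voice` are hypothetical names, and the list is built on the assumption that the names follow the `expr-voice-<n>-<m|f>` pattern seen in the list.

```python
# Voice names as listed in this README, generated from the visible pattern
# expr-voice-<n>-<m|f> for n = 2..5. AVAILABLE_VOICES is a hypothetical name.
AVAILABLE_VOICES = [
    f"expr-voice-{n}-{g}" for n in range(2, 6) for g in ("m", "f")
]


def validate_voice(voice: str) -> str:
    """Hypothetical helper: return `voice` unchanged, or raise ValueError."""
    if voice not in AVAILABLE_VOICES:
        raise ValueError(
            f"Unknown voice {voice!r}; expected one of: "
            + ", ".join(AVAILABLE_VOICES)
        )
    return voice
```

Calling `validate_voice("expr-voice-3-f")` returns the string unchanged, while a name outside the list raises `ValueError` with the allowed values in the message.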
Return Value
- Returns a tuple containing one item of type `AUDIO`: the synthesized speech audio.
Integration Details
- The `INPUT_TYPES` class method defines the input interface, listing required parameters and valid voice options.
- `RETURN_TYPES` specifies the output type as "AUDIO".
- `FUNCTION` names the method `apply_generate_voice` to be called when executing this node.
- `CATEGORY` groups this node under "utils".
- `NODE_CLASS_MAPPINGS` links the identifier "KittenTTS" to the `KittenTTS` class.
- `NODE_DISPLAY_NAME_MAPPINGS` sets the display name as "Generate voice" for UI presentation.
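The pieces described above can be sketched as a single module. This is an illustration under stated assumptions, not the extension's actual source: the real node imports `generate_audio` from its local `.generate_voice` module, whose signature and return format are not documented here, so a stand-in function is defined to keep the example runnable. The `VOICES` name and the `{"multiline": True}` option on the text input are also assumptions.

```python
# Sketch of the node layout described in this README (not the actual source).

# Hypothetical constant holding the voice names listed under Input Parameters.
VOICES = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]


def generate_audio(text, voice):
    """Stand-in for the real function from the local generate_voice module.

    Assumption: the real signature and return format may differ. This stub
    only echoes its inputs so the sketch runs without the model installed.
    """
    return {"text": text, "voice": voice}


class KittenTTS:
    @classmethod
    def INPUT_TYPES(cls):
        # Required inputs: free text plus a choice from the fixed voice list.
        return {
            "required": {
                "text": ("STRING", {"multiline": True}),  # option is assumed
                "voice": (VOICES,),
            }
        }

    RETURN_TYPES = ("AUDIO",)          # one output of type AUDIO
    FUNCTION = "apply_generate_voice"  # method invoked when the node executes
    CATEGORY = "utils"                 # menu group for the node

    def apply_generate_voice(self, text, voice):
        audio = generate_audio(text, voice)
        # Node outputs are returned as a tuple matching RETURN_TYPES.
        return (audio,)


# Registration tables read by the host application.
NODE_CLASS_MAPPINGS = {"KittenTTS": KittenTTS}
NODE_DISPLAY_NAME_MAPPINGS = {"KittenTTS": "Generate voice"}
```

The trailing comma in `return (audio,)` matters: it makes the return value a one-element tuple, which is why callers unpack it with `audio_output, = ...`.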
Usage Example
kitten_tts_node = KittenTTS()
audio_output, = kitten_tts_node.apply_generate_voice(
    text="Hello, how are you today?",
    voice="expr-voice-3-f"
)
# `audio_output` contains the generated audio data, ready for playback or saving.