ComfyUI Extension: ComfyUI-LexTools

Authored by SOELexicon


ComfyUI-LexTools

ComfyUI-LexTools is a Python-based image processing and analysis toolkit that uses machine learning models for semantic image segmentation, image scoring, and image captioning. The toolkit includes three primary components:

  1. ImageProcessingNode.py - Implements various image processing nodes such as:

    • ImageAspectPadNode: Expands the image to meet a specific aspect ratio. This node is useful for maintaining the aspect ratio when processing images.
      • Inputs:
        • Required: image (IMAGE), aspect_ratio (RATIO), invert_ratio (BOOLEAN), feathering (INT), left_padding (INT), right_padding (INT), top_padding (INT), bottom_padding (INT)
        • Optional: show_on_node (INT)
      • Output: Expanded Image.
    • ImageScaleToMin: Calculates the scale factor needed to resize an image so that its smallest dimension (width or height) becomes 512 pixels. This is useful for scaling images down or up to 512 for faster processing.
      • Input: image (IMAGE)
      • Output: Scale value.
    • ImageRankingNode: Ranks the images based on specific criteria.
      • Input: score (INT), prompt (STRING), image_path (STRING), json_file_path (STRING)
      • Output: Ranked images.
    • ImageFilterByIntScoreNode and ImageFilterByFloatScoreNode: Filter images based on a threshold score. Note that these nodes may currently throw errors if the downstream node in the sequence does not handle empty outputs.
      • Input: score (INT for ImageFilterByIntScoreNode and FLOAT for ImageFilterByFloatScoreNode), threshold (FLOAT), image (IMAGE)
      • Output: Filtered images.
    • ImageQualityScoreNode: Calculates a quality score for the image.
      • Input: aesthetic_score (INT), image_score_good (INT), image_score_bad (INT), ai_score_artificial (INT), ai_score_human (INT), weight_good_score (INT), weight_aesthetic_score (INT), weight_bad_score (INT), weight_AIDetection (INT), MultiplyScoreBy (INT), show_on_node (INT), weight_HumanDetection (INT)
      • Output: Quality score.
    • ScoreConverterNode: Converts the score to different data types.
      • Input: score (SCORE)
      • Output: Converted score.

    Additional nodes from GitHub Pages - These have been modified to improve performance and add an option to store the model in RAM, which significantly reduces generation time:

    • CalculateAestheticScore: An optimized version of the original, with an option to keep the model loaded in RAM. (No specific input or output detailed in the provided code)
    • AesthetlcScoreSorter: Sorts the images by score. (No specific input or output detailed in the provided code)
    • AesteticModel: Loads the aesthetic model. (No specific input or output detailed in the provided code)
  2. ImageCaptioningNode.py - Implements nodes for image captioning and classification:

    • ImageCaptioningNode: Provides a caption for the image.
      • Input: image (IMAGE)
      • Output: String caption.
    • FoodCategoryNode: Classifies the food category of an image.
      • Input: image (IMAGE)
      • Output: String category.
    • AgeClassifierNode: Classifies the age of a person in the image.
      • Input: image (IMAGE)
      • Output: String age range.
    • ImageClassifierNode: General image classification.
      • Input: image (IMAGE), show_on_node (BOOL)
      • Output: String label, artificial_prob (INT), human_prob (INT)
    • ClassifierNode: A generic classifier node.
      • Input: image (IMAGE)
      • Output: String label.
  3. SegformerNode.py - Handles semantic segmentation of images. It includes various nodes such as:

    • SegformerNode: Performs segmentation of the image.
      • Input: image (IMAGE), model_name (STRING), show_on_node (BOOL)
      • Output: Segmented image.
    • SegformerNodeMasks: Provides masks for the segmented images.
      • Input: No specific input detailed in the provided code.
      • Output: Image masks.
    • SegformerNodeMergeSegments: Merges certain segments in the segmented image.
      • Input: image (IMAGE), segments_to_merge (STRING), model_name (STRING), blur_radius (INT), dilation_radius (INT), intensity (INT), ceiling (INT), show_on_node (BOOL)
      • Output: Image with merged segments.
    • SeedIncrementerNode: Increments the seed used for random processes.
      • Input: seed (INT), increment_at (INT)
      • Output: Incremented seed.
    • StepCfgIncrementNode: Calculates the step configuration for the process.
      • Input: seed (INT), cfg_start (INT), steps_start (INT), img_steps (INT), max_steps (INT)
      • Output: Calculated step configuration.
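
As an illustration of what ImageScaleToMin computes, the sketch below derives a scale factor so that an image's smallest side becomes 512 pixels. This is a hypothetical helper written for this README, not the node's actual implementation; the function name `scale_to_min` is an assumption.

```python
from PIL import Image

def scale_to_min(image: Image.Image, target: int = 512) -> float:
    """Return the factor that rescales the image's smallest side to `target` pixels."""
    return target / min(image.width, image.height)

# A 1024x768 image needs a factor of 512/768; applying it yields
# roughly 683x512, so the smallest dimension becomes 512.
img = Image.new("RGB", (1024, 768))
print(round(scale_to_min(img), 3))  # → 0.667
```

Feeding this value into a generic image-scale node reproduces the "scale down to 512 or up to 512" behavior described above.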
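
The aspect-ratio padding idea behind ImageAspectPadNode can be sketched as follows: grow the canvas (never crop) until the requested ratio is met, centering the original image. This is a simplified stand-in, omitting the node's feathering and per-side padding inputs; `pad_to_aspect` is a hypothetical name.

```python
from PIL import Image

def pad_to_aspect(image: Image.Image, aspect_w: int = 16, aspect_h: int = 9) -> Image.Image:
    """Pad an image onto a centered canvas matching aspect_w:aspect_h."""
    target = aspect_w / aspect_h
    w, h = image.size
    if w / h < target:                        # too narrow: widen the canvas
        new_w, new_h = round(h * target), h
    else:                                     # too short: heighten the canvas
        new_w, new_h = w, round(w / target)
    canvas = Image.new(image.mode, (new_w, new_h))
    canvas.paste(image, ((new_w - w) // 2, (new_h - h) // 2))
    return canvas

# A square 512x512 image padded to 16:9 lands on a 910x512 canvas.
print(pad_to_aspect(Image.new("RGB", (512, 512))).size)  # → (910, 512)
```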
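
SegformerNodeMergeSegments takes its segments_to_merge input as a string. One plausible way to apply such a string to a segmentation label map is shown below; this is a guess at the general technique (collapsing several segment ids into one), not the node's actual code, and `merge_segments` is a hypothetical helper.

```python
import numpy as np

def merge_segments(label_map: np.ndarray, segments_to_merge: str, target: int = 0) -> np.ndarray:
    """Collapse the comma-separated segment ids in `segments_to_merge` into `target`."""
    ids = [int(s) for s in segments_to_merge.split(",")]
    merged = label_map.copy()
    merged[np.isin(merged, ids)] = target     # rewrite every listed id in one pass
    return merged

labels = np.array([[1, 2],
                   [3, 4]])
print(merge_segments(labels, "2,3"))  # ids 2 and 3 both become 0
```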

Requirements

The project requires Python and primarily uses the following libraries:

  • Torch
  • Transformers
  • Pillow (PIL)
  • Matplotlib
  • NumPy
  • SciPy

(The io module used by the code is part of the Python standard library and needs no separate installation.)

Installation

To install the necessary libraries, run:

pip install torch transformers pillow matplotlib numpy scipy

Contributing

Contributions to this project are welcome. If you find a bug or think of a feature that would benefit the project, please open an issue. If you'd like to contribute code, please open a pull request.