ComfyUI Extension: ComfyUI_pixtral_vision

Authored by ShmuelRonen

Created 10 months ago

Updated 8 months ago

16 stars

The ComfyUI_pixtral_vision is a powerful ComfyUI node designed to integrate seamlessly with the Mistral Pixtral API. It facilitates the analysis of images through deep learning models, interpreting and describing the visual content. Users can input an image directly and provide prompts for context, utilizing an API key for authentication.

Custom Nodes (3)

README

Update 20 nov

add preview_text node

Update 28 sep

Add maximum_tokens option

Update 25 sep

Add multiply images input node - 'Multi Images Input'

ComfyUI_pixtral_vision

The ComfyUI_pixtral_vision is a powerful ComfyUI node designed to integrate seamlessly with the Mistral Pixtral API. It facilitates the analysis of images through deep learning models, interpreting and describing the visual content. Users can input an image directly and provide prompts for context, utilizing an API key for authentication.

Overview

The ComfyUI_pixtral_vision node integrates with the Mistral Pixtral API to provide advanced image analysis capabilities within the ComfyUI framework. This node allows users to upload images and receive descriptive insights generated by deep learning models. It is particularly useful for applications requiring detailed visual understanding and content description.

Features

Image Analysis: Analyze images using the state-of-the-art Pixtral 12B model.
Dynamic Interactions: Adjust the randomness of responses with a temperature control.
Secure API Integration: Utilizes an API key for authenticated access to the Mistral Pixtral API.

Installation

To install the ComfyUI_pixtral_vision node, follow these steps:

Clone the repository:

git clone https://github.com/yourusername/ComfyUI_pixtral_vision.git

Navigate to the cloned directory:
```
cd ComfyUI_pixtral_vision
```
Install the required dependencies:
```
pip install -r requirements.txt
```

Getting the free API Key

Visit Mistral AI and sign up or log into your account.
Navigate to the API section and follow the instructions to generate a new API key.
Once you have your API key, enter it into the node configuration as described in the setup instructions.

Usage

To use the node, input an image and a prompt describing what you are looking for in the image. Adjust the temperature setting as needed to control the response's randomness.

Credits

This project utilizes the Mistral Pixtral API. For more detailed information about the API, visit the official documentation.

References

Mistral Pixtral API Documentation: Mistral Docs
ComfyUI Framework: ComfyUI GitHub

For support, feature requests, or contributions, please visit the project's GitHub page.


This README includes a technical description of the node, installation instructions, guidance on obtaining an API key, usage instructions, and links to relevant resources. Adjust the GitHub URLs and any specific instructions according to your actual repository and setup details.