UNPKG

@lobehub/chat

Version:

Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.

37 lines (23 loc) 1.95 kB
--- title: Enhancing Multimodal Interaction with Visual Recognition Models description: >- Explore how LobeChat integrates visual recognition capabilities into large language models, enabling multimodal interactions for enhanced user experiences. tags: - Visual Recognition - Multimodal Interaction - Large Language Models - LobeChat - Custom Model Configuration --- # Visual Model User Guide The ecosystem of large language models that support visual recognition is becoming increasingly rich. Starting from `gpt-4-vision`, LobeChat now supports various large language models with visual recognition capabilities, enabling LobeChat to have multimodal interaction capabilities. <Video alt={'Visual Model Usage'} src={'https://github.com/user-attachments/assets/1c6b4975-bfc3-4470-a934-558ff7a16941'} /> ## Image Input If the model you are currently using supports visual recognition, you can input image content by uploading a file or dragging the image directly into the input box. The model will automatically recognize the image content and provide feedback based on your prompts. <Image alt={'Image Input'} src={'https://github.com/user-attachments/assets/e6836560-8b05-4382-b761-d7624da4b0f1'} /> ## Visual Models In the model list, models with a `👁️` icon next to their names indicate that the model supports visual recognition. Selecting such a model allows you to send image content. <Image alt={'Visual Models'} src={'https://github.com/user-attachments/assets/fa07a326-04c8-4744-bb93-cef715d1d71f'} /> ## Custom Model Configuration If you need to add a custom model that is not currently in the list and explicitly supports visual recognition, you can enable the `Visual Recognition` feature in the `Custom Model Configuration` to allow the model to interact with images. <Image alt={'Custom Model Configuration'} src={'https://github.com/user-attachments/assets/c24718cc-402b-4298-b046-8b4aee610cbc'} />