# Adding New Image Models

> Learn more about the AI image generation modal design in the [AI Image Generation Modal Design Discussion](https://github.com/lobehub/lobe-chat/discussions/7442)

## Parameter Standardization

All image generation models must use the standard parameters defined in `src/libs/standard-parameters/index.ts`. This ensures parameter consistency across different Providers, creating a more unified user experience.

**Supported Standard Parameters**:

- `prompt` (required): Text prompt for image generation
- `aspectRatio`: Aspect ratio (e.g., "16:9", "1:1")
- `width` / `height`: Image dimensions
- `size`: Preset dimensions (e.g., "1024x1024")
- `seed`: Random seed
- `steps`: Generation steps
- `cfg`: Guidance scale
- For other parameters, please check the source file

## OpenAI Compatible Models

These models can be requested using the OpenAI SDK, with request parameters and return values consistent with DALL-E and GPT-Image-X series.

Taking Zhipu's CogView-4 as an example, which is an OpenAI-compatible model, you can add it by adding the model configuration in the corresponding AI models file `src/config/aiModels/zhipu.ts`:

```ts
const zhipuImageModels: AIImageModelCard[] = [
  // Add model configuration
  // https://bigmodel.cn/dev/howuse/image-generation-model/cogview-4
  {
    description:
      'CogView-4 is the first open-source text-to-image model from Zhipu that supports Chinese character generation, with comprehensive improvements in semantic understanding, image generation quality, and Chinese-English text generation capabilities.',
    displayName: 'CogView-4',
    enabled: true,
    id: 'cogview-4',
    parameters: {
      prompt: {
        default: '',
      },
      size: {
        default: '1024x1024',
        enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
      },
    },
    releasedAt: '2025-03-04',
    type: 'image',
  },
];
```

## Non-OpenAI Compatible Models

For image generation models that are not compatible with OpenAI format, you need to implement a custom `createImage` method. There are two main implementation approaches:

### Method 1: Using OpenAI Compatible Factory

Most Providers use `openaiCompatibleFactory` for OpenAI compatibility. You can pass in a custom `createImage` function (reference [PR #8534](https://github.com/lobehub/lobe-chat/pull/8534)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation to understand request and response formats
   - Read `src/libs/standard-parameters/index.ts` to understand supported parameters
   - Add image model configuration in the corresponding AI models file

2. **Implement custom createImage method**
   - Create a standalone image generation function that accepts standard parameters
   - Convert standard parameters to Provider-specific format
   - Call the Provider's image generation API
   - Return a unified response format (imageUrl and optional width/height)

3. **Add tests**
   - Write unit tests covering success scenarios
   - Test various error cases and edge conditions

**Code Example**:

```ts
// src/libs/model-runtime/provider-name/createImage.ts
export const createProviderImage = async (
  payload: ImageGenerationPayload,
  options: any,
): Promise<ImageGenerationResponse> => {
  const { model, prompt, ...params } = payload;

  // Call Provider's native API
  const result = await callProviderAPI({
    model,
    prompt,
    // Convert parameter format
    custom_param: params.width,
    // ...
  });

  // Return unified format
  return {
    created: Date.now(),
    data: [{ url: result.imageUrl }],
  };
};
```

```ts
// src/libs/model-runtime/provider-name/index.ts
export const LobeProviderAI = openaiCompatibleFactory({
  constructorOptions: {
    // ... other configurations
  },
  createImage: createProviderImage, // Pass custom implementation
  provider: ModelProvider.ProviderName,
});
```

### Method 2: Direct Implementation in Provider Class

If your Provider has an independent class implementation, you can directly add the `createImage` method in the class (reference [PR #8503](https://github.com/lobehub/lobe-chat/pull/8503)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation
   - Read `src/libs/standard-parameters/index.ts`
   - Add image model configuration in the corresponding AI models file

2. **Implement createImage method in Provider class**
   - Add the `createImage` method directly in the class
   - Handle parameter conversion and API calls
   - Return a unified response format

3. **Add tests**
   - Write comprehensive test cases for the new method

**Code Example**:

```ts
// src/libs/model-runtime/provider-name/index.ts
export class LobeProviderAI {
  async createImage(
    payload: ImageGenerationPayload,
    options?: ChatStreamCallbacks,
  ): Promise<ImageGenerationResponse> {
    const { model, prompt, ...params } = payload;

    // Call native API and handle response
    const result = await this.client.generateImage({
      model,
      prompt,
      // Parameter conversion
    });

    return {
      created: Date.now(),
      data: [{ url: result.url }],
    };
  }
}
```

### Important Notes

- **Testing Requirements**: Add comprehensive unit tests for custom implementations, ensuring coverage of success scenarios and various error cases
- **Error Handling**: Use `AgentRuntimeError` consistently for error wrapping to maintain error message consistency
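
The notes above can be illustrated with a minimal, self-contained sketch of a custom `createImage` that keeps parameter conversion separate from the API call, so both the success path and the error path are easy to unit test. Everything here is an assumption made for illustration: `StandardImagePayload`, `ProviderRequest`, the `text` and `image_size` fields, and the `ImageGenerationError` class are hypothetical stand-ins, not types from the repository. In real code, errors would be wrapped with `AgentRuntimeError` as described above, and the payload and response types would come from the model runtime.

```ts
// Hypothetical stand-ins for the runtime's payload and response types.
interface StandardImagePayload {
  model: string;
  prompt: string;
  width?: number;
  height?: number;
  seed?: number;
}

interface ImageResponse {
  created: number;
  data: { url: string }[];
}

// Hypothetical provider-native request and response shapes.
interface ProviderRequest {
  model: string;
  text: string;
  image_size?: string;
  seed?: number;
}

interface ProviderResult {
  imageUrl?: string;
}

type ProviderCall = (request: ProviderRequest) => Promise<ProviderResult>;

// Local stand-in for AgentRuntimeError so the sketch runs on its own.
class ImageGenerationError extends Error {
  readonly provider: string;
  readonly originalError: unknown;

  constructor(provider: string, message: string, originalError?: unknown) {
    super(message);
    this.name = 'ImageGenerationError';
    this.provider = provider;
    this.originalError = originalError;
  }
}

// Convert standard parameters to the (hypothetical) provider-specific format.
export const toProviderRequest = (payload: StandardImagePayload): ProviderRequest => {
  const { model, prompt, width, height, seed } = payload;
  const request: ProviderRequest = { model, text: prompt };

  // Only send a size when both dimensions are present.
  if (width !== undefined && height !== undefined) {
    request.image_size = `${width}x${height}`;
  }
  if (seed !== undefined) request.seed = seed;

  return request;
};

// The API call is injected so tests can replace it with a stub.
export const createProviderImage = async (
  payload: StandardImagePayload,
  callProviderAPI: ProviderCall,
): Promise<ImageResponse> => {
  let result: ProviderResult;
  try {
    result = await callProviderAPI(toProviderRequest(payload));
  } catch (error) {
    // Wrap transport and provider failures in one consistent error type.
    throw new ImageGenerationError('provider-name', 'Image request failed', error);
  }

  if (!result.imageUrl) {
    throw new ImageGenerationError('provider-name', 'Response contained no image URL');
  }

  // Return the unified format.
  return { created: Date.now(), data: [{ url: result.imageUrl }] };
};
```

Passing the API call in as an argument is one possible design choice, not the repository's convention. It lets a unit test supply a stub that resolves or rejects, which covers the success scenario, the empty-response edge case, and the wrapped-error case without any network access.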