# Lobe Chat API Client-Server Interaction Logic

This document explains the implementation logic of Lobe Chat API in client-server interactions, including event sequences and core components involved.

## Interaction Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Frontend Client
    participant ChatService as Frontend ChatService
    participant ChatAPI as Backend Chat API
    participant AgentRuntime as AgentRuntime
    participant ModelProvider as Model Provider API
    participant PluginGateway as Plugin Gateway

    Client->>ChatService: Call createAssistantMessage
    Note over ChatService: Process messages, tools, and parameters
    ChatService->>ChatService: Call getChatCompletion
    Note over ChatService: Prepare request parameters
    ChatService->>ChatAPI: Send POST request to /webapi/chat/[provider]
    ChatAPI->>AgentRuntime: Initialize AgentRuntime
    Note over AgentRuntime: Create runtime with provider and user config
    ChatAPI->>AgentRuntime: Call chat method
    AgentRuntime->>ModelProvider: Send chat completion request
    ModelProvider-->>AgentRuntime: Return streaming response
    AgentRuntime-->>ChatAPI: Process response and return stream
    ChatAPI-->>ChatService: Stream back SSE response
    ChatService->>ChatService: Handle streaming response with fetchSSE
    Note over ChatService: Process event stream with fetchEventSource

    loop For each data chunk
        ChatService->>ChatService: Handle different event types (text, tool_calls, reasoning, etc.)
        ChatService-->>Client: Return current chunk via onMessageHandle callback
    end

    ChatService-->>Client: Return complete result via onFinish callback

    Note over ChatService,ModelProvider: Plugin calling scenario
    ModelProvider-->>ChatService: Return response with tool_calls
    ChatService->>ChatService: Parse tool calls
    ChatService->>ChatService: Call runPluginApi
    ChatService->>PluginGateway: Send plugin request to gateway
    PluginGateway-->>ChatService: Return plugin execution result
    ChatService->>ModelProvider: Return plugin result to model
    ModelProvider-->>ChatService: Generate final response based on plugin result

    Note over ChatService,ModelProvider: Preset task scenario
    Client->>ChatService: Trigger preset task (e.g., translation, search)
    ChatService->>ChatService: Call fetchPresetTaskResult
    ChatService->>ChatAPI: Send preset task request
    ChatAPI-->>ChatService: Return task result
    ChatService-->>Client: Return result via callback function
```

## Main Process Steps

1. **Client Initiates Request**: The client calls the `createAssistantMessage` method of the frontend ChatService.

2. **Frontend Processes Request**:

   - `src/services/chat.ts` preprocesses messages, tools, and parameters
   - Calls `getChatCompletion` to prepare request parameters
   - Uses `src/utils/fetch/fetchSSE.ts` to send the request to the backend API

3. **Backend Processes Request**:

   - `src/app/(backend)/webapi/chat/[provider]/route.ts` receives the request
   - Initializes AgentRuntime
   - Creates the appropriate model instance based on user configuration and provider

4. **Model Call**:

   - `src/libs/agent-runtime/AgentRuntime.ts` calls the respective model provider's API
   - Returns a streaming response

5. **Process Response**:

   - The backend converts the model response to a stream and returns it
   - The frontend processes the streaming response via fetchSSE and [fetchEventSource](https://github.com/Azure/fetch-event-source)
   - Handles different types of events (text, tool calls, reasoning, etc.)
   - Passes results back to the client through callback functions
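
   For illustration, here is a minimal sketch of how a caller can drive this flow end to end. The request fields are assumptions made for the sketch; only `createAssistantMessage`, `onMessageHandle`, and `onFinish` are named in this document:

   ```ts
   // Hypothetical usage sketch; field names other than the two callbacks are assumptions.
   import { chatService } from '@/services/chat'; // assumed alias for src/services/chat.ts

   await chatService.createAssistantMessage(
     {
       messages: [{ content: 'Summarize this repository', role: 'user' }], // conversation history
       model: 'gpt-4o-mini', // target model (assumed field name)
       plugins: [], // enabled tools, if any
     },
     {
       // Invoked for each parsed SSE chunk (text, tool_calls, reasoning, ...)
       onMessageHandle: (chunk) => {
         if (chunk.type === 'text') console.log(chunk.text);
       },
       // Invoked once the stream ends, with the assembled result
       onFinish: async (finalContent) => {
         console.log('done:', finalContent);
       },
     },
   );
   ```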
6. **Plugin Calling Scenario**: When the AI model returns a `tool_calls` field in its response, it triggers the plugin calling process:

   - The AI model returns a response containing `tool_calls`, indicating a need to call tools
   - The frontend handles tool calls via the `internal_callPluginApi` method
   - Calls the `runPluginApi` method to execute plugin functionality, including retrieving plugin settings and manifest, creating authentication headers, and sending requests to the plugin gateway
   - After plugin execution completes, the result is returned to the AI model, which generates the final response based on the result

   **Real-world Examples**:

   - **Search Plugin**: When a user needs real-time information, the AI calls a web search plugin to retrieve the latest data
   - **DALL-E Plugin**: When a user requests image generation, the AI calls the DALL-E plugin to create images
   - **Midjourney Plugin**: Provides higher quality image generation capabilities by calling the Midjourney service via API
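
   Conceptually, the round trip looks like the sketch below. This is a simplified illustration: the gateway endpoint, header name, and helper functions are assumptions, not the actual `internal_callPluginApi` / `runPluginApi` code:

   ```ts
   // Simplified sketch of the tool-call round trip; names marked as assumptions
   // do not map 1:1 onto the real implementation.
   interface ToolCall {
     id: string;
     function: { arguments: string; name: string };
   }

   // Stand-ins for resolving the plugin manifest and user settings from the store
   const getPluginManifest = async (name: string) => ({ identifier: name });
   const getPluginSettings = async (name: string) => ({ apiKey: 'xxx' });

   const createAuthHeaders = (settings: object) => ({
     'X-Lobe-Plugin-Settings': JSON.stringify(settings), // illustrative header name
   });

   async function handleToolCalls(toolCalls: ToolCall[]) {
     const toolMessages: { content: string; role: 'tool'; tool_call_id: string }[] = [];

     for (const call of toolCalls) {
       // Resolve plugin metadata, then forward the call to the plugin gateway
       const manifest = await getPluginManifest(call.function.name);
       const settings = await getPluginSettings(call.function.name);

       const res = await fetch('/webapi/plugin/gateway', {
         // illustrative endpoint
         body: JSON.stringify({ args: call.function.arguments, manifest }),
         headers: { 'Content-Type': 'application/json', ...createAuthHeaders(settings) },
         method: 'POST',
       });

       // Each plugin result goes back to the model as a `tool` message
       toolMessages.push({ content: await res.text(), role: 'tool', tool_call_id: call.id });
     }

     // The caller appends these messages and issues a follow-up model request,
     // which produces the final assistant response
     return toolMessages;
   }
   ```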
7. **Preset Task Processing**: Preset tasks are predefined functions that are typically triggered when users perform specific actions (rather than being part of the regular chat flow). These tasks use the `fetchPresetTaskResult` method, which is similar to the normal chat flow but uses specially designed prompt chains.

   **Execution Timing**: Preset tasks are mainly triggered in the following scenarios:

   1. **Agent Information Auto-generation**: Triggered when users create or edit an agent
      - Agent avatar generation (via `autoPickEmoji` method)
      - Agent description generation (via `autocompleteAgentDescription` method)
      - Agent tag generation (via `autocompleteAgentTags` method)
      - Agent title generation (via `autocompleteAgentTitle` method)
   2. **Message Translation**: Triggered when users manually click the translate button (via `translateMessage` method)
   3. **Web Search**: When search is enabled but the model doesn't support tool calling, search functionality is implemented via `fetchPresetTaskResult`

   **Code Examples**:

   Agent avatar auto-generation implementation:

   ```ts
   // src/features/AgentSetting/store/action.ts
   autoPickEmoji: async () => {
     const { config, meta, dispatchMeta } = get();

     const systemRole = config.systemRole;

     chatService.fetchPresetTaskResult({
       onFinish: async (emoji) => {
         dispatchMeta({ type: 'update', value: { avatar: emoji } });
       },
       onLoadingChange: (loading) => {
         get().updateLoadingState('avatar', loading);
       },
       params: merge(
         get().internal_getSystemAgentForMeta(),
         chainPickEmoji([meta.title, meta.description, systemRole].filter(Boolean).join(',')),
       ),
       trace: get().getCurrentTracePayload({ traceName: TraceNameMap.EmojiPicker }),
     });
   };
   ```

   Translation feature implementation:

   ```ts
   // src/store/chat/slices/translate/action.ts
   translateMessage: async (id, targetLang) => {
     // ...omitted code...

     // Detect language
     chatService.fetchPresetTaskResult({
       onFinish: async (data) => {
         if (data && supportLocales.includes(data)) from = data;

         await updateMessageTranslate(id, { content, from, to: targetLang });
       },
       params: merge(translationSetting, chainLangDetect(message.content)),
       trace: get().getCurrentTracePayload({ traceName: TraceNameMap.LanguageDetect }),
     });

     // Perform translation
     chatService.fetchPresetTaskResult({
       onMessageHandle: (chunk) => {
         if (chunk.type === 'text') {
           content = chunk.text;
           internal_dispatchMessage({
             id,
             type: 'updateMessageTranslate',
             value: { content, from, to: targetLang },
           });
         }
       },
       onFinish: async () => {
         await updateMessageTranslate(id, { content, from, to: targetLang });
         internal_toggleChatLoading(false, id, n('translateMessage(end)', { id }) as string);
       },
       params: merge(translationSetting, chainTranslate(message.content, targetLang)),
       trace: get().getCurrentTracePayload({ traceName: TraceNameMap.Translation }),
     });
   };
   ```

8. **Completion**:

   - When the stream ends, the `onFinish` callback is called, providing the complete response result

## AgentRuntime Overview

AgentRuntime is a core abstraction layer in Lobe Chat that encapsulates a unified interface for interacting with different AI model providers. Its main responsibilities and features include:

1. **Unified Abstraction Layer**: AgentRuntime provides a unified interface that hides the implementation details and differences between various AI provider APIs (such as OpenAI, Anthropic, Bedrock, etc.).

2. **Model Initialization**: Through the static `initializeWithProvider` method, it initializes the corresponding runtime instance based on the specified provider and configuration parameters.

3. **Capability Encapsulation**:

   - `chat` method: Handles streaming chat requests
   - `models` method: Retrieves model lists
   - Supports text embedding, text-to-image, text-to-speech, and other functionality (if supported by the model provider)

4. **Plugin Architecture**: Through the `src/libs/agent-runtime/runtimeMap.ts` mapping table, it implements an extensible plugin architecture, making it easy to add new model providers. It currently supports over 40 different model providers:

   ```ts
   export const providerRuntimeMap = {
     openai: LobeOpenAI,
     anthropic: LobeAnthropicAI,
     google: LobeGoogleAI,
     azure: LobeAzureOpenAI,
     bedrock: LobeBedrockAI,
     ollama: LobeOllamaAI,
     // ...over 40 other model providers
   };
   ```

5. **Adapter Pattern**: Internally, it uses the adapter pattern to adapt different provider APIs to the unified `src/libs/agent-runtime/BaseAI.ts` interface:

   ```ts
   export interface LobeRuntimeAI {
     baseURL?: string;
     chat(payload: ChatStreamPayload, options?: ChatCompetitionOptions): Promise<Response>;

     embeddings?(payload: EmbeddingsPayload, options?: EmbeddingsOptions): Promise<Embeddings[]>;

     models?(): Promise<any>;

     textToImage?: (payload: TextToImagePayload) => Promise<string[]>;

     textToSpeech?: (
       payload: TextToSpeechPayload,
       options?: TextToSpeechOptions,
     ) => Promise<ArrayBuffer>;
   }
   ```
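
Taken together, the backend chat route mainly has to pick a provider and delegate to the runtime. The following is a minimal usage sketch, not the actual route code: the import alias and configuration fields are assumptions, and only `initializeWithProvider` and `chat` are named in this document:

```ts
// Hypothetical sketch of a backend chat route built on AgentRuntime.
import { AgentRuntime } from '@/libs/agent-runtime'; // assumed alias for src/libs/agent-runtime

export const POST = async (req: Request) => {
  const payload = await req.json(); // messages, model, temperature, ...

  // Create a runtime instance for the requested provider with the user's credentials
  const runtime = await AgentRuntime.initializeWithProvider('openai', {
    apiKey: process.env.OPENAI_API_KEY,
  });

  // Delegate the call; per the interface above, `chat` resolves to a streaming
  // `Response` that can be returned to the client as SSE
  return runtime.chat(payload);
};
```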
**Adapter Implementation Examples**:

1. **OpenRouter Adapter**: OpenRouter is a unified API that allows access to AI models from multiple providers. Lobe Chat implements support for OpenRouter through an adapter:

   ```ts
   // OpenRouter adapter implementation
   class LobeOpenRouterAI implements LobeRuntimeAI {
     client: OpenAI;
     baseURL: string;

     constructor(options: OpenAICompatibleOptions) {
       // Initialize the OpenRouter client using its OpenAI-compatible API
       this.client = new OpenAI({
         apiKey: options.apiKey,
         baseURL: OPENROUTER_BASE_URL,
         defaultHeaders: {
           'HTTP-Referer': 'https://github.com/lobehub/lobe-chat',
           'X-Title': 'LobeChat',
         },
       });
       this.baseURL = OPENROUTER_BASE_URL;
     }

     // Implement chat functionality
     async chat(payload: ChatCompletionCreateParamsBase, options?: RequestOptions) {
       // Convert the Lobe Chat request format to the OpenRouter format
       // Handle model mapping, message format, etc.
       return this.client.chat.completions.create(
         {
           ...payload,
           model: payload.model || 'openai/gpt-4-turbo', // Default model
         },
         options,
       );
     }

     // Implement other LobeRuntimeAI interface methods
   }
   ```

2. **Google Gemini Adapter**: Gemini is Google's large language model. Lobe Chat supports the Gemini series of models through a dedicated adapter:

   ```ts
   import { GoogleGenerativeAI } from '@google/generative-ai';

   // Gemini adapter implementation
   class LobeGoogleAI implements LobeRuntimeAI {
     client: GoogleGenerativeAI;
     baseURL: string;
     apiKey: string;

     constructor(options: GoogleAIOptions) {
       // Initialize the Google Generative AI client
       this.client = new GoogleGenerativeAI(options.apiKey);
       this.apiKey = options.apiKey;
       this.baseURL = options.baseURL || GOOGLE_AI_BASE_URL;
     }

     // Implement chat functionality
     async chat(payload: ChatCompletionCreateParamsBase, options?: RequestOptions) {
       // Select the appropriate model (supports Gemini Pro, Gemini Flash, etc.)
       const modelName = payload.model || 'gemini-pro';
       const model = this.client.getGenerativeModel({ model: modelName });

       // Process multimodal inputs (e.g., images)
       const contents = this.processMessages(payload.messages);

       // Set generation parameters
       const generationConfig = {
         temperature: payload.temperature,
         topK: payload.top_k,
         topP: payload.top_p,
         maxOutputTokens: payload.max_tokens,
       };

       // Create a chat session and get the response
       const chat = model.startChat({
         generationConfig,
         history: contents.slice(0, -1),
         safetySettings: this.getSafetySettings(payload),
       });

       // Handle the streaming response
       return this.handleStreamResponse(chat, contents, options?.signal);
     }

     // Implement other processing methods
     private processMessages(messages) {
       /* ... */
     }
     private getSafetySettings(payload) {
       /* ... */
     }
     private handleStreamResponse(chat, contents, signal) {
       /* ... */
     }
   }
   ```

**Different Model Implementations**:

- `src/libs/agent-runtime/openai/index.ts` - OpenAI implementation
- `src/libs/agent-runtime/anthropic/index.ts` - Anthropic implementation
- `src/libs/agent-runtime/google/index.ts` - Google implementation
- `src/libs/agent-runtime/openrouter/index.ts` - OpenRouter implementation

For detailed implementation, see:

- `src/libs/agent-runtime/AgentRuntime.ts` - Core runtime class
- `src/libs/agent-runtime/BaseAI.ts` - Defines the base interface
- `src/libs/agent-runtime/runtimeMap.ts` - Provider mapping table
- `src/libs/agent-runtime/UniformRuntime/index.ts` - Unified runtime for handling multiple models
- `src/libs/agent-runtime/utils/openaiCompatibleFactory/index.ts` - OpenAI-compatible adapter factory (see the sketch below)
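
As a rough illustration of the last item above, the sketch below shows the general idea of such a factory. The name `createOpenAICompatibleRuntime` and its option shape are hypothetical, not the file's actual exports:

```ts
// Conceptual sketch of an OpenAI-compatible runtime factory (hypothetical API).
import OpenAI from 'openai';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';

interface CompatibleFactoryOptions {
  baseURL: string;
  defaultHeaders?: Record<string, string>;
}

// Builds a runtime class whose `chat` method talks to any OpenAI-compatible endpoint
const createOpenAICompatibleRuntime = ({ baseURL, defaultHeaders }: CompatibleFactoryOptions) =>
  class {
    client: OpenAI;

    constructor({ apiKey }: { apiKey: string }) {
      this.client = new OpenAI({ apiKey, baseURL, defaultHeaders });
    }

    // Issue a streaming chat completion request against the provider
    async chat(payload: { messages: ChatCompletionMessageParam[]; model: string }) {
      return this.client.chat.completions.create({ ...payload, stream: true });
    }
  };

// A new OpenAI-compatible provider then only needs its base URL and extra headers,
// and the resulting class could be registered in providerRuntimeMap:
const LobeExampleAI = createOpenAICompatibleRuntime({
  baseURL: 'https://api.example.com/v1', // hypothetical provider endpoint
});
```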