# LanguageModelV1 doGenerate Method Implementation Summary

## Overview

The `doGenerate` method is the core non-streaming generation method that all Language Model V1 providers must implement. It is responsible for taking a standardized prompt and options, calling the underlying model API, and returning a standardized result.

## Key Interfaces and Types

### 1. LanguageModelV1 Interface

The main interface that providers must implement:

```typescript
export type LanguageModelV1 = {
  readonly specificationVersion: 'v1';
  readonly provider: string;
  readonly modelId: string;
  readonly defaultObjectGenerationMode: 'json' | 'tool' | undefined;
  readonly supportsImageUrls?: boolean;
  readonly supportsStructuredOutputs?: boolean;

  doGenerate(options: LanguageModelV1CallOptions): PromiseLike<{
    text?: string;
    reasoning?: string | Array<...>;
    files?: Array<{ data: string | Uint8Array; mimeType: string }>;
    toolCalls?: Array<LanguageModelV1FunctionToolCall>;
    finishReason: LanguageModelV1FinishReason;
    usage: { promptTokens: number; completionTokens: number };
    rawCall: { rawPrompt: unknown; rawSettings: Record<string, unknown> };
    rawResponse?: { headers?: Record<string, string>; body?: unknown };
    request?: { body?: string };
    response?: { id?: string; timestamp?: Date; modelId?: string };
    warnings?: LanguageModelV1CallWarning[];
    providerMetadata?: LanguageModelV1ProviderMetadata;
    sources?: LanguageModelV1Source[];
    logprobs?: LanguageModelV1LogProbs;
  }>;

  doStream(options: LanguageModelV1CallOptions): PromiseLike<{
    stream: ReadableStream<LanguageModelV1StreamPart>;
    // ... other properties
  }>;
};
```

### 2. LanguageModelV1CallOptions

The options passed to `doGenerate`:

```typescript
export type LanguageModelV1CallOptions = LanguageModelV1CallSettings & {
  inputFormat: 'messages' | 'prompt';
  mode:
    | {
        type: 'regular';
        tools?: Array<LanguageModelV1FunctionTool | LanguageModelV1ProviderDefinedTool>;
        toolChoice?: LanguageModelV1ToolChoice;
      }
    | {
        type: 'object-json';
        schema?: JSONSchema7;
        name?: string;
        description?: string;
      }
    | {
        type: 'object-tool';
        tool: LanguageModelV1FunctionTool;
      };
  prompt: LanguageModelV1Prompt;
  providerMetadata?: LanguageModelV1ProviderMetadata;
};
```

### 3. LanguageModelV1CallSettings

Common generation settings:

```typescript
export type LanguageModelV1CallSettings = {
  maxTokens?: number;
  temperature?: number;
  stopSequences?: string[];
  topP?: number;
  topK?: number;
  presencePenalty?: number;
  frequencyPenalty?: number;
  responseFormat?:
    | { type: 'text' }
    | {
        type: 'json';
        schema?: JSONSchema7;
        name?: string;
        description?: string;
      };
  seed?: number;
  abortSignal?: AbortSignal;
  headers?: Record<string, string | undefined>;
};
```

### 4. LanguageModelV1Prompt

The standardized prompt format:

```typescript
export type LanguageModelV1Prompt = Array<LanguageModelV1Message>;

export type LanguageModelV1Message =
  | {
      role: 'system';
      content: string;
    }
  | {
      role: 'user';
      content: Array<
        | LanguageModelV1TextPart
        | LanguageModelV1ImagePart
        | LanguageModelV1FilePart
      >;
    }
  | {
      role: 'assistant';
      content: Array<
        | LanguageModelV1TextPart
        | LanguageModelV1FilePart
        | LanguageModelV1ReasoningPart
        | LanguageModelV1RedactedReasoningPart
        | LanguageModelV1ToolCallPart
      >;
    }
  | {
      role: 'tool';
      content: Array<LanguageModelV1ToolResultPart>;
    };
```
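To make the prompt shape concrete, here is a small illustrative value in this format. The conversation content and image URL are invented for the example, and the import assumes the types above are available from `@ai-sdk/provider`.

```typescript
import type { LanguageModelV1Prompt } from '@ai-sdk/provider';

// Illustrative prompt: a system message, a multimodal user message,
// and a prior assistant turn. All content here is made up.
const examplePrompt: LanguageModelV1Prompt = [
  { role: 'system', content: 'You are a concise assistant.' },
  {
    role: 'user',
    content: [
      { type: 'text', text: 'What is in this image?' },
      {
        type: 'image',
        image: new URL('https://example.com/cat.png'), // a Uint8Array is also accepted
        mimeType: 'image/png',
      },
    ],
  },
  {
    role: 'assistant',
    content: [{ type: 'text', text: 'It appears to be a cat on a windowsill.' }],
  },
];
```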
### 5. Content Part Types

#### Text Part

```typescript
interface LanguageModelV1TextPart {
  type: 'text';
  text: string;
  providerMetadata?: LanguageModelV1ProviderMetadata;
}
```

#### Image Part

```typescript
interface LanguageModelV1ImagePart {
  type: 'image';
  image: Uint8Array | URL;
  mimeType?: string;
  providerMetadata?: LanguageModelV1ProviderMetadata;
}
```

#### Tool Call Part

```typescript
interface LanguageModelV1ToolCallPart {
  type: 'tool-call';
  toolCallId: string;
  toolName: string;
  args: unknown;
  providerMetadata?: LanguageModelV1ProviderMetadata;
}
```

#### Tool Result Part

```typescript
interface LanguageModelV1ToolResultPart {
  type: 'tool-result';
  toolCallId: string;
  toolName: string;
  result: unknown;
  isError?: boolean;
  content?: Array<
    | { type: 'text'; text: string }
    | { type: 'image'; data: string; mimeType?: string }
  >;
  providerMetadata?: LanguageModelV1ProviderMetadata;
}
```

### 6. Tool-Related Types

#### Function Tool Definition

```typescript
export type LanguageModelV1FunctionTool = {
  type: 'function';
  name: string;
  description?: string;
  parameters: JSONSchema7;
};
```

#### Tool Call Result

```typescript
export type LanguageModelV1FunctionToolCall = {
  toolCallType: 'function';
  toolCallId: string;
  toolName: string;
  args: string; // Stringified JSON
};
```

#### Tool Choice

```typescript
export type LanguageModelV1ToolChoice =
  | { type: 'auto' }
  | { type: 'none' }
  | { type: 'required' }
  | { type: 'tool'; toolName: string };
```

### 7. Result Types

#### Finish Reason

```typescript
export type LanguageModelV1FinishReason =
  | 'stop' // model generated stop sequence
  | 'length' // model generated maximum number of tokens
  | 'content-filter' // content filter violation stopped the model
  | 'tool-calls' // model triggered tool calls
  | 'error' // model stopped because of an error
  | 'other' // model stopped for other reasons
  | 'unknown'; // the model has not transmitted a finish reason
```

#### Call Warning

```typescript
export type LanguageModelV1CallWarning =
  | {
      type: 'unsupported-setting';
      setting:
        | 'temperature'
        | 'maxTokens'
        | 'topP'
        | 'topK'
        | 'presencePenalty'
        | 'frequencyPenalty'
        | 'stopSequences'
        | 'seed';
      details?: string;
    }
  | {
      type: 'other';
      message: string;
    };
```

## Implementation Pattern

Based on the Claude Code provider example, here's the typical implementation pattern (a condensed code sketch follows the list):

1. **Parse and validate options**
   - Extract settings from `LanguageModelV1CallOptions`
   - Validate model parameters
   - Generate warnings for unsupported settings
2. **Convert prompt to provider format**
   - Transform `LanguageModelV1Prompt` to the provider-specific format
   - Handle different message roles and content types
   - Process multimodal content (images, files)
3. **Call the underlying API**
   - Use the provider SDK/API with the converted prompt
   - Handle abort signals
   - Manage authentication and errors
4. **Process the response**
   - Extract text, tool calls, and other content
   - Calculate token usage
   - Determine the finish reason
   - For object-json mode, extract and validate JSON
5. **Return standardized result**
   - Include all required fields (text, usage, finishReason, rawCall)
   - Add optional fields as available (toolCalls, warnings, providerMetadata)
   - Provide debugging information (rawResponse, request)
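Putting the five steps together, here is a minimal, non-authoritative sketch of what a `doGenerate` implementation can look like. The helpers `convertToProviderPrompt`, `callProviderApi`, and `mapFinishReason` are hypothetical placeholders for provider-specific logic (they are not part of the AI SDK), and tool calls, object modes, and multimodal content are omitted for brevity.

```typescript
import type {
  LanguageModelV1CallOptions,
  LanguageModelV1CallWarning,
  LanguageModelV1FinishReason,
  LanguageModelV1Prompt,
} from '@ai-sdk/provider';

// Hypothetical provider-specific helpers; names and signatures are placeholders.
declare function convertToProviderPrompt(prompt: LanguageModelV1Prompt): unknown;
declare function callProviderApi(request: {
  prompt: unknown;
  maxTokens?: number;
  temperature?: number;
  abortSignal?: AbortSignal;
}): Promise<{
  text: string;
  inputTokens: number;
  outputTokens: number;
  stopReason: string;
}>;
declare function mapFinishReason(stopReason: string): LanguageModelV1FinishReason;

// Condensed sketch of the five-step pattern; a real provider would also handle
// tools, the object-json/object-tool modes, multimodal content, and error mapping.
async function doGenerate(options: LanguageModelV1CallOptions) {
  // 1. Parse and validate options, collecting warnings for unsupported settings.
  const warnings: LanguageModelV1CallWarning[] = [];
  if (options.seed != null) {
    warnings.push({ type: 'unsupported-setting', setting: 'seed' });
  }

  // 2. Convert the standardized prompt to the provider format.
  const providerPrompt = convertToProviderPrompt(options.prompt);

  // 3. Call the underlying API, forwarding the abort signal for cancellation.
  const response = await callProviderApi({
    prompt: providerPrompt,
    maxTokens: options.maxTokens,
    temperature: options.temperature,
    abortSignal: options.abortSignal,
  });

  // 4. Process the response and 5. return the standardized result,
  //    including rawCall for debugging and observability.
  return {
    text: response.text,
    finishReason: mapFinishReason(response.stopReason),
    usage: {
      promptTokens: response.inputTokens,
      completionTokens: response.outputTokens,
    },
    rawCall: {
      rawPrompt: providerPrompt,
      rawSettings: {
        maxTokens: options.maxTokens,
        temperature: options.temperature,
      },
    },
    warnings,
  };
}
```

Keeping prompt conversion and the API call in separate helpers also mirrors how the streaming path (`doStream`) can reuse the same conversion logic.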
## Key Considerations

1. **Error Handling**: Use `@ai-sdk/provider` error types like `APICallError`, `NoSuchModelError`, and `LoadAPIKeyError`
2. **Abort Signal**: Properly handle `options.abortSignal` for cancellation
3. **Mode Handling**:
   - `regular`: Standard text generation with optional tools
   - `object-json`: JSON generation mode (extract JSON from the response; see the sketch at the end of this summary)
   - `object-tool`: Tool-based object generation
4. **Warnings**: Generate warnings for unsupported parameters or validation issues
5. **Provider Metadata**: Pass through provider-specific data that doesn't fit standard fields
6. **Raw Data**: Include raw prompt/settings in `rawCall` for debugging and observability

This summary provides the essential types and patterns needed to implement a compliant `doGenerate` method for the Vercel AI SDK Language Model V1 interface.
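As a closing illustration of the `object-json` handling mentioned under Mode Handling, a provider that only receives free-form text back may need to pull the JSON payload out of the response before returning it. The sketch below is a rough heuristic under the assumption that the model wraps a single JSON object in prose or code fences; it is not prescribed by the spec.

```typescript
// Rough heuristic for object-json mode: take the substring between the first
// "{" and the last "}" and validate it with JSON.parse before returning it as
// the `text` field of the doGenerate result. Arrays and multiple objects are
// not handled; a real provider would need a more careful extractor.
function extractJsonText(responseText: string): string {
  const start = responseText.indexOf('{');
  const end = responseText.lastIndexOf('}');
  if (start === -1 || end === -1 || end < start) {
    throw new Error('No JSON object found in model response');
  }
  const candidate = responseText.slice(start, end + 1);
  JSON.parse(candidate); // throws on malformed JSON so the caller can surface a useful error
  return candidate;
}
```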