@tanstack/ai

Core TanStack AI library - Open source AI SDK

github.com/TanStack/ai

# Gemini Adapter Reference

## Package

```
@tanstack/ai-gemini
```

## Adapter Factories

| Factory           | Type      | Description                   |
| ----------------- | --------- | ----------------------------- |
| `geminiText`      | Text/Chat | Chat completions              |
| `geminiImage`     | Image     | Image generation (Imagen)     |
| `geminiSpeech`    | TTS       | Text-to-speech (experimental) |
| `geminiSummarize` | Summarize | Text summarization            |

## Import

```typescript
import { geminiText } from '@tanstack/ai-gemini'
import { geminiImage } from '@tanstack/ai-gemini'
```

## Key Chat Models

| Model                           | Max Input | Max Output | Notes                        |
| ------------------------------- | --------- | ---------- | ---------------------------- |
| `gemini-3.1-pro-preview`        | 1M        | 65K        | Latest flagship, thinking    |
| `gemini-3-pro-preview`          | 1M        | 65K        | Previous flagship            |
| `gemini-3-flash-preview`        | 1M        | 65K        | Fast, thinking, multimodal   |
| `gemini-3.1-flash-lite-preview` | 1M        | 65K        | Budget, still capable        |
| `gemini-2.5-pro`                | 1M        | 65K        | Stable release, all features |
| `gemini-2.5-flash`              | 1M        | 65K        | Fast stable release          |

All Gemini text models accept `text`, `image`, `audio`, `video`, and `document` input.
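
The limits in the Key Chat Models table can be kept next to application code as a small typed lookup. The following is a minimal sketch, not part of `@tanstack/ai-gemini`: the names `GEMINI_CHAT_MODEL_LIMITS`, `fitsInputLimit`, and `clampOutputTokens` are hypothetical, and the numbers are the rounded figures from the table (1M and 65K), not exact token counts.

```typescript
// Hypothetical lookup built from the table above. The values are the
// table's rounded figures (1M / 65K), not exact provider limits.
const GEMINI_CHAT_MODEL_LIMITS = {
  'gemini-3.1-pro-preview': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
  'gemini-3-pro-preview': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
  'gemini-3-flash-preview': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
  'gemini-3.1-flash-lite-preview': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
  'gemini-2.5-pro': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
  'gemini-2.5-flash': { maxInputTokens: 1_000_000, maxOutputTokens: 65_000 },
} as const

type GeminiChatModel = keyof typeof GEMINI_CHAT_MODEL_LIMITS

// True when an estimated prompt size is within the model's input limit.
function fitsInputLimit(model: GeminiChatModel, estimatedTokens: number): boolean {
  return estimatedTokens <= GEMINI_CHAT_MODEL_LIMITS[model].maxInputTokens
}

// Caps a requested output length at the model's output limit.
function clampOutputTokens(model: GeminiChatModel, requested: number): number {
  return Math.min(requested, GEMINI_CHAT_MODEL_LIMITS[model].maxOutputTokens)
}
```

Because `GeminiChatModel` is derived from the object's keys, a misspelled model name fails at compile time rather than at request time.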
## Provider-Specific modelOptions

```typescript
chat({
  adapter: geminiText('gemini-2.5-pro'),
  messages,
  modelOptions: {
    // Thinking (budget-based)
    thinkingConfig: {
      includeThoughts: true,
      thinkingBudget: 4096,
    },

    // Thinking (level-based, advanced models).
    // Use this instead of the budget-based form above, not alongside it:
    // an object literal cannot define `thinkingConfig` twice.
    // thinkingConfig: {
    //   thinkingLevel: 'THINKING_LEVEL_HIGH',
    // },

    // Safety settings
    safetySettings: [
      {
        category: 'HARM_CATEGORY_HATE_SPEECH',
        threshold: 'BLOCK_MEDIUM_AND_ABOVE',
      },
    ],

    // Tool config
    toolConfig: {
      /* ToolConfig */
    },

    // Structured output
    responseMimeType: 'application/json',
    responseSchema: {
      /* Schema */
    },

    // Cached content
    cachedContent: 'cachedContents/abc123',

    // Response modalities
    responseModalities: ['TEXT'],

    // Sampling
    topK: 40,
    seed: 42,
    presencePenalty: 0.5,
    frequencyPenalty: 0.5,
    candidateCount: 1,
    stopSequences: ['END'],
  },
})
```

## Environment Variable

```
GOOGLE_API_KEY    (preferred)
GEMINI_API_KEY    (also accepted)
```

The adapter checks `GOOGLE_API_KEY` first, then falls back to `GEMINI_API_KEY`. Note: `GOOGLE_GENAI_API_KEY` does NOT work.

## Gotchas

- All Gemini models are multimodal (text, image, audio, video, document input).
- Image generation models (`gemini-3-pro-image-preview`, etc.) have smaller input limits (65K tokens) compared to text models (1M tokens).
- `thinkingConfig.thinkingLevel` (level-based) and `thinkingConfig.thinkingBudget` (budget-based) apply to different models. Check which one your model supports.
- `cachedContent` must follow the format `cachedContents/{id}`.
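
The environment-variable lookup order and the `cachedContents/{id}` format described above can be checked before a request is sent. This is a minimal sketch under stated assumptions: `resolveGeminiApiKey` and `isCachedContentName` are hypothetical helpers, not exports of `@tanstack/ai-gemini`, and the handling of empty values and the allowed characters in `{id}` are guesses rather than documented adapter behavior.

```typescript
// Hypothetical pre-flight helpers; not part of @tanstack/ai-gemini.
type Env = Record<string, string | undefined>

// Lookup order stated in this reference: GOOGLE_API_KEY, then GEMINI_API_KEY.
const API_KEY_VARS = ['GOOGLE_API_KEY', 'GEMINI_API_KEY'] as const

// Returns the first variable that is set, together with its name,
// or undefined when neither is present.
function resolveGeminiApiKey(env: Env): { name: string; value: string } | undefined {
  for (const name of API_KEY_VARS) {
    const value = env[name]
    // Assumption: blank values count as unset. The real adapter may differ.
    if (value !== undefined && value.trim() !== '') {
      return { name, value }
    }
  }
  return undefined
}

// Checks the `cachedContents/{id}` shape. Assumption: the id is any
// non-empty run of characters without slashes or whitespace.
function isCachedContentName(name: string): boolean {
  return /^cachedContents\/[^/\s]+$/.test(name)
}
```

In a Node.js process, `resolveGeminiApiKey(process.env)` reports which variable would be picked up. This makes it easier to spot a key stored under the unsupported `GOOGLE_GENAI_API_KEY` name.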