# Gemini Adapter Reference
## Package
```
@tanstack/ai-gemini
```
## Adapter Factories
| Factory | Type | Description |
| ----------------- | --------- | ----------------------------- |
| `geminiText` | Text/Chat | Chat completions |
| `geminiImage` | Image | Image generation (Imagen) |
| `geminiSpeech` | TTS | Text-to-speech (experimental) |
| `geminiSummarize` | Summarize | Text summarization |
## Import
```typescript
import { geminiImage, geminiText } from '@tanstack/ai-gemini'
```
## Key Chat Models
| Model | Max Input | Max Output | Notes |
| ------------------------------- | --------- | ---------- | ---------------------------- |
| `gemini-3.1-pro-preview` | 1M | 65K | Latest flagship, thinking |
| `gemini-3-pro-preview` | 1M | 65K | Previous flagship |
| `gemini-3-flash-preview` | 1M | 65K | Fast, thinking, multimodal |
| `gemini-3.1-flash-lite-preview` | 1M | 65K | Budget, still capable |
| `gemini-2.5-pro` | 1M | 65K | Stable release, all features |
| `gemini-2.5-flash` | 1M | 65K | Fast stable release |
All Gemini text models accept `text`, `image`, `audio`, `video`, and `document` input.
## Provider-Specific modelOptions
```typescript
import { chat } from '@tanstack/ai'
import { geminiText } from '@tanstack/ai-gemini'

chat({
  adapter: geminiText('gemini-2.5-pro'),
  messages, // your existing conversation history
  modelOptions: {
    // Thinking (budget-based)
    thinkingConfig: {
      includeThoughts: true,
      thinkingBudget: 4096,
    },
    // Thinking (level-based, advanced models). Use this INSTEAD of the
    // budget-based form above: an object literal cannot repeat a key.
    // thinkingConfig: {
    //   thinkingLevel: 'THINKING_LEVEL_HIGH',
    // },
    // Safety settings
    safetySettings: [
      {
        category: 'HARM_CATEGORY_HATE_SPEECH',
        threshold: 'BLOCK_MEDIUM_AND_ABOVE',
      },
    ],
    // Tool config
    toolConfig: {
      /* ToolConfig */
    },
    // Structured output
    responseMimeType: 'application/json',
    responseSchema: {
      /* Schema */
    },
    // Cached content (must be `cachedContents/{id}`)
    cachedContent: 'cachedContents/abc123',
    // Response modalities
    responseModalities: ['TEXT'],
    // Sampling
    topK: 40,
    seed: 42,
    presencePenalty: 0.5,
    frequencyPenalty: 0.5,
    candidateCount: 1,
    stopSequences: ['END'],
  },
})
```
## Environment Variable
```
GOOGLE_API_KEY (preferred)
GEMINI_API_KEY (also accepted)
```
The adapter checks `GOOGLE_API_KEY` first, then falls back to `GEMINI_API_KEY`.
Note: `GOOGLE_GENAI_API_KEY` does NOT work.
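The lookup order described above can be sketched as a small helper. This is an illustration only: `resolveGeminiApiKey` and the `Env` type are hypothetical names, not part of `@tanstack/ai-gemini`, and treating an empty string as unset is an assumption rather than documented adapter behavior.

```typescript
// Hypothetical helper that mirrors the documented lookup order.
// It is NOT the adapter's actual source code.
type Env = Record<string, string | undefined>

function resolveGeminiApiKey(env: Env): string {
  // GOOGLE_API_KEY is preferred; GEMINI_API_KEY is the fallback.
  // Assumption: an empty string counts as "not set".
  const key = env.GOOGLE_API_KEY || env.GEMINI_API_KEY
  if (!key) {
    // GOOGLE_GENAI_API_KEY is deliberately never read; only mention it
    // in the error so a misnamed variable is easy to spot.
    const hint = env.GOOGLE_GENAI_API_KEY
      ? ' (GOOGLE_GENAI_API_KEY is set, but it is not read)'
      : ''
    throw new Error(`Set GOOGLE_API_KEY or GEMINI_API_KEY${hint}`)
  }
  return key
}
```

In a Node.js process this would typically be called as `resolveGeminiApiKey(process.env)` at startup, so a misnamed variable fails fast instead of surfacing later as an authentication error.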
## Gotchas
- All Gemini text models are multimodal (text, image, audio, video, document input).
- Image generation models (`gemini-3-pro-image-preview`, etc.) have a smaller
input limit (65K tokens) than text models (1M tokens).
- `thinkingConfig.thinkingLevel` (level-based) and `thinkingConfig.thinkingBudget`
(budget-based) apply to different models. Set only the one your model supports.
- `cachedContent` must follow the format `cachedContents/{id}`.
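
The last two gotchas can be guarded against before a request is sent. The sketch below is illustrative: `ThinkingConfig` and `isCachedContentName` are hypothetical names, not exports of `@tanstack/ai-gemini`, and allowing `includeThoughts` alongside either thinking style is an assumption.

```typescript
// Hypothetical type: exactly one thinking style per request.
// Assumption: includeThoughts may accompany either style.
type ThinkingConfig =
  | { includeThoughts?: boolean; thinkingBudget: number; thinkingLevel?: never }
  | { includeThoughts?: boolean; thinkingLevel: string; thinkingBudget?: never }

// Hypothetical check: true when the value has the shape
// `cachedContents/{id}` with a non-empty id and no further path segments.
function isCachedContentName(value: string): boolean {
  return /^cachedContents\/[^/\s]+$/.test(value)
}

// Both of these type-check; mixing thinkingBudget and thinkingLevel
// in one object would be rejected by the compiler.
const budgetBased: ThinkingConfig = { includeThoughts: true, thinkingBudget: 4096 }
const levelBased: ThinkingConfig = { thinkingLevel: 'THINKING_LEVEL_HIGH' }
```

The union type moves the "budget or level, not both" rule to compile time, while the format check catches a bare cache id (for example `abc123`) before it reaches the API.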