@ai-sdk/google
Version:
The **[Google Generative AI provider](https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai)** for the [AI SDK](https://ai-sdk.dev/docs) contains language model support for the [Google Generative AI](https://ai.google/discover/generativeai/)
1,489 lines (1,160 loc) • 61.7 kB
text/mdx
---
title: Google Generative AI
description: Learn how to use Google Generative AI Provider.
---
# Google Generative AI Provider
The [Google Generative AI](https://ai.google.dev) provider contains language and embedding model support for
the [Google Generative AI](https://ai.google.dev/api/rest) APIs.
## Setup
The Google provider is available in the `@ai-sdk/google` module. You can install it with
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
<Tab>
<Snippet text="pnpm add @ai-sdk/google" dark />
</Tab>
<Tab>
<Snippet text="npm install @ai-sdk/google" dark />
</Tab>
<Tab>
<Snippet text="yarn add @ai-sdk/google" dark />
</Tab>
<Tab>
<Snippet text="bun add @ai-sdk/google" dark />
</Tab>
</Tabs>
## Provider Instance
You can import the default provider instance `google` from `@ai-sdk/google`:
```ts
import { google } from '@ai-sdk/google';
```
If you need a customized setup, you can import `createGoogleGenerativeAI` from `@ai-sdk/google` and create a provider instance with your settings:
```ts
import { createGoogleGenerativeAI } from '@ai-sdk/google';
const google = createGoogleGenerativeAI({
// custom settings
});
```
You can use the following optional settings to customize the Google Generative AI provider instance:
- **baseURL** _string_
Use a different URL prefix for API calls, e.g. to use proxy servers.
The default prefix is `https://generativelanguage.googleapis.com/v1beta`.
- **apiKey** _string_
API key that is being sent using the `x-goog-api-key` header.
It defaults to the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
- **headers** _Record<string,string>_
Custom headers to include in the requests.
- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_
Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
Defaults to the global `fetch` function.
You can use it as a middleware to intercept requests,
or to provide a custom fetch implementation for e.g. testing.
- **generateId** _() => string_
Optional function to generate unique IDs for each request.
Defaults to the SDK's built-in ID generator.
- **name** _string_
Custom provider name.
Defaults to `'google.generative-ai'`.
## Language Models
You can create models that call the [Google Generative AI API](https://ai.google.dev/api/rest) using the provider instance.
The first argument is the model id, e.g. `gemini-2.5-flash`.
The models support tool calls and some have multi-modal capabilities.
```ts
const model = google('gemini-2.5-flash');
```
You can use Google Generative AI language models to generate text with the `generateText` function:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text } = await generateText({
model: google('gemini-2.5-flash'),
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```
Google Generative AI language models can also be used in the `streamText` function
and support structured data generation with [`Output`](/docs/reference/ai-sdk-core/output)
(see [AI SDK Core](/docs/ai-sdk-core)).
Google Generative AI also supports some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings).
You can pass them as an options argument:
```ts
import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google';
const model = google('gemini-2.5-flash');
await generateText({
model,
providerOptions: {
google: {
safetySettings: [
{
category: 'HARM_CATEGORY_UNSPECIFIED',
threshold: 'BLOCK_LOW_AND_ABOVE',
},
],
} satisfies GoogleLanguageModelOptions,
},
});
```
The following optional provider options are available for Google Generative AI models:
- **cachedContent** _string_
Optional. The name of the cached content used as context to serve the prediction.
Format: cachedContents/\{cachedContent\}
- **structuredOutputs** _boolean_
Optional. Enable structured output. Default is true.
This is useful when the JSON Schema contains elements that are
not supported by the OpenAPI schema version that
Google Generative AI uses. You can use this to disable
structured outputs if you need to.
See [Troubleshooting: Schema Limitations](#schema-limitations) for more details.
- **safetySettings** _Array\<\{ category: string; threshold: string \}\>_
Optional. Safety settings for the model.
- **category** _string_
The category of the safety setting. Can be one of the following:
- `HARM_CATEGORY_UNSPECIFIED`
- `HARM_CATEGORY_HATE_SPEECH`
- `HARM_CATEGORY_DANGEROUS_CONTENT`
- `HARM_CATEGORY_HARASSMENT`
- `HARM_CATEGORY_SEXUALLY_EXPLICIT`
- `HARM_CATEGORY_CIVIC_INTEGRITY`
- **threshold** _string_
The threshold of the safety setting. Can be one of the following:
- `HARM_BLOCK_THRESHOLD_UNSPECIFIED`
- `BLOCK_LOW_AND_ABOVE`
- `BLOCK_MEDIUM_AND_ABOVE`
- `BLOCK_ONLY_HIGH`
- `BLOCK_NONE`
- `OFF`
- **responseModalities** _string[]_
The modalities to use for the response. The following modalities are supported: `TEXT`, `IMAGE`. When not defined or empty, the model defaults to returning only text.
- **thinkingConfig** _\{ thinkingLevel?: 'minimal' | 'low' | 'medium' | 'high'; thinkingBudget?: number; includeThoughts?: boolean \}_
Optional. Configuration for the model's thinking process. Only supported by specific [Google Generative AI models](https://ai.google.dev/gemini-api/docs/thinking).
- **thinkingLevel** _'minimal' | 'low' | 'medium' | 'high'_
Optional. Controls the thinking depth for Gemini 3 models. Gemini 3.1 Pro supports 'low', 'medium', and 'high', Gemini 3 Pro supports 'low' and 'high', while Gemini 3 Flash supports all four levels: 'minimal', 'low', 'medium', and 'high'. Only supported by Gemini 3 models.
- **thinkingBudget** _number_
Optional. Gives the model guidance on the number of thinking tokens it can use when generating a response. Setting it to 0 disables thinking, if the model supports it.
For more information about the possible value ranges for each model see [Google Generative AI thinking documentation](https://ai.google.dev/gemini-api/docs/thinking#set-budget).
<Note>
This option is for Gemini 2.5 models. Gemini 3 models should use
`thinkingLevel` instead.
</Note>
- **includeThoughts** _boolean_
Optional. If set to true, thought summaries are returned, which are synthesized versions of the model's raw thoughts and offer insights into the model's internal reasoning process.
- **imageConfig** _\{ aspectRatio?: string, imageSize?: string \}_
Optional. Configuration for the models image generation. Only supported by specific [Google Generative AI models](https://ai.google.dev/gemini-api/docs/image-generation).
- **aspectRatio** _string_
Model defaults to generate 1:1 squares, or to matching the output image size to that of your input image. Can be one of the following:
- 1:1
- 2:3
- 3:2
- 3:4
- 4:3
- 4:5
- 5:4
- 9:16
- 16:9
- 21:9
- **imageSize** _string_
Controls the output image resolution. Defaults to 1K. Can be one of the following:
- 1K
- 2K
- 4K
- **audioTimestamp** _boolean_
Optional. Enables timestamp understanding for audio-only files.
See [Google Cloud audio understanding documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding).
- **mediaResolution** _string_
Optional. If specified, the media resolution specified will be used. Can be one of the following:
- `MEDIA_RESOLUTION_UNSPECIFIED`
- `MEDIA_RESOLUTION_LOW`
- `MEDIA_RESOLUTION_MEDIUM`
- `MEDIA_RESOLUTION_HIGH`
See [Google API MediaResolution documentation](https://ai.google.dev/api/generate-content#MediaResolution).
- **labels** _Record<string, string>_
Optional. Defines labels used in billing reports. Available on Vertex AI only.
See [Google Cloud labels documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls).
- **serviceTier** _'standard' | 'flex' | 'priority'_
Optional. The service tier to use for the request.
Set to 'flex' for 50% cheaper processing at the cost of increased latency.
Set to 'priority' for ultra-low latency at a 75-100% price premium over 'standard'.
- **threshold** _string_
Optional. Standalone threshold setting that can be used independently of `safetySettings`.
Uses the same values as the `safetySettings` threshold.
### Thinking
The Gemini 2.5 and Gemini 3 series models use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis. For more information see [Google Generative AI thinking documentation](https://ai.google.dev/gemini-api/docs/thinking).
#### Gemini 3 Models
For Gemini 3 models, use the `thinkingLevel` parameter to control the depth of reasoning:
```ts
import { google, GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText } from 'ai';
const model = google('gemini-3.1-pro-preview');
const { text, reasoning } = await generateText({
model: model,
prompt: 'What is the sum of the first 10 prime numbers?',
providerOptions: {
google: {
thinkingConfig: {
thinkingLevel: 'high',
includeThoughts: true,
},
} satisfies GoogleLanguageModelOptions,
},
});
console.log(text);
console.log(reasoning); // Reasoning summary
```
#### Gemini 2.5 Models
For Gemini 2.5 models, use the `thinkingBudget` parameter to control the number of thinking tokens:
```ts
import { google, GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText } from 'ai';
const model = google('gemini-2.5-flash');
const { text, reasoning } = await generateText({
model: model,
prompt: 'What is the sum of the first 10 prime numbers?',
providerOptions: {
google: {
thinkingConfig: {
thinkingBudget: 8192,
includeThoughts: true,
},
} satisfies GoogleLanguageModelOptions,
},
});
console.log(text);
console.log(reasoning); // Reasoning summary
```
### File Inputs
The Google Generative AI provider supports file inputs, e.g. PDF files.
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google('gemini-2.5-flash'),
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'What is an embedding model according to this document?',
},
{
type: 'file',
data: fs.readFileSync('./data/ai.pdf'),
mediaType: 'application/pdf',
},
],
},
],
});
```
You can also use YouTube URLs directly:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google('gemini-2.5-flash'),
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'Summarize this video',
},
{
type: 'file',
data: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
mediaType: 'video/mp4',
},
],
},
],
});
```
<Note>
The AI SDK will automatically download URLs if you pass them as data, except
for `https://generativelanguage.googleapis.com/v1beta/files/` and YouTube
URLs. You can use the Google Generative AI Files API to upload larger files to
that location. YouTube URLs (public or unlisted videos) are supported directly
- you can specify one YouTube video URL per request.
</Note>
See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use files in prompts.
### Cached Content
Google Generative AI supports both explicit and implicit caching to help reduce costs on repetitive content.
#### Implicit Caching
Gemini 2.5 models automatically provide cache cost savings without needing to create an explicit cache. When you send requests that share common prefixes with previous requests, you'll receive a 75% token discount on cached content.
To maximize cache hits with implicit caching:
- Keep content at the beginning of requests consistent
- Add variable content (like user questions) at the end of prompts
- Ensure requests meet minimum token requirements:
- Gemini 2.5 Flash: 1024 tokens minimum
- Gemini 2.5 Pro: 2048 tokens minimum
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
// Structure prompts with consistent content at the beginning
const baseContext =
'You are a cooking assistant with expertise in Italian cuisine. Here are 1000 lasagna recipes for reference...';
const { text: veggieLasagna } = await generateText({
model: google('gemini-2.5-pro'),
prompt: `${baseContext}\n\nWrite a vegetarian lasagna recipe for 4 people.`,
});
// Second request with same prefix - eligible for cache hit
const { text: meatLasagna, providerMetadata } = await generateText({
model: google('gemini-2.5-pro'),
prompt: `${baseContext}\n\nWrite a meat lasagna recipe for 12 people.`,
});
// Check cached token count in usage metadata
console.log('Cached tokens:', providerMetadata.google);
// e.g.
// {
// groundingMetadata: null,
// safetyRatings: null,
// usageMetadata: {
// cachedContentTokenCount: 2027,
// thoughtsTokenCount: 702,
// promptTokenCount: 2152,
// candidatesTokenCount: 710,
// totalTokenCount: 3564
// }
// }
```
<Note>
Usage metadata was added to `providerMetadata` in `@ai-sdk/google@1.2.23`. If
you are using an older version, usage metadata is available in the raw HTTP
`response` body returned as part of the return value from `generateText`.
</Note>
#### Explicit Caching
For guaranteed cost savings, you can still use explicit caching with Gemini 2.5 and 2.0 models. See the [models page](https://ai.google.dev/gemini-api/docs/models) to check if caching is supported for the used model:
```ts
import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { GoogleGenAI } from '@google/genai';
import { generateText } from 'ai';
const ai = new GoogleGenAI({
apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
});
const model = 'gemini-2.5-pro';
// Create a cache with the content you want to reuse
const cache = await ai.caches.create({
model,
config: {
contents: [
{
role: 'user',
parts: [{ text: '1000 Lasagna Recipes...' }],
},
],
ttl: '300s', // Cache expires after 5 minutes
},
});
const { text: veggieLasagnaRecipe } = await generateText({
model: google(model),
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
providerOptions: {
google: {
cachedContent: cache.name,
} satisfies GoogleLanguageModelOptions,
},
});
const { text: meatLasagnaRecipe } = await generateText({
model: google(model),
prompt: 'Write a meat lasagna recipe for 12 people.',
providerOptions: {
google: {
cachedContent: cache.name,
} satisfies GoogleLanguageModelOptions,
},
});
```
### Code Execution
With [Code Execution](https://ai.google.dev/gemini-api/docs/code-execution), certain models can generate and execute Python code to perform calculations, solve problems, or provide more accurate information.
You can enable code execution by adding the `code_execution` tool to your request.
```ts
import { google } from '@ai-sdk/google';
import { googleTools } from '@ai-sdk/google/internal';
import { generateText } from 'ai';
const { text, toolCalls, toolResults } = await generateText({
model: google('gemini-2.5-pro'),
tools: { code_execution: google.tools.codeExecution({}) },
prompt: 'Use python to calculate the 20th fibonacci number.',
});
```
The response will contain the tool calls and results from the code execution.
### Google Search
With [Google Search grounding](https://ai.google.dev/gemini-api/docs/google-search),
the model has access to the latest information using Google Search.
```ts highlight="8,17-20"
import { google } from '@ai-sdk/google';
import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources, providerMetadata } = await generateText({
model: google('gemini-2.5-flash'),
tools: {
google_search: google.tools.googleSearch({}),
},
prompt:
'List the top 5 San Francisco news from the past week.' +
'You must include the date of each article.',
});
// access the grounding metadata. Casting to the provider metadata type
// is optional but provides autocomplete and type safety.
const metadata = providerMetadata?.google as
| GoogleGenerativeAIProviderMetadata
| undefined;
const groundingMetadata = metadata?.groundingMetadata;
const safetyRatings = metadata?.safetyRatings;
```
The `googleSearch` tool accepts the following optional configuration options:
- **searchTypes** _object_
Enables specific search types. Both can be combined.
- `webSearch`: Enable web search grounding (pass `{}` to enable). This is the default.
- `imageSearch`: Enable [image search grounding](https://ai.google.dev/gemini-api/docs/image-generation#image-search) (pass `{}` to enable).
- **timeRangeFilter** _object_
Restricts search results to a specific time range. Both `startTime` and `endTime` are required.
- `startTime`: Start time in ISO 8601 format (e.g. `'2025-01-01T00:00:00Z'`).
- `endTime`: End time in ISO 8601 format (e.g. `'2025-12-31T23:59:59Z'`).
```ts
google.tools.googleSearch({
searchTypes: { webSearch: {} },
timeRangeFilter: {
startTime: '2025-01-01T00:00:00Z',
endTime: '2025-12-31T23:59:59Z',
},
});
```
When Google Search grounding is enabled, the model will include sources in the response.
Additionally, the grounding metadata includes detailed information about how search results were used to ground the model's response. Here are the available fields:
- **`webSearchQueries`** (`string[] | null`)
- Array of search queries used to retrieve information
- Example: `["What's the weather in Chicago this weekend?"]`
- **`searchEntryPoint`** (`{ renderedContent: string } | null`)
- Contains the main search result content used as an entry point
- The `renderedContent` field contains the formatted content
- **`groundingSupports`** (Array of support objects | null)
- Contains details about how specific response parts are supported by search results
- Each support object includes:
- **`segment`**: Information about the grounded text segment
- `text`: The actual text segment
- `startIndex`: Starting position in the response
- `endIndex`: Ending position in the response
- **`groundingChunkIndices`**: References to supporting search result chunks
- **`confidenceScores`**: Confidence scores (0-1) for each supporting chunk
Example response:
```json
{
"groundingMetadata": {
"webSearchQueries": ["What's the weather in Chicago this weekend?"],
"searchEntryPoint": {
"renderedContent": "..."
},
"groundingSupports": [
{
"segment": {
"startIndex": 0,
"endIndex": 65,
"text": "Chicago weather changes rapidly, so layers let you adjust easily."
},
"groundingChunkIndices": [0],
"confidenceScores": [0.99]
}
]
}
}
```
### Enterprise Web Search
With [Enterprise Web Search](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise),
the model has access to a compliance-focused web index designed for highly-regulated industries such as finance, healthcare, and public sector.
<Note>
Enterprise Web Search is only available on Vertex AI. You must use the Google
Vertex provider (`@ai-sdk/google-vertex`) instead of the standard Google
provider (`@ai-sdk/google`) to use this feature. Requires Gemini 2.0 or newer
models.
</Note>
```ts
import { createVertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const vertex = createVertex({
project: 'my-project',
location: 'us-central1',
});
const { text, sources, providerMetadata } = await generateText({
model: vertex('gemini-2.5-flash'),
tools: {
enterprise_web_search: vertex.tools.enterpriseWebSearch({}),
},
prompt: 'What are the latest regulatory updates for financial services?',
});
```
Enterprise Web Search provides the following benefits:
- Does not log customer data
- Supports VPC service controls
- Compliance-focused web index for regulated industries
### File Search
The [File Search tool](https://ai.google.dev/gemini-api/docs/file-search) lets Gemini retrieve context from your own documents that you have indexed in File Search stores. Only Gemini 2.5 and Gemini 3 models support this feature.
```ts highlight="9-13"
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources } = await generateText({
model: google('gemini-2.5-pro'),
tools: {
file_search: google.tools.fileSearch({
fileSearchStoreNames: [
'projects/my-project/locations/us/fileSearchStores/my-store',
],
metadataFilter: 'author = "Robert Graves"',
topK: 8,
}),
},
prompt: "Summarise the key themes of 'I, Claudius'.",
});
```
File Search responses include citations via the normal `sources` field and expose raw [grounding metadata](#google-search) in `providerMetadata.google.groundingMetadata`.
### URL Context
Google provides a provider-defined URL context tool.
The URL context tool allows you to provide specific URLs that you want the model to analyze directly in from the prompt.
```ts highlight="9,13-17"
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources, providerMetadata } = await generateText({
model: google('gemini-2.5-flash'),
prompt: `Based on the document: https://ai.google.dev/gemini-api/docs/url-context.
Answer this question: How many links we can consume in one request?`,
tools: {
url_context: google.tools.urlContext({}),
},
});
const metadata = providerMetadata?.google as
| GoogleGenerativeAIProviderMetadata
| undefined;
const groundingMetadata = metadata?.groundingMetadata;
const urlContextMetadata = metadata?.urlContextMetadata;
```
The URL context metadata includes detailed information about how the model used the URL context to generate the response. Here are the available fields:
- **`urlMetadata`** (`{ retrievedUrl: string; urlRetrievalStatus: string; }[] | null`)
- Array of URL context metadata
- Each object includes:
- **`retrievedUrl`**: The URL of the context
- **`urlRetrievalStatus`**: The status of the URL retrieval
Example response:
```json
{
"urlMetadata": [
{
"retrievedUrl": "https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai",
"urlRetrievalStatus": "URL_RETRIEVAL_STATUS_SUCCESS"
}
]
}
```
With the URL context tool, you will also get the `groundingMetadata`.
```json
"groundingMetadata": {
"groundingChunks": [
{
"web": {
"uri": "https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai",
"title": "Google Generative AI - AI SDK Providers"
}
}
],
"groundingSupports": [
{
"segment": {
"startIndex": 67,
"endIndex": 157,
"text": "**Installation**: Install the `@ai-sdk/google` module using your preferred package manager"
},
"groundingChunkIndices": [
0
]
},
]
}
```
<Note>You can add up to 20 URLs per request.</Note>
<Note>
The URL context tool is only supported for Gemini 2.0 Flash models and above.
Check the [supported models for URL context
tool](https://ai.google.dev/gemini-api/docs/url-context#supported-models).
</Note>
#### Combine URL Context with Search Grounding
You can combine the URL context tool with search grounding to provide the model with the latest information from the web.
```ts highlight="9-10"
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources, providerMetadata } = await generateText({
model: google('gemini-2.5-flash'),
prompt: `Based on this context: https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai, tell me how to use Gemini with AI SDK.
Also, provide the latest news about AI SDK V5.`,
tools: {
google_search: google.tools.googleSearch({}),
url_context: google.tools.urlContext({}),
},
});
const metadata = providerMetadata?.google as
| GoogleGenerativeAIProviderMetadata
| undefined;
const groundingMetadata = metadata?.groundingMetadata;
const urlContextMetadata = metadata?.urlContextMetadata;
```
### Google Maps Grounding
With [Google Maps grounding](https://ai.google.dev/gemini-api/docs/maps-grounding),
the model has access to Google Maps data for location-aware responses. This enables providing local data and geospatial context, such as finding nearby restaurants.
```ts highlight="7-16"
import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources, providerMetadata } = await generateText({
model: google('gemini-2.5-flash'),
tools: {
google_maps: google.tools.googleMaps({}),
},
providerOptions: {
google: {
retrievalConfig: {
latLng: { latitude: 34.090199, longitude: -117.881081 },
},
} satisfies GoogleLanguageModelOptions,
},
prompt:
'What are the best Italian restaurants within a 15-minute walk from here?',
});
const metadata = providerMetadata?.google as
| GoogleGenerativeAIProviderMetadata
| undefined;
const groundingMetadata = metadata?.groundingMetadata;
```
The optional `retrievalConfig.latLng` provider option provides location context for queries about nearby places. This configuration applies to any grounding tools that support location context, including Google Maps and Google Search.
When Google Maps grounding is enabled, the model's response will include sources pointing to Google Maps URLs. The grounding metadata includes `maps` chunks with place information:
```json
{
"groundingMetadata": {
"groundingChunks": [
{
"maps": {
"uri": "https://maps.google.com/?cid=12345",
"title": "Restaurant Name",
"placeId": "places/ChIJ..."
}
}
]
}
}
```
<Note>Google Maps grounding is supported on Gemini 2.0 and newer models.</Note>
### RAG Engine Grounding
With [RAG Engine Grounding](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-engine/use-vertexai-search#generate-content-using-gemini-api),
the model has access to your custom knowledge base using the Vertex RAG Engine.
This enables the model to provide answers based on your specific data sources and documents.
<Note>
RAG Engine Grounding is only supported with Vertex Gemini models. You must use
the Google Vertex provider (`@ai-sdk/google-vertex`) instead of the standard
Google provider (`@ai-sdk/google`) to use this feature.
</Note>
```ts highlight="8,17-20"
import { createVertex } from '@ai-sdk/google-vertex';
import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
import { generateText } from 'ai';
const vertex = createVertex({
project: 'my-project',
location: 'us-central1',
});
const { text, sources, providerMetadata } = await generateText({
model: vertex('gemini-2.5-flash'),
tools: {
vertex_rag_store: vertex.tools.vertexRagStore({
ragCorpus:
'projects/my-project/locations/us-central1/ragCorpora/my-rag-corpus',
topK: 5,
}),
},
prompt:
'What are the key features of our product according to our documentation?',
});
// access the grounding metadata. Casting to the provider metadata type
// is optional but provides autocomplete and type safety.
const metadata = providerMetadata?.google as
| GoogleGenerativeAIProviderMetadata
| undefined;
const groundingMetadata = metadata?.groundingMetadata;
const safetyRatings = metadata?.safetyRatings;
```
When RAG Engine Grounding is enabled, the model will include sources from your RAG corpus in the response.
Additionally, the grounding metadata includes detailed information about how RAG results were used to ground the model's response. Here are the available fields:
- **`groundingChunks`** (Array of chunk objects | null)
- Contains the retrieved context chunks from your RAG corpus
- Each chunk includes:
- **`retrievedContext`**: Information about the retrieved context
- `uri`: The URI or identifier of the source document
- `title`: The title of the source document (optional)
- `text`: The actual text content of the chunk
- **`groundingSupports`** (Array of support objects | null)
- Contains details about how specific response parts are supported by RAG results
- Each support object includes:
- **`segment`**: Information about the grounded text segment
- `text`: The actual text segment
- `startIndex`: Starting position in the response
- `endIndex`: Ending position in the response
- **`groundingChunkIndices`**: References to supporting RAG result chunks
- **`confidenceScores`**: Confidence scores (0-1) for each supporting chunk
Example response:
```json
{
"groundingMetadata": {
"groundingChunks": [
{
"retrievedContext": {
"uri": "gs://my-bucket/docs/product-guide.pdf",
"title": "Product User Guide",
"text": "Our product includes advanced AI capabilities, real-time processing, and enterprise-grade security features."
}
}
],
"groundingSupports": [
{
"segment": {
"startIndex": 0,
"endIndex": 45,
"text": "Our product includes advanced AI capabilities and real-time processing."
},
"groundingChunkIndices": [0],
"confidenceScores": [0.95]
}
]
}
}
```
#### Configuration Options
The `vertexRagStore` tool accepts the following configuration options:
- **`ragCorpus`** (`string`, required)
- The RagCorpus resource name in the format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}`
- This identifies your specific RAG corpus to search against
- **`topK`** (`number`, optional)
- The number of top contexts to retrieve from your RAG corpus
- Defaults to the corpus configuration if not specified
### Image Outputs
Gemini models with image generation capabilities (e.g. `gemini-2.5-flash-image`) support generating images as part of a multimodal response. Images are exposed as files in the response.
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google('gemini-2.5-flash-image'),
prompt:
'Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme',
});
for (const file of result.files) {
if (file.mediaType.startsWith('image/')) {
console.log('Generated image:', file);
}
}
```
<Note>
If you primarily want to generate images without text output, you can also use
Gemini image models with the `generateImage()` function. See [Gemini Image
Models](#gemini-image-models) for details.
</Note>
### Safety Ratings
The safety ratings provide insight into the safety of the model's response.
See [Google AI documentation on safety settings](https://ai.google.dev/gemini-api/docs/safety-settings).
Example response excerpt:
```json
{
"safetyRatings": [
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.11027937,
"severity": "HARM_SEVERITY_LOW",
"severityScore": 0.28487435
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "HIGH",
"blocked": true,
"probabilityScore": 0.95422274,
"severity": "HARM_SEVERITY_MEDIUM",
"severityScore": 0.43398145
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.11085559,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.19027223
},
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.22901751,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.09089675
}
]
}
```
### Troubleshooting
#### Schema Limitations
The Google Generative AI API uses a subset of the OpenAPI 3.0 schema,
which does not support features such as unions.
The errors that you get in this case look like this:
`GenerateContentRequest.generation_config.response_schema.properties[occupation].type: must be specified`
By default, structured outputs are enabled (and for tool calling they are required).
You can disable structured outputs for object generation as a workaround:
```ts highlight="3,8"
const { output } = await generateText({
model: google('gemini-2.5-flash'),
providerOptions: {
google: {
structuredOutputs: false,
} satisfies GoogleLanguageModelOptions,
},
output: Output.object({
schema: z.object({
name: z.string(),
age: z.number(),
contact: z.union([
z.object({
type: z.literal('email'),
value: z.string(),
}),
z.object({
type: z.literal('phone'),
value: z.string(),
}),
]),
}),
}),
prompt: 'Generate an example person for testing.',
});
```
The following Zod features are known to not work with Google Generative AI:
- `z.union`
- `z.record`
### Model Capabilities
| Model | Image Input | Object Generation | Tool Usage | Tool Streaming | Google Search | URL Context |
| ------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `gemini-3.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3.1-pro-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3.1-flash-image-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3.1-flash-lite-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3-pro-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3-pro-image-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3-flash-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-pro` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-flash-lite` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-flash-lite-preview-06-17` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.0-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
<Note>
The table above lists popular models. Please see the [Google Generative AI
docs](https://ai.google.dev/gemini-api/docs/models/) for a full list of
available models. The table above lists popular models. You can also pass any
available provider model ID as a string if needed.
</Note>
## Interactions API
The [Gemini Interactions API](https://ai.google.dev/gemini-api/docs/interactions)
(`POST /v1beta/interactions`) is a separate Google endpoint with server-side
state, unified content blocks, first-class built-in tools, agent presets, and
native multimodal image output. It is reached via the `google.interactions(...)`
factory:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text } = await generateText({
model: google.interactions('gemini-2.5-flash'),
prompt: 'Hello, how are you?',
});
```
`google.interactions(...)` accepts either a model ID string (e.g.
`'gemini-2.5-flash'`, `'gemini-3-pro-preview'`) or `{ agent: <name> }` to use
a Gemini [agent preset](#agent-presets). The returned model can be passed to
`generateText` and `streamText` like any other AI SDK language model.
<Note>
Use `google(...)` for the standard `:generateContent` /
`:streamGenerateContent` endpoints, and `google.interactions(...)` for the
new Interactions endpoint. Pick one per model instance — they target
different request bodies and SSE event vocabularies.
</Note>
### Provider Options
The Interactions model reads its options from the shared
`providerOptions.google.*` namespace. Validate them with the
`GoogleLanguageModelInteractionsOptions` type:
```ts
import {
google,
type GoogleLanguageModelInteractionsOptions,
} from '@ai-sdk/google';
import { generateText } from 'ai';
await generateText({
model: google.interactions('gemini-2.5-flash'),
prompt: 'What color is the sky in one word?',
providerOptions: {
google: {
serviceTier: 'priority',
} satisfies GoogleLanguageModelInteractionsOptions,
},
});
```
The following optional provider options are available:
- **previousInteractionId** _string_
Server-side interaction id from a prior turn. When set, the server pulls
prior context from its own state and only the new user message is sent on
the wire. Pair with the default `store: true` to chain stateful
conversations. See [Stateful chaining](#stateful-chaining).
- **store** _boolean_
Whether the server should persist the interaction. Defaults to `true`.
Set to `false` for stateless multi-turn conversations where the full
message history is re-sent on every turn.
- **agent** _string_
Name of a Gemini agent preset (e.g. `'deep-research-pro-preview-12-2025'`).
<Note>
Prefer the factory form `google.interactions({ agent: '...' })` over
setting `agent` in provider options — the factory is type-checked
against the supported agent names.
</Note>
- **agentConfig** _object_
Per-agent configuration. Currently supports `{ type: 'dynamic' }` and
`{ type: 'deep-research', thinkingSummaries?, visualization?, collaborativePlanning? }`.
- **thinkingLevel** _'minimal' | 'low' | 'medium' | 'high'_
Controls reasoning depth for thinking-enabled models. Mapped onto the
Interactions request's `thinking_level`.
- **thinkingSummaries** _'auto' | 'none'_
Whether the model returns synthesized thought summaries on reasoning
parts. Defaults to the API default.
- **responseFormat** _Array\<\{ type: 'text' | 'image' | 'audio'; mimeType?: string; schema?: unknown; aspectRatio?: string; imageSize?: '1K' \| '2K' \| '4K' \| '512' \}\>_
Output-format entries that map directly to the API's `response_format`
array. Use this for fine-grained control over image, audio, or non-JSON
text outputs (e.g. `aspectRatio` and `imageSize` for image generation).
The AI SDK call-level `responseFormat: { type: 'json', schema }` still
drives JSON-mode automatically and prepends a matching text entry;
entries listed here are appended.
`aspectRatio` accepts `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`,
`9:16`, `16:9`, `21:9`, `1:8`, `8:1`, `1:4`, `4:1`.
- **imageConfig** _\{ aspectRatio?: string; imageSize?: '1K' | '2K' | '4K' | '512' \}_ (deprecated)
Use **responseFormat** with a `{ type: 'image', ... }` entry instead.
Retained for backwards compatibility; the SDK translates `imageConfig`
into a matching `response_format` image entry and emits a warning when
set. Ignored when `responseFormat` already supplies an image entry.
- **mediaResolution** _'low' | 'medium' | 'high' | 'ultra_high'_
Media resolution applied to image inputs / outputs.
- **serviceTier** _'flex' | 'standard' | 'priority'_
Service tier for the request. Mirrored back on
`result.providerMetadata.google.serviceTier` for observability.
- **systemInstruction** _string_
Alternative to the AI SDK `system` message. If both are set, the AI SDK
`system` message wins and a warning is emitted.
- **pollingTimeoutMs** _number_
Maximum time, in milliseconds, to poll a background interaction (agent
call) before giving up. Defaults to 30 minutes (1,800,000 ms). Long-running
agents such as deep research may need longer.
### Provider Metadata
`result.providerMetadata.google` (typed via `GoogleInteractionsProviderMetadata`)
exposes:
- **interactionId** _string_
Server-side interaction id. Pass this back as `previousInteractionId` on
the next turn to chain.
- **serviceTier** _string_
Service tier the request actually ran on.
- **signature** _string_
Per-block signature hash, set by the SDK on output reasoning and
tool-call parts. Round-tripped automatically on the next turn.
### Stateful chaining
With the default `store: true`, the server retains the prior turn so the
next request only needs to send the new user message and the
`previousInteractionId`:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const turn1 = await generateText({
model: google.interactions('gemini-2.5-flash'),
prompt: 'What are the three largest cities in Spain?',
});
const interactionId = turn1.providerMetadata?.google?.interactionId as
| string
| undefined;
const turn2 = await generateText({
model: google.interactions('gemini-2.5-flash'),
prompt: 'What is the most famous landmark in the second one?',
providerOptions: {
google: { previousInteractionId: interactionId },
},
});
```
For stateless multi-turn conversations, set `store: false` and re-send the
full message history on every turn (no `previousInteractionId`):
```ts
import { google } from '@ai-sdk/google';
import { generateText, type ModelMessage } from 'ai';
const messages: Array<ModelMessage> = [
{ role: 'user', content: 'What are the three largest cities in Spain?' },
];
const turn1 = await generateText({
model: google.interactions('gemini-2.5-flash'),
messages,
providerOptions: { google: { store: false } },
});
messages.push(...turn1.response.messages);
messages.push({
role: 'user',
content: 'What is the most famous landmark in the second one?',
});
const turn2 = await generateText({
model: google.interactions('gemini-2.5-flash'),
messages,
providerOptions: { google: { store: false } },
});
```
### Built-in Tools
The Interactions API ships a built-in tool catalog. The provider-defined
tools under `google.tools.*` map onto Interactions tool descriptors:
| AI SDK tool | Interactions tool type | Notes |
| ------------------------------ | ---------------------- | --------------------------------------------------------------------------- |
| `google.tools.googleSearch` | `google_search` | Web / image search grounding. |
| `google.tools.codeExecution` | `code_execution` | Server-side Python execution. |
| `google.tools.urlContext` | `url_context` | Fetch URLs referenced in the prompt. |
| `google.tools.fileSearch` | `file_search` | Retrieval from File Search stores. |
| `google.tools.googleMaps` | `google_maps` | Maps grounding for nearby-places queries. |
| _provider tool_ `google.computer_use` | `computer_use` | Computer use (browser environment). |
| _provider tool_ `google.mcp_server` | `mcp_server` | Remote MCP server passthrough. |
| _provider tool_ `google.retrieval` | `retrieval` | Vertex AI Search retrieval. |
Function tools (`type: 'function'`) defined with the AI SDK `tool(...)`
helper are translated to Interactions `function` tool descriptors. Other
tool kinds emit a warning and are dropped.
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text, sources } = await generateText({
model: google.interactions('gemini-2.5-flash'),
tools: {
google_search: google.tools.googleSearch({}),
},
prompt:
"What's a notable AI development from this past week? " +
'Include the date for each item you mention.',
});
```
Function tools work the same way as on the standard provider:
```ts
import { google } from '@ai-sdk/google';
import { generateText, stepCountIs, tool } from 'ai';
import { z } from 'zod';
const weatherTool = tool({
description: 'Get the weather for a city.',
inputSchema: z.object({ city: z.string() }),
execute: async ({ city }) => `It is sunny in ${city}.`,
});
const { text, toolCalls } = await generateText({
model: google.interactions('gemini-2.5-flash'),
tools: { getWeather: weatherTool },
stopWhen: stepCountIs(5),
prompt: 'What is the weather in San Francisco right now?',
});
```
### Image output via Interactions
Add a `{ type: 'image' }` entry to `responseFormat` on a Gemini
image-capable model to get images as `LanguageModelV4FilePart` files in
the response. No tool wrapping is required, and the entry doubles as the
place to set `aspectRatio`, `imageSize`, and `mimeType`.
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google.interactions('gemini-3-pro-image-preview'),
prompt: 'Generate an image of a comic cat in a spaceship.',
providerOptions: {
google: {
responseFormat: [{ type: 'image' }],
},
},
});
for (const file of result.files) {
if (file.mediaType.startsWith('image/')) {
// file.uint8Array | file.base64 | file.mediaType
}
}
```
To control aspect ratio, image size, or output mime type, add those
fields to the same image entry:
```ts
const result = await generateText({
model: google.interactions('gemini-3-pro-image-preview'),
prompt: 'Generate a high-quality landscape photo of mountains at sunset.',
providerOptions: {
google: {
responseFormat: [
{
type: 'image',
aspectRatio: '16:9',
imageSize: '4K',
},
],
},
},
});
```
For multimodal output, list one entry per modality. The model returns
text in `result.text` and the accompanying image(s) in `result.files`:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google.interactions('gemini-2.5-flash-image'),
prompt:
'Tell me a three sentence bedtime story about a unicorn, accompanied by a suitable illustration.',
providerOptions: {
google: {
responseFormat: [
{ type: 'text' },
{ type: 'image', aspectRatio: '16:9' },
],
},
},
});
console.log(result.text);
const images = result.files.filter(file => file.mediaType.startsWith('image/'));
// images[0].uint8Array | images[0].base64 | images[0].mediaType
```
Iterative image editing pairs naturally with stateful chaining — keep
`previousInteractionId` set across turns and the model edits its prior
output:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const model = google.interactions('gemini-3-pro-image-preview');
const turn1 = await generateText({
model,
prompt: 'Generate an image of a comic cat in a spaceship.',
providerOptions: { google: { responseFormat: [{ type: 'image' }] } },
});
const interactionId = turn1.providerMetadata?.google?.interactionId as
| string
| undefined;
const turn2 = await generateText({
model,
prompt: 'now make the cat red',
providerOptions: {
google: {
responseFormat: [{ type: 'image' }],
previousInteractionId: interactionId,
},
},
});
```
### Agent presets
Pass `{ agent: <name> }` to target a Gemini agent preset. The factory
type-checks the agent name against the supported set:
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google.interactions({
agent: 'deep-research-pro-preview-12-2025',
}),
prompt:
'Briefly summarize the most-cited papers on retrieval-augmented generation since 2024 (2-3 sentences).',
});
```
Agent calls run with `background: true` on the wire and the SDK polls the
`GET /interactions/{id}` endpoint internally until the interaction
completes. The default polling timeout is 30 minutes; raise it via
`pollingTimeoutMs` for slower agents:
```ts
await generateText({
model: google.interactions({ agent: 'deep-research-max-preview-04-2026' }),
prompt: 'Produce a long-form r