---
title: Google Vertex AI
description: Learn how to use the Google Vertex AI provider.
---
# Google Vertex Provider
The Google Vertex provider for the [AI SDK](/docs) contains language model support for the [Google Vertex AI](https://cloud.google.com/vertex-ai) APIs. This includes support for [Google's Gemini models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models), [Anthropic's Claude partner models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), [xAI's Grok partner models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/grok), and [MaaS (Model as a Service) open models](https://cloud.google.com/vertex-ai/generative-ai/docs/maas/use-open-models).
<Note>
The Google Vertex provider is compatible with both Node.js and Edge runtimes.
The Edge runtime is supported through the `@ai-sdk/google-vertex/edge`
sub-module. More details can be found in the [Google Vertex Edge
Runtime](#google-vertex-edge-runtime), [Google Vertex Anthropic Edge
Runtime](#google-vertex-anthropic-edge-runtime), and [Google Vertex MaaS Edge
Runtime](#google-vertex-maas-edge-runtime) sections below.
</Note>
## Setup
The Google Vertex, Google Vertex Anthropic, Google Vertex xAI, and Google Vertex MaaS providers are available in the `@ai-sdk/google-vertex` module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
<Tab>
<Snippet text="pnpm add @ai-sdk/google-vertex" dark />
</Tab>
<Tab>
<Snippet text="npm install @ai-sdk/google-vertex" dark />
</Tab>
<Tab>
<Snippet text="yarn add @ai-sdk/google-vertex" dark />
</Tab>
<Tab>
<Snippet text="bun add @ai-sdk/google-vertex" dark />
</Tab>
</Tabs>
## Google Vertex Provider Usage
The Google Vertex provider instance is used to create model instances that call the Vertex AI API. The models available with this provider include [Google's Gemini models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models). If you're looking to use [Anthropic's Claude models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), see the [Google Vertex Anthropic Provider](#google-vertex-anthropic-provider-usage) section below.
### Provider Instance
You can import the default provider instance `vertex` from `@ai-sdk/google-vertex`:
```ts
import { vertex } from '@ai-sdk/google-vertex';
```
If you need a customized setup, you can import `createVertex` from `@ai-sdk/google-vertex` and create a provider instance with your settings:
```ts
import { createVertex } from '@ai-sdk/google-vertex';
const vertex = createVertex({
project: 'my-project', // optional
location: 'us-central1', // optional
});
```
Google Vertex supports multiple authentication methods depending on your runtime environment and requirements.
#### Node.js Runtime
The Node.js runtime is the default runtime supported by the AI SDK. It supports all standard Google Cloud authentication options through the [`google-auth-library`](https://github.com/googleapis/google-auth-library-nodejs?tab=readme-ov-file#ways-to-authenticate). A typical setup sets the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of a JSON credentials file. The credentials file can be obtained from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).
If you want to customize the Google authentication options, you can pass them as options to the `createVertex` function, for example:
```ts
import { createVertex } from '@ai-sdk/google-vertex';
const vertex = createVertex({
googleAuthOptions: {
credentials: {
client_email: 'my-email',
private_key: 'my-private-key',
},
},
});
```
##### Optional Provider Settings
You can use the following optional settings to customize the provider instance:
- **project** _string_
The Google Cloud project ID that you want to use for the API calls.
It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default.
- **location** _string_
The Google Cloud location that you want to use for the API calls, e.g. `us-central1`.
It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default.
- **googleAuthOptions** _object_
Optional. The Authentication options used by the [Google Auth Library](https://github.com/googleapis/google-auth-library-nodejs/). See also the [GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/08978822e1b7b5961f0e355df51d738e012be392/src/auth/googleauth.ts#L87C18-L87C35) interface.
- **authClient** _object_
An `AuthClient` to use.
- **keyFilename** _string_
Path to a .json, .pem, or .p12 key file.
- **keyFile** _string_
Path to a .json, .pem, or .p12 key file.
- **credentials** _object_
Object containing client_email and private_key properties, or the external account client options.
- **clientOptions** _object_
Options object passed to the constructor of the client.
- **scopes** _string | string[]_
Required scopes for the desired API request.
- **projectId** _string_
Your project ID.
- **universeDomain** _string_
The default service domain for a given Cloud universe.
- **headers** _Resolvable<Record<string, string | undefined>>_
Headers to include in the requests. Can be provided in multiple formats:
- A record of header key-value pairs: `Record<string, string | undefined>`
- A function that returns headers: `() => Record<string, string | undefined>`
- An async function that returns headers: `async () => Record<string, string | undefined>`
- A promise that resolves to headers: `Promise<Record<string, string | undefined>>`
- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_
Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
Defaults to the global `fetch` function.
You can use it as a middleware to intercept requests,
or to provide a custom fetch implementation for e.g. testing.
- **baseURL** _string_
Optional. Base URL for the Google Vertex API calls, e.g. to use proxy servers. By default, it is constructed from the location and project:
`https://${location}-aiplatform.googleapis.com/v1/projects/${project}/locations/${location}/publishers/google`
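As a rough illustration of the default above, the following sketch assembles the same URL from a project and a location. `defaultVertexBaseURL` is a hypothetical helper, not something exported by the SDK; it only mirrors the template shown for `baseURL`, and the project and location values are placeholders.

```ts
// Hypothetical helper (not part of @ai-sdk/google-vertex): rebuilds the
// default base URL described above from a project ID and a location.
function defaultVertexBaseURL(project: string, location: string): string {
  return `https://${location}-aiplatform.googleapis.com/v1/projects/${project}/locations/${location}/publishers/google`;
}

// Placeholder values; replace with your own project and location.
const baseURL = defaultVertexBaseURL('my-project', 'us-central1');
console.log(baseURL);
```

A value built this way, or the URL of a proxy in front of it, could then be passed as `baseURL` to `createVertex`.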
<a id="google-vertex-edge-runtime"></a>
#### Edge Runtime
Edge runtimes (like Vercel Edge Functions and Cloudflare Workers) are lightweight JavaScript environments that run closer to users at the network edge.
They only provide a subset of the standard Node.js APIs.
For example, direct file system access is not available, and many Node.js-specific libraries
(including the standard Google Auth library) are not compatible.
The Edge runtime version of the Google Vertex provider supports Google's [Application Default Credentials](https://github.com/googleapis/google-auth-library-nodejs?tab=readme-ov-file#application-default-credentials) through environment variables. The values can be obtained from a JSON credentials file from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).
You can import the default provider instance `vertex` from `@ai-sdk/google-vertex/edge`:
```ts
import { vertex } from '@ai-sdk/google-vertex/edge';
```
<Note>
The `/edge` sub-module is included in the `@ai-sdk/google-vertex` package, so
you don't need to install it separately. You must import from
`@ai-sdk/google-vertex/edge` to differentiate it from the Node.js provider.
</Note>
If you need a customized setup, you can import `createVertex` from `@ai-sdk/google-vertex/edge` and create a provider instance with your settings:
```ts
import { createVertex } from '@ai-sdk/google-vertex/edge';
const vertex = createVertex({
project: 'my-project', // optional
location: 'us-central1', // optional
});
```
For Edge runtime authentication, you'll need to set these environment variables from your Application Default Credentials JSON file:
- `GOOGLE_CLIENT_EMAIL`
- `GOOGLE_PRIVATE_KEY`
- `GOOGLE_PRIVATE_KEY_ID` (optional)
These values can be obtained from a service account JSON file from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).
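If you prefer to pass the credentials explicitly through the `googleCredentials` setting instead of relying on the environment defaults, a small mapping function can help. The sketch below is hypothetical (`readGoogleCredentials` is not an SDK function) and assumes that your host may store the private key with literal `\n` sequences, which it converts back to real newlines.

```ts
// Shape of the `googleCredentials` setting described in this section.
interface GoogleCredentials {
  clientEmail: string;
  privateKey: string;
  privateKeyId?: string;
}

// Hypothetical helper: reads the three variables from an env-like record.
// Assumption: literal "\n" sequences in the key stand for real newlines.
function readGoogleCredentials(
  env: Record<string, string | undefined>,
): GoogleCredentials {
  const clientEmail = env.GOOGLE_CLIENT_EMAIL;
  const privateKey = env.GOOGLE_PRIVATE_KEY;
  if (!clientEmail || !privateKey) {
    throw new Error('GOOGLE_CLIENT_EMAIL and GOOGLE_PRIVATE_KEY must be set');
  }
  const privateKeyId = env.GOOGLE_PRIVATE_KEY_ID;
  return {
    clientEmail,
    privateKey: privateKey.replace(/\\n/g, '\n'),
    // Only include the optional key ID when it is actually set.
    ...(privateKeyId ? { privateKeyId } : {}),
  };
}

// Intended usage (not executed here):
// createVertex({ googleCredentials: readGoogleCredentials(process.env) })
```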
##### Optional Provider Settings
You can use the following optional settings to customize the provider instance:
- **project** _string_
The Google Cloud project ID that you want to use for the API calls.
It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default.
- **location** _string_
The Google Cloud location that you want to use for the API calls, e.g. `us-central1`.
It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default.
- **googleCredentials** _object_
Optional. The credentials used by the Edge provider for authentication. These credentials are typically set through environment variables and are derived from a service account JSON file.
- **clientEmail** _string_
The client email from the service account JSON file. Defaults to the contents of the `GOOGLE_CLIENT_EMAIL` environment variable.
- **privateKey** _string_
The private key from the service account JSON file. Defaults to the contents of the `GOOGLE_PRIVATE_KEY` environment variable.
- **privateKeyId** _string_
The private key ID from the service account JSON file (optional). Defaults to the contents of the `GOOGLE_PRIVATE_KEY_ID` environment variable.
- **headers** _Resolvable<Record<string, string | undefined>>_
Headers to include in the requests. Can be provided in multiple formats:
- A record of header key-value pairs: `Record<string, string | undefined>`
- A function that returns headers: `() => Record<string, string | undefined>`
- An async function that returns headers: `async () => Record<string, string | undefined>`
- A promise that resolves to headers: `Promise<Record<string, string | undefined>>`
- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_
Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
Defaults to the global `fetch` function.
You can use it as a middleware to intercept requests,
or to provide a custom fetch implementation for e.g. testing.
#### Express Mode
Express mode provides a simplified authentication method using an API key instead of OAuth or service account credentials. When using express mode, the `project` and `location` settings are not required.
```ts
import { createVertex } from '@ai-sdk/google-vertex';
const vertex = createVertex({
apiKey: process.env.GOOGLE_VERTEX_API_KEY,
});
```
##### Optional Provider Settings
- **apiKey** _string_
The API key for Google Vertex AI. When provided, the provider uses express mode with API key authentication instead of OAuth.
It uses the `GOOGLE_VERTEX_API_KEY` environment variable by default.
### Language Models
You can create models that call the Vertex API using the provider instance.
The first argument is the model id, e.g. `gemini-2.5-pro`.
```ts
const model = vertex('gemini-2.5-pro');
```
<Note>
If you are using [your own
models](https://cloud.google.com/vertex-ai/docs/training-overview), the name
of your model needs to start with `projects/`.
</Note>
Google Vertex models also support some model-specific settings that are not part
of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as
an options argument:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText } from 'ai';
const model = vertex('gemini-2.5-pro');
await generateText({
  model,
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
  providerOptions: {
    vertex: {
      safetySettings: [
        {
          category: 'HARM_CATEGORY_UNSPECIFIED',
          threshold: 'BLOCK_LOW_AND_ABOVE',
        },
      ],
    } satisfies GoogleLanguageModelOptions,
  },
});
```
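When the same threshold should apply to every harm category, the `safetySettings` array can be generated instead of written by hand. This is a minimal sketch: `uniformSafetySettings` is a hypothetical helper, the category and threshold names are the ones documented for `safetySettings` in this section, and whether a given model accepts every category is not checked here.

```ts
// Category names as documented for `safetySettings` (UNSPECIFIED omitted).
const HARM_CATEGORIES = [
  'HARM_CATEGORY_HATE_SPEECH',
  'HARM_CATEGORY_DANGEROUS_CONTENT',
  'HARM_CATEGORY_HARASSMENT',
  'HARM_CATEGORY_SEXUALLY_EXPLICIT',
  'HARM_CATEGORY_CIVIC_INTEGRITY',
] as const;

type HarmBlockThreshold =
  | 'HARM_BLOCK_THRESHOLD_UNSPECIFIED'
  | 'BLOCK_LOW_AND_ABOVE'
  | 'BLOCK_MEDIUM_AND_ABOVE'
  | 'BLOCK_ONLY_HIGH'
  | 'BLOCK_NONE';

// Hypothetical helper: one entry per category, all with the same threshold.
function uniformSafetySettings(threshold: HarmBlockThreshold) {
  return HARM_CATEGORIES.map(category => ({ category, threshold }));
}

const safetySettings = uniformSafetySettings('BLOCK_MEDIUM_AND_ABOVE');
console.log(safetySettings);
```

The resulting array could be passed as `providerOptions.vertex.safetySettings`.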
The following optional provider options are available for Google Vertex models:
- **cachedContent** _string_
Optional. The name of the cached content used as context to serve the prediction.
Format: projects/\{project\}/locations/\{location\}/cachedContents/\{cachedContent\}
- **structuredOutputs** _boolean_
Optional. Enable structured output. Default is true.
This is useful when the JSON Schema contains elements that are
not supported by the OpenAPI schema version that
Google Vertex uses. You can use this to disable
structured outputs if you need to.
See [Troubleshooting: Schema Limitations](#schema-limitations) for more details.
- **safetySettings** _Array\<\{ category: string; threshold: string \}\>_
Optional. Safety settings for the model.
- **category** _string_
The category of the safety setting. Can be one of the following:
- `HARM_CATEGORY_UNSPECIFIED`
- `HARM_CATEGORY_HATE_SPEECH`
- `HARM_CATEGORY_DANGEROUS_CONTENT`
- `HARM_CATEGORY_HARASSMENT`
- `HARM_CATEGORY_SEXUALLY_EXPLICIT`
- `HARM_CATEGORY_CIVIC_INTEGRITY`
- **threshold** _string_
The threshold of the safety setting. Can be one of the following:
- `HARM_BLOCK_THRESHOLD_UNSPECIFIED`
- `BLOCK_LOW_AND_ABOVE`
- `BLOCK_MEDIUM_AND_ABOVE`
- `BLOCK_ONLY_HIGH`
- `BLOCK_NONE`
- **audioTimestamp** _boolean_
Optional. Enables timestamp understanding for audio files. Defaults to false.
This is useful for generating transcripts with accurate timestamps.
Consult [Google's Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding) for usage details.
- **labels** _object_
Optional. Defines labels used in billing reports.
Consult [Google's Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for usage details.
- **streamFunctionCallArguments** _boolean_
Optional. When set to true, function call arguments will be streamed
incrementally in streaming responses. This enables `tool-input-delta` events
to arrive as the model generates function call arguments, reducing perceived
latency for tool calls. Defaults to `false`. Only supported on the Vertex AI API (not the Gemini API) with Gemini 3+ models.
Consult [Google's Documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#streaming-fc) for details.
- **sharedRequestType** _'priority' | 'flex' | 'standard'_
Optional. Selects a pay-as-you-go (PayGo) tier by setting the
`X-Vertex-AI-LLM-Shared-Request-Type` request header. Use `'priority'` for
consistent low-latency performance at a premium, or `'flex'` for a 50%
discount with longer expected latency. Both are supported only on the
`global` endpoint and on a subset of Gemini models.
By default — with Provisioned Throughput allocated and `requestType` unset
— the request consumes Provisioned Throughput quota first and only falls
back to the chosen shared tier if PT capacity is exhausted. To bypass
Provisioned Throughput entirely, also set `requestType: 'shared'`.
The served tier is reported back on
`result.providerMetadata.googleVertex.usageMetadata.trafficType` as
`ON_DEMAND_PRIORITY`, `ON_DEMAND_FLEX`, or (if downgraded under load) plain
`ON_DEMAND`.
See [Priority PayGo](https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/priority-paygo)
and [Flex PayGo](https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/flex-paygo)
for supported models, ramp limits, and downgrade behavior.
- **requestType** _'shared'_
Optional. Sets the `X-Vertex-AI-LLM-Request-Type` request header. Combine
with `sharedRequestType` to skip Provisioned Throughput entirely and route
the request through shared PayGo capacity. See
[Priority PayGo](https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/priority-paygo).
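To make the interplay of `sharedRequestType`, `requestType`, and the reported `trafficType` more concrete, here is a small sketch. Both helpers are hypothetical and only encode what is described above: combining the two options bypasses Provisioned Throughput, and a plain `ON_DEMAND` traffic type after requesting `'priority'` or `'flex'` indicates a downgrade.

```ts
type SharedRequestType = 'priority' | 'flex' | 'standard';

// Hypothetical helper: provider options that skip Provisioned Throughput
// and request a specific shared PayGo tier.
function sharedTierOptions(tier: SharedRequestType) {
  return { sharedRequestType: tier, requestType: 'shared' as const };
}

// Hypothetical helper: true when a premium or discounted tier was requested
// but the response reports plain ON_DEMAND traffic.
function wasDowngraded(tier: SharedRequestType, trafficType: string): boolean {
  return tier !== 'standard' && trafficType === 'ON_DEMAND';
}

const options = sharedTierOptions('flex');
console.log(options);
console.log(wasDowngraded('flex', 'ON_DEMAND_FLEX'));
```

The object returned by `sharedTierOptions` is meant to be placed under `providerOptions.vertex`; the `trafficType` string would come from the usage metadata of the result.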
You can use Google Vertex language models to generate text with the `generateText` function:
```ts highlight="1,4"
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const { text } = await generateText({
model: vertex('gemini-2.5-pro'),
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```
Google Vertex language models can also be used in the `streamText` function
(see [AI SDK Core](/docs/ai-sdk-core)).
#### Code Execution
With [Code Execution](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/code-execution), certain Gemini models on Vertex AI can generate and execute Python code. This allows the model to perform calculations, data manipulation, and other programmatic tasks to enhance its responses.
You can enable code execution by adding the `code_execution` tool to your request.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const result = await generateText({
model: vertex('gemini-2.5-pro'),
tools: { code_execution: vertex.tools.codeExecution({}) },
prompt:
'Use python to calculate 20th fibonacci number. Then find the nearest palindrome to it.',
});
```
The response will contain `tool-call` and `tool-result` parts for the executed code.
#### URL Context
URL Context allows Gemini models to retrieve and analyze content from URLs. Supported models: Gemini 2.5 Flash-Lite, 2.5 Pro, 2.5 Flash, 2.0 Flash.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const result = await generateText({
model: vertex('gemini-2.5-pro'),
tools: { url_context: vertex.tools.urlContext({}) },
prompt: 'What are the key points from https://example.com/article?',
});
```
#### Google Search
Google Search enables Gemini models to access real-time web information. Supported models: Gemini 2.5 Flash-Lite, 2.5 Flash, 2.0 Flash, 2.5 Pro.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const result = await generateText({
model: vertex('gemini-2.5-pro'),
tools: { google_search: vertex.tools.googleSearch({}) },
prompt: 'What are the latest developments in AI?',
});
```
#### Enterprise Web Search
[Enterprise Web Search](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise) provides grounding using a compliance-focused web index designed for highly regulated industries such as finance, healthcare, and the public sector. Unlike standard Google Search grounding, Enterprise Web Search does not log customer data and supports VPC service controls. Supported models: Gemini 2.0 and newer.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const result = await generateText({
model: vertex('gemini-2.5-flash'),
tools: {
enterprise_web_search: vertex.tools.enterpriseWebSearch({}),
},
prompt: 'What are the latest FDA regulations for clinical trials?',
});
```
#### Google Maps
Google Maps grounding enables Gemini models to access Google Maps data for location-aware responses. Supported models: Gemini 2.5 Flash-Lite, 2.5 Flash, 2.0 Flash, 2.5 Pro, 3.0 Pro.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: vertex('gemini-2.5-flash'),
tools: {
google_maps: vertex.tools.googleMaps({}),
},
providerOptions: {
vertex: {
retrievalConfig: {
latLng: { latitude: 34.090199, longitude: -117.881081 },
},
} satisfies GoogleLanguageModelOptions,
},
prompt: 'What are the best Italian restaurants nearby?',
});
```
The optional `retrievalConfig.latLng` provider option provides location context for queries about nearby places. This configuration applies to any grounding tools that support location context.
#### Streaming Function Call Arguments
For Gemini 3 Pro and later models on Vertex AI, you can stream function call
arguments as they are generated by setting `streamFunctionCallArguments` to
`true`. This reduces perceived latency when functions need to be called, as
`tool-input-delta` events arrive incrementally instead of waiting for the
complete arguments. This option defaults to `false`.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { streamText } from 'ai';
import { z } from 'zod';
const result = streamText({
model: vertex('gemini-3.1-pro-preview'),
prompt: 'What is the weather in Boston and San Francisco?',
tools: {
getWeather: {
description: 'Get the current weather in a given location',
inputSchema: z.object({
location: z.string().describe('City name'),
}),
},
},
providerOptions: {
vertex: {
streamFunctionCallArguments: true,
} satisfies GoogleLanguageModelOptions,
},
});
for await (const part of result.fullStream) {
switch (part.type) {
case 'tool-input-start':
console.log(`Tool call started: ${part.toolName}`);
break;
case 'tool-input-delta':
process.stdout.write(part.delta);
break;
case 'tool-call':
console.log(`Tool call complete: ${part.toolName}`, part.input);
break;
}
}
```
<Note>
This feature is only available on the Vertex AI API. It is not supported on
the Gemini API. When used with the Google Generative AI provider, a warning
will be emitted and the option will be ignored.
</Note>
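If you want the partial arguments per tool call rather than a raw character stream, the deltas can be accumulated as they arrive. The sketch below works on hand-written sample parts instead of a live stream. The local `ToolInputPart` type is a simplification, and the `id` field that links each delta to its tool call is an assumption of this sketch.

```ts
// Minimal local types for the stream parts used here. The `id` field that
// links deltas to their tool call is an assumption of this sketch.
type ToolInputPart =
  | { type: 'tool-input-start'; id: string; toolName: string }
  | { type: 'tool-input-delta'; id: string; delta: string };

interface PendingToolInput {
  toolName: string;
  json: string;
}

// Concatenates the argument fragments of each tool call, keyed by call id.
function collectToolInputs(parts: ToolInputPart[]): Map<string, PendingToolInput> {
  const calls = new Map<string, PendingToolInput>();
  for (const part of parts) {
    if (part.type === 'tool-input-start') {
      calls.set(part.id, { toolName: part.toolName, json: '' });
    } else {
      const call = calls.get(part.id);
      if (call) call.json += part.delta;
    }
  }
  return calls;
}

// Hand-written sample parts standing in for a real stream.
const sampleParts: ToolInputPart[] = [
  { type: 'tool-input-start', id: 'call-1', toolName: 'getWeather' },
  { type: 'tool-input-delta', id: 'call-1', delta: '{"location":' },
  { type: 'tool-input-delta', id: 'call-1', delta: '"Boston"}' },
];

const collected = collectToolInputs(sampleParts);
console.log(JSON.parse(collected.get('call-1')!.json)); // logs { location: 'Boston' }
```

Note that the accumulated string is only valid JSON once the last delta has arrived, so parsing should wait for the final `tool-call` part.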
#### Reasoning (Thinking Tokens)
Google Vertex AI, through its support for Gemini models, can also emit "thinking" tokens, representing the model's reasoning process. The AI SDK exposes these as reasoning information.
To enable thinking tokens for compatible Gemini models via Vertex, set `includeThoughts: true` in the `thinkingConfig` provider option. These options are passed through `providerOptions.vertex`:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText, streamText } from 'ai';
// For generateText:
const { text, reasoningText, reasoning } = await generateText({
model: vertex('gemini-2.0-flash-001'), // Or other supported model via Vertex
providerOptions: {
vertex: {
thinkingConfig: {
includeThoughts: true,
// thinkingBudget: 2048, // Optional
},
} satisfies GoogleLanguageModelOptions,
},
prompt: 'Explain quantum computing in simple terms.',
});
console.log('Reasoning:', reasoningText);
console.log('Reasoning Details:', reasoning);
console.log('Final Text:', text);
// For streamText:
const result = streamText({
model: vertex('gemini-2.0-flash-001'), // Or other supported model via Vertex
providerOptions: {
vertex: {
thinkingConfig: {
includeThoughts: true,
// thinkingBudget: 2048, // Optional
},
} satisfies GoogleLanguageModelOptions,
},
prompt: 'Explain quantum computing in simple terms.',
});
for await (const part of result.fullStream) {
  if (part.type === 'reasoning-delta') {
    process.stdout.write(`THOUGHT: ${part.text}\n`);
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.text);
  }
}
```
When `includeThoughts` is true, parts of the API response marked with `thought: true` will be processed as reasoning.
- In `generateText`, these contribute to the `reasoningText` (string) and `reasoning` (array) fields.
- In `streamText`, these are emitted as `reasoning-delta` stream parts.
<Note>
Refer to the [Google Vertex AI documentation on
"thinking"](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking)
for model compatibility and further details.
</Note>
#### File Inputs
The Google Vertex provider supports file inputs, e.g. PDF files.
```ts
import fs from 'node:fs';
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
const { text } = await generateText({
model: vertex('gemini-2.5-pro'),
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'What is an embedding model according to this document?',
},
{
type: 'file',
data: fs.readFileSync('./data/ai.pdf'),
mediaType: 'application/pdf',
},
],
},
],
});
```
<Note>
The AI SDK will automatically download URLs if you pass them as data, except
for `gs://` URLs. You can use the Google Cloud Storage API to upload larger
files to that location.
</Note>
See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use files in prompts.
### Cached Content
Google Vertex AI supports both explicit and implicit caching to help reduce costs on repetitive content.
#### Implicit Caching
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';
// Structure prompts with consistent content at the beginning
const baseContext =
'You are a cooking assistant with expertise in Italian cuisine. Here are 1000 lasagna recipes for reference...';
const { text: veggieLasagna } = await generateText({
model: vertex('gemini-2.5-pro'),
prompt: `${baseContext}\n\nWrite a vegetarian lasagna recipe for 4 people.`,
});
// Second request with same prefix - eligible for cache hit
const { text: meatLasagna, providerMetadata } = await generateText({
model: vertex('gemini-2.5-pro'),
prompt: `${baseContext}\n\nWrite a meat lasagna recipe for 12 people.`,
});
// Check cached token count in usage metadata
console.log('Cached tokens:', providerMetadata.vertex);
// e.g.
// {
// groundingMetadata: null,
// safetyRatings: null,
// usageMetadata: {
// cachedContentTokenCount: 2027,
// thoughtsTokenCount: 702,
// promptTokenCount: 2152,
// candidatesTokenCount: 710,
// totalTokenCount: 3564
// }
// }
```
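To judge how effective implicit caching is for a request, the cached share of the prompt can be derived from the usage metadata. This is a minimal sketch using the numbers from the example output above; `cachedPromptFraction` is a hypothetical helper, and it assumes that `promptTokenCount` includes the cached tokens.

```ts
// Subset of the `usageMetadata` fields shown in the example output above.
interface CacheUsage {
  cachedContentTokenCount?: number;
  promptTokenCount: number;
}

// Hypothetical helper: fraction of prompt tokens served from the cache.
// Assumption: promptTokenCount includes the cached tokens.
function cachedPromptFraction(usage: CacheUsage): number {
  if (usage.promptTokenCount === 0) return 0;
  return (usage.cachedContentTokenCount ?? 0) / usage.promptTokenCount;
}

const fraction = cachedPromptFraction({
  cachedContentTokenCount: 2027,
  promptTokenCount: 2152,
});
console.log(`${(fraction * 100).toFixed(1)}% of prompt tokens were cached`);
// logs "94.2% of prompt tokens were cached"
```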
#### Explicit Caching
You can use explicit caching with Gemini models. See the [Vertex AI context caching documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview) to check if caching is supported for your model.
First, create a cache using the Google GenAI SDK with Vertex mode enabled:
```ts
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({
vertexai: true,
project: process.env.GOOGLE_VERTEX_PROJECT,
location: process.env.GOOGLE_VERTEX_LOCATION,
});
const model = 'gemini-2.5-pro';
// Create a cache with the content you want to reuse
const cache = await ai.caches.create({
model,
config: {
contents: [
{
role: 'user',
parts: [{ text: '1000 Lasagna Recipes...' }],
},
],
ttl: '300s', // Cache expires after 5 minutes
},
});
console.log('Cache created:', cache.name);
// e.g. projects/my-project/locations/us-central1/cachedContents/abc123
```
Then use the cache with the AI SDK:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText } from 'ai';
const { text: veggieLasagnaRecipe } = await generateText({
model: vertex('gemini-2.5-pro'),
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
providerOptions: {
vertex: {
cachedContent: cache.name,
} satisfies GoogleLanguageModelOptions,
},
});
const { text: meatLasagnaRecipe } = await generateText({
model: vertex('gemini-2.5-pro'),
prompt: 'Write a meat lasagna recipe for 12 people.',
providerOptions: {
vertex: {
cachedContent: cache.name,
} satisfies GoogleLanguageModelOptions,
},
});
```
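A quick format check before passing a cache name on can catch obvious mistakes early, for example a missing name or a short ID without the project and location prefix. The sketch below is hypothetical and only tests the documented `cachedContent` format; it does not verify that the cache exists or has not expired.

```ts
// Matches the documented format:
// projects/{project}/locations/{location}/cachedContents/{cachedContent}
const CACHED_CONTENT_NAME =
  /^projects\/[^/]+\/locations\/[^/]+\/cachedContents\/[^/]+$/;

// Hypothetical guard: narrows an optional name to a well-formed string.
function isCachedContentName(name: string | undefined): name is string {
  return name !== undefined && CACHED_CONTENT_NAME.test(name);
}

console.log(
  isCachedContentName(
    'projects/my-project/locations/us-central1/cachedContents/abc123',
  ),
); // logs true
```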
### Safety Ratings
The safety ratings provide insight into the safety of the model's response.
See [Google Vertex AI documentation on configuring safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters).
Example response excerpt:
```json
{
"safetyRatings": [
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.11027937,
"severity": "HARM_SEVERITY_LOW",
"severityScore": 0.28487435
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "HIGH",
"blocked": true,
"probabilityScore": 0.95422274,
"severity": "HARM_SEVERITY_MEDIUM",
"severityScore": 0.43398145
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.11085559,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.19027223
},
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.22901751,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.09089675
}
]
}
```
For more details, see the [Google Vertex AI documentation on grounding with Google Search](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#ground-to-search).
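As an illustration of how such ratings might be inspected, the following sketch picks out the categories that were blocked. The `SafetyRating` interface is a local type derived from the excerpt above, `blockedCategories` is a hypothetical helper, and the sample data repeats two entries from the excerpt.

```ts
// Local type derived from the response excerpt above.
interface SafetyRating {
  category: string;
  probability: string;
  blocked?: boolean;
  probabilityScore?: number;
  severity?: string;
  severityScore?: number;
}

// Hypothetical helper: names of all categories that were blocked.
function blockedCategories(ratings: SafetyRating[]): string[] {
  return ratings.filter(r => r.blocked === true).map(r => r.category);
}

// Two entries copied from the excerpt above.
const ratings: SafetyRating[] = [
  {
    category: 'HARM_CATEGORY_HATE_SPEECH',
    probability: 'NEGLIGIBLE',
    probabilityScore: 0.11027937,
    severity: 'HARM_SEVERITY_LOW',
    severityScore: 0.28487435,
  },
  {
    category: 'HARM_CATEGORY_DANGEROUS_CONTENT',
    probability: 'HIGH',
    blocked: true,
    probabilityScore: 0.95422274,
    severity: 'HARM_SEVERITY_MEDIUM',
    severityScore: 0.43398145,
  },
];

console.log(blockedCategories(ratings)); // logs [ 'HARM_CATEGORY_DANGEROUS_CONTENT' ]
```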
### Troubleshooting
#### Schema Limitations
The Google Vertex API uses a subset of the OpenAPI 3.0 schema,
which does not support features such as unions.
The errors that you get in this case look like this:
`GenerateContentRequest.generation_config.response_schema.properties[occupation].type: must be specified`
By default, structured outputs are enabled (and for tool calling they are required).
You can disable structured outputs for object generation as a workaround:
```ts highlight="8,13"
import { vertex } from '@ai-sdk/google-vertex';
import { type GoogleLanguageModelOptions } from '@ai-sdk/google';
import { generateText, Output } from 'ai';
import { z } from 'zod';
const result = await generateText({
model: vertex('gemini-2.5-pro'),
providerOptions: {
vertex: {
structuredOutputs: false,
} satisfies GoogleLanguageModelOptions,
},
output: Output.object({
schema: z.object({
name: z.string(),
age: z.number(),
contact: z.union([
z.object({
type: z.literal('email'),
value: z.string(),
}),
z.object({
type: z.literal('phone'),
value: z.string(),
}),
]),
}),
}),
prompt: 'Generate an example person for testing.',
});
```
The following Zod features are known to not work with Google Vertex:
- `z.union`
- `z.record`
### Model Capabilities
| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
| ---------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `gemini-3.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-3-pro-preview` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-pro` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `gemini-2.0-flash-001` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
<Note>
The table above lists popular models. Please see the [Google Vertex AI
docs](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported-models)
for a full list of available models. You can also pass any available provider
model ID as a string if needed.
</Note>
### Embedding Models
You can create models that call the Google Vertex AI embeddings API using the `.embeddingModel()` factory method:
```ts
const model = vertex.embeddingModel('text-embedding-005');
```
Google Vertex AI embedding models support additional settings. You can pass them as an options argument:
```ts
import {
vertex,
type GoogleVertexEmbeddingModelOptions,
} from '@ai-sdk/google-vertex';
import { embed } from 'ai';
const model = vertex.embeddingModel('text-embedding-005');
const { embedding } = await embed({
model,
value: 'sunny day at the beach',
providerOptions: {
vertex: {
outputDimensionality: 512, // optional, number of dimensions for the embedding
taskType: 'SEMANTIC_SIMILARITY', // optional, specifies the task type for generating embeddings
autoTruncate: false, // optional
} satisfies GoogleVertexEmbeddingModelOptions,
},
});
```
The following optional provider options are available for Google Vertex AI embedding models:
- **outputDimensionality**: _number_
Optional. Reduced dimension for the output embedding. If set, excess values in the output embedding are truncated from the end.
- **taskType**: _string_
Optional. Specifies the task type for generating embeddings. Supported task types include:
- `SEMANTIC_SIMILARITY`: Optimized for text similarity.
- `CLASSIFICATION`: Optimized for text classification.
- `CLUSTERING`: Optimized for clustering texts based on similarity.
- `RETRIEVAL_DOCUMENT`: Optimized for document retrieval.
- `RETRIEVAL_QUERY`: Optimized for query-based retrieval.
- `QUESTION_ANSWERING`: Optimized for answering questions.
- `FACT_VERIFICATION`: Optimized for verifying factual information.
- `CODE_RETRIEVAL_QUERY`: Optimized for retrieving code blocks based on natural language queries.
- **title**: _string_
Optional. The title of the document being embedded. This helps the model produce better embeddings by providing additional context. Only valid when `taskType` is set to `'RETRIEVAL_DOCUMENT'`.
- **autoTruncate**: _boolean_
Optional. When set to `true`, input text will be truncated if it exceeds the maximum length. When set to `false`, an error is returned if the input text is too long. Defaults to `true`.
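The constraint on `title` described above can be checked before a request is sent. The following is a minimal sketch, not part of the provider: `EmbeddingOptionsSketch` and `buildEmbeddingOptions` are hypothetical names, and the positive-integer check on `outputDimensionality` is an assumption rather than documented behavior. In application code you would normally type the object with `GoogleVertexEmbeddingModelOptions` instead.

```typescript
// Task types copied from the list above.
type EmbeddingTaskType =
  | 'SEMANTIC_SIMILARITY'
  | 'CLASSIFICATION'
  | 'CLUSTERING'
  | 'RETRIEVAL_DOCUMENT'
  | 'RETRIEVAL_QUERY'
  | 'QUESTION_ANSWERING'
  | 'FACT_VERIFICATION'
  | 'CODE_RETRIEVAL_QUERY';

// Hypothetical local mirror of the documented options (illustration only).
interface EmbeddingOptionsSketch {
  outputDimensionality?: number;
  taskType?: EmbeddingTaskType;
  title?: string;
  autoTruncate?: boolean;
}

// Hypothetical helper: returns an object suitable for `providerOptions.vertex`
// and rejects a `title` unless taskType is 'RETRIEVAL_DOCUMENT'.
function buildEmbeddingOptions(
  options: EmbeddingOptionsSketch,
): EmbeddingOptionsSketch {
  if (options.title !== undefined && options.taskType !== 'RETRIEVAL_DOCUMENT') {
    throw new Error("`title` is only valid when taskType is 'RETRIEVAL_DOCUMENT'");
  }
  // Assumption: a reduced dimension only makes sense as a positive integer.
  if (
    options.outputDimensionality !== undefined &&
    (!Number.isInteger(options.outputDimensionality) ||
      options.outputDimensionality <= 0)
  ) {
    throw new Error('outputDimensionality must be a positive integer');
  }
  return { ...options };
}

const documentOptions = buildEmbeddingOptions({
  taskType: 'RETRIEVAL_DOCUMENT',
  title: 'Beach safety guide',
  outputDimensionality: 512,
});
```

The resulting `documentOptions` object could then be passed as `providerOptions: { vertex: documentOptions }` in a call to `embed()`.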
#### Model Capabilities
| Model | Max Values Per Call | Parallel Calls | Multimodal |
| ---------------------------- | ------------------- | ------------------- | ------------------- |
| `text-embedding-005` | 2048 | <Check size={18} /> | <Cross size={18} /> |
| `gemini-embedding-2-preview` | 2048 | <Check size={18} /> | <Check size={18} /> |
<Note>
The table above lists popular models. You can also pass any available provider
model ID as a string if needed.
</Note>
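To make the "Max Values Per Call" column concrete, the sketch below splits a list of inputs into batches of at most 2048 values. `chunkValues` is a hypothetical helper written for illustration; the AI SDK's batch embedding helpers may already split requests for you, in which case manual batching like this is unnecessary.

```typescript
// Limit taken from the "Max Values Per Call" column above.
const MAX_VALUES_PER_CALL = 2048;

// Hypothetical helper: splits `values` into consecutive batches of at most
// `size` items, preserving order.
function chunkValues<T>(
  values: readonly T[],
  size: number = MAX_VALUES_PER_CALL,
): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < values.length; start += size) {
    batches.push(values.slice(start, start + size));
  }
  return batches;
}

const inputValues = Array.from({ length: 5000 }, (_, i) => `document ${i}`);
const batches = chunkValues(inputValues);
console.log(batches.map(batch => batch.length)); // [2048, 2048, 904]
```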
### Image Models
You can create image models using the `.image()` factory method. The Google Vertex provider supports both [Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) and [Gemini image models](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-image). For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).
#### Imagen Models
[Imagen models](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images) generate images using the Imagen on Vertex AI API.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: vertex.image('imagen-4.0-generate-001'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
});
```
Further configuration can be done using Google Vertex provider options. You can validate the provider options using the `GoogleVertexImageModelOptions` type.
```ts
import {
  vertex,
  type GoogleVertexImageModelOptions,
} from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: vertex.image('imagen-4.0-generate-001'),
  providerOptions: {
    vertex: {
      negativePrompt: 'pixelated, blurry, low-quality',
    } satisfies GoogleVertexImageModelOptions,
  },
  // ...
});
```
The following provider options are available:
- **negativePrompt** _string_
A description of what to discourage in the generated images.
- **personGeneration** `allow_adult` | `allow_all` | `dont_allow`
Whether to allow person generation. Defaults to `allow_adult`.
- **safetySetting** `block_low_and_above` | `block_medium_and_above` | `block_only_high` | `block_none`
Whether to block unsafe content. Defaults to `block_medium_and_above`.
- **addWatermark** _boolean_
Whether to add an invisible watermark to the generated images. Defaults to `true`.
- **storageUri** _string_
Cloud Storage URI to store the generated images.
<Note>
Imagen models do not support the `size` parameter. Use the `aspectRatio`
parameter instead.
</Note>
Additional information about the images can be retrieved from the Google Vertex provider metadata.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image, providerMetadata } = await generateImage({
  model: vertex.image('imagen-4.0-generate-001'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
});

console.log(
  `Revised prompt: ${providerMetadata.vertex.images[0].revisedPrompt}`,
);
```
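The example above indexes `images[0]` directly. If you prefer defensive access, a sketch like the following collects whatever revised prompts are present. `VertexImageMetadataSketch` and `getRevisedPrompts` are hypothetical names, the shape is inferred only from the property path used above, and whether `revisedPrompt` can actually be absent is an assumption.

```typescript
// Hypothetical minimal shape, based only on the property path
// providerMetadata.vertex.images[i].revisedPrompt used above.
interface VertexImageMetadataSketch {
  vertex?: { images?: Array<{ revisedPrompt?: string }> };
}

// Collects revised prompts, skipping images that do not carry one.
function getRevisedPrompts(
  metadata: VertexImageMetadataSketch | undefined,
): string[] {
  const images = metadata?.vertex?.images ?? [];
  return images
    .map(image => image.revisedPrompt)
    .filter((prompt): prompt is string => typeof prompt === 'string');
}

// Mock data for illustration only; real metadata comes from generateImage().
const mockMetadata: VertexImageMetadataSketch = {
  vertex: { images: [{ revisedPrompt: 'A neon-lit skyline at dusk' }, {}] },
};

console.log(getRevisedPrompts(mockMetadata)); // ['A neon-lit skyline at dusk']
console.log(getRevisedPrompts(undefined)); // []
```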
##### Image Editing
Google Vertex Imagen models support image editing through inpainting, outpainting, and other edit modes. Pass input images via `prompt.images` and optionally a mask via `prompt.mask`.
<Note>
Image editing is supported by `imagen-3.0-capability-001`. The
`imagen-4.0-generate-001` model does not currently support editing operations.
</Note>
###### Inpainting (Insert Objects)
Insert or replace objects in specific areas using a mask:
```ts
import {
  vertex,
  type GoogleVertexImageModelOptions,
} from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'node:fs';

const image = fs.readFileSync('./input-image.png');
const mask = fs.readFileSync('./mask.png'); // White = edit area

const { images } = await generateImage({
  model: vertex.image('imagen-3.0-capability-001'),
  prompt: {
    text: 'A sunlit indoor lounge area with a pool containing a flamingo',
    images: [image],
    mask,
  },
  providerOptions: {
    vertex: {
      edit: {
        baseSteps: 50,
        mode: 'EDIT_MODE_INPAINT_INSERTION',
        maskMode: 'MASK_MODE_USER_PROVIDED',
        maskDilation: 0.01,
      },
    } satisfies GoogleVertexImageModelOptions,
  },
});
```
###### Outpainting (Extend Image)
Extend an image beyond its original boundaries:
```ts
import {
  vertex,
  type GoogleVertexImageModelOptions,
} from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'node:fs';

const image = fs.readFileSync('./input-image.png');
const mask = fs.readFileSync('./outpaint-mask.png'); // White = extend area

const { images } = await generateImage({
  model: vertex.image('imagen-3.0-capability-001'),
  prompt: {
    text: 'Extend the scene with more of the forest background',
    images: [image],
    mask,
  },
  providerOptions: {
    vertex: {
      edit: {
        baseSteps: 50,
        mode: 'EDIT_MODE_OUTPAINT',
        maskMode: 'MASK_MODE_USER_PROVIDED',
      },
    } satisfies GoogleVertexImageModelOptions,
  },
});
```
###### Edit Provider Options
The following options are available under `providerOptions.vertex.edit`:
- **mode** - The edit mode to use:
  - `EDIT_MODE_INPAINT_INSERTION` - Insert objects into masked areas
  - `EDIT_MODE_INPAINT_REMOVAL` - Remove objects from masked areas
  - `EDIT_MODE_OUTPAINT` - Extend image beyond boundaries
  - `EDIT_MODE_CONTROLLED_EDITING` - Controlled editing
  - `EDIT_MODE_PRODUCT_IMAGE` - Product image editing
  - `EDIT_MODE_BGSWAP` - Background swap
- **baseSteps** _number_ - Number of sampling steps (35-75). Higher values improve quality but take longer.
- **maskMode** - How to interpret the mask:
  - `MASK_MODE_USER_PROVIDED` - Use the provided mask directly
  - `MASK_MODE_DEFAULT` - Default mask mode
  - `MASK_MODE_DETECTION_BOX` - Mask from detected bounding boxes
  - `MASK_MODE_CLOTHING_AREA` - Mask from clothing segmentation
  - `MASK_MODE_PARSED_PERSON` - Mask from person parsing
- **maskDilation** _number_ - Percentage, expressed as a fraction (0-1), by which to grow the mask. Recommended: 0.01.
<Note>
Input images must be provided as `Buffer`, `ArrayBuffer`, `Uint8Array`, or
base64-encoded strings. URL-based images are not supported for Google Vertex
image editing.
</Note>
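The numeric ranges listed for `baseSteps` and `maskDilation` can be checked locally before calling the API. This is a minimal sketch under stated assumptions: `EditOptionsSketch` and `validateEditOptions` are hypothetical names, and rejecting out-of-range values on the client is a design choice of this example, not documented provider behavior.

```typescript
// Hypothetical local mirror of the numeric `edit` options documented above.
interface EditOptionsSketch {
  mode?: string;
  maskMode?: string;
  baseSteps?: number;
  maskDilation?: number;
}

// Hypothetical helper: throws if a numeric option falls outside the
// documented range (baseSteps 35-75, maskDilation 0-1); NaN is rejected too.
function validateEditOptions(edit: EditOptionsSketch): EditOptionsSketch {
  const { baseSteps, maskDilation } = edit;
  if (baseSteps !== undefined && !(baseSteps >= 35 && baseSteps <= 75)) {
    throw new RangeError(`baseSteps must be between 35 and 75, got ${baseSteps}`);
  }
  if (maskDilation !== undefined && !(maskDilation >= 0 && maskDilation <= 1)) {
    throw new RangeError(`maskDilation must be between 0 and 1, got ${maskDilation}`);
  }
  return edit;
}

const inpaintEdit = validateEditOptions({
  mode: 'EDIT_MODE_INPAINT_INSERTION',
  maskMode: 'MASK_MODE_USER_PROVIDED',
  baseSteps: 50,
  maskDilation: 0.01,
});
```

The validated object could then be placed under `providerOptions.vertex.edit`.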
##### Imagen Model Capabilities
| Model | Aspect Ratios |
| ------------------------------- | ------------------------- |
| `imagen-3.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-3.0-generate-002` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-3.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-4.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-4.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| `imagen-4.0-ultra-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
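Because Imagen models take an `aspectRatio` rather than a `size`, it can help to map desired pixel dimensions to the nearest supported ratio. The sketch below does this for the ratios in the table above. `closestAspectRatio` is a hypothetical helper, and comparing ratios in log space (so that portrait and landscape are treated symmetrically) is a choice made for this example.

```typescript
// Ratios from the Imagen capabilities table above, with their numeric values.
const IMAGEN_ASPECT_RATIOS = [
  { label: '1:1', value: 1 / 1 },
  { label: '3:4', value: 3 / 4 },
  { label: '4:3', value: 4 / 3 },
  { label: '9:16', value: 9 / 16 },
  { label: '16:9', value: 16 / 9 },
] as const;

type ImagenAspectRatio = (typeof IMAGEN_ASPECT_RATIOS)[number]['label'];

// Hypothetical helper: returns the supported ratio closest to width / height.
function closestAspectRatio(width: number, height: number): ImagenAspectRatio {
  if (!(width > 0) || !(height > 0)) {
    throw new Error('width and height must be positive numbers');
  }
  const target = Math.log(width / height);
  let best: ImagenAspectRatio = '1:1';
  let bestDistance = Infinity;
  for (const ratio of IMAGEN_ASPECT_RATIOS) {
    const distance = Math.abs(Math.log(ratio.value) - target);
    if (distance < bestDistance) {
      best = ratio.label;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(closestAspectRatio(1920, 1080)); // '16:9'
console.log(closestAspectRatio(1080, 1920)); // '9:16'
console.log(closestAspectRatio(800, 600)); // '4:3'
```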
#### Gemini Image Models
[Gemini image models](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-image) (e.g. `gemini-2.5-flash-image`) are multimodal output language models that can be used with `generateImage()` for a simpler image generation experience. Internally, the provider calls the language model API with `responseModalities: ['IMAGE']`.
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: vertex.image('gemini-2.5-flash-image'),
  prompt: 'A photorealistic image of a cat wearing a wizard hat',
  aspectRatio: '1:1',
});
```
Gemini image models also support image editing by providing input images:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'node:fs';

const sourceImage = fs.readFileSync('./cat.png');

const { image } = await generateImage({
  model: vertex.image('gemini-2.5-flash-image'),
  prompt: {
    text: 'Add a small wizard hat to this cat',
    images: [sourceImage],
  },
});
```
You can also use URLs (including `gs://` Cloud Storage URIs) for input images:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: vertex.image('gemini-2.5-flash-image'),
  prompt: {
    text: 'Add a small wizard hat to this cat',
    images: ['https://example.com/cat.png'],
  },
});
```
<Note>
Gemini image models do not support the `size` or `n` parameters. Use
`aspectRatio` instead of `size`. Mask-based inpainting is also not supported.
</Note>
<Note>
Gemini image models are multimodal output models that can generate both text
and images. For more advanced use cases where you need both text and image
outputs, or want more control over the generation process, you can use them
directly with `generateText()`.
</Note>
##### Gemini Image Model Capabilities
| Model | Image Generation | Image Editing | Aspect Ratios |
| -------------------------------- | ------------------- | ------------------- | --------------------------------------------------- |
| `gemini-3.1-flash-image-preview` | <Check size={18} /> | <Check size={18} /> | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
| `gemini-3-pro-image-preview` | <Check size={18} /> | <Check size={18} /> | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
| `gemini-2.5-flash-image` | <Check size={18} /> | <Check size={18} /> | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
<Note>
`gemini-3-pro-image-preview` supports additional features including up to 14
reference images for editing (6 objects, 5 humans), resolution options (1K,
2K, 4K via `providerOptions.vertex.imageConfig.imageSize`), and Google Search
grounding.
</Note>
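As a rough illustration of the resolution option mentioned in the note, the sketch below builds a provider options object that sets only `vertex.imageConfig.imageSize`. `imageSizeProviderOptions` is a hypothetical helper, and the assumption that the accepted values are exactly the strings `'1K'`, `'2K'`, and `'4K'` is inferred from the note rather than confirmed here.

```typescript
// Assumption: the resolution labels from the note map directly to these strings.
type GeminiImageSize = '1K' | '2K' | '4K';

// Hypothetical helper: returns an object that sets only
// providerOptions.vertex.imageConfig.imageSize.
function imageSizeProviderOptions(imageSize: GeminiImageSize) {
  return { vertex: { imageConfig: { imageSize } } };
}

const sizeOptions = imageSizeProviderOptions('2K');
console.log(JSON.stringify(sizeOptions));
// {"vertex":{"imageConfig":{"imageSize":"2K"}}}
```

The result could be passed as `providerOptions` to `generateImage()` together with `vertex.image('gemini-3-pro-image-preview')`.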
### Video Models
You can create [Veo](https://cloud.google.com/vertex-ai/generative-ai/docs/video/overview) video models that call the Vertex AI API
using the `.video()` factory method. For more on video generation with the AI SDK see [generateVideo()](/docs/reference/ai-sdk-core/generate-video).
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { experimental_generateVideo as generateVideo } from 'ai';

const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt:
    'A pangolin curled on a mossy stone in a glowing bioluminescent forest',
  aspectRatio: '16:9',
});
```
You can configure resolution and duration:
```ts
import { vertex } from '@ai-sdk/google-vertex';
import { experimental_generateVideo as generateVideo } from 'ai';

const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt: 'A serene mountain landscape at sunset',
  aspectRatio: '16:9',
  resolution: '1920x1080',
  duration: 8,
});
```
#### Provider Options
Further configuration can be done using Google Vertex provider options. You can validate the provider options using the `GoogleVertexVideoModelOptions` type.
```ts
import {
  vertex,
  type GoogleVertexVideoModelOptions,
} from '@ai-sdk/google-vertex';
import { experimental_generateVideo as generateVideo } from 'ai';

const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt: 'A serene mountain landscape at sunset',
  aspectRatio: '16:9',
  providerOptions: {
    vertex: {
      generateAudio: true,
      personGeneration: 'allow_adult',
    } satisfies GoogleVertexVideoModelOptions,
  },
});
```
The following provider options are available:
- **generateAudio** _boolean_
Whether to generate audio along with the video.
- **personGeneration** `'dont_allow'` | `'allow_adult'` | `'allow_all'`
Whether to allow person generation in the video.
- **negativePrompt** _string_
A description of what to discourage in the generated video.
- **gcsOutputDirectory** _string_
Cloud Storage URI to store the generated videos.
- **referenceImages** _Array\<\{ bytesBase64Encoded?: string; gcsUri?: string \}\>_
Reference images for style or asset guidance.
- **pollIntervalMs** _number_
Polling interval in milliseconds for checking task status.
- **pollTimeoutMs** _number_
Maximum wait time in milliseconds for video generation.
<Note>
Video generation is an asynchronous process that can take several minutes. For
longer videos or higher resolutions, consider setting `pollTimeoutMs` to at
least 10 minutes (600000ms).
</Note>
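To make the polling arithmetic in the note concrete, the sketch below converts a timeout in minutes and an interval in seconds into `pollTimeoutMs` and `pollIntervalMs`, and estimates how many status checks fit in that window. `buildPollingOptions` is a hypothetical helper, and the 15-second interval is an arbitrary value chosen for illustration, not a recommended or default setting.

```typescript
// Hypothetical helper: derives the two polling options described above and
// reports roughly how many status checks fit into the timeout.
function buildPollingOptions(timeoutMinutes: number, intervalSeconds: number) {
  if (!(timeoutMinutes > 0) || !(intervalSeconds > 0)) {
    throw new Error('timeoutMinutes and intervalSeconds must be positive');
  }
  const pollTimeoutMs = timeoutMinutes * 60 * 1000;
  const pollIntervalMs = intervalSeconds * 1000;
  if (pollIntervalMs > pollTimeoutMs) {
    throw new Error('the polling interval must not exceed the timeout');
  }
  return {
    // Only these two fields are provider options.
    options: { pollIntervalMs, pollTimeoutMs },
    // Informational only; not a provider option.
    maxStatusChecks: Math.floor(pollTimeoutMs / pollIntervalMs),
  };
}

const polling = buildPollingOptions(10, 15);
console.log(polling.options); // { pollIntervalMs: 15000, pollTimeoutMs: 600000 }
console.log(polling.maxStatusChecks); // 40
```

Here `polling.options` could be spread into `providerOptions.vertex` alongside other video options.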
#### Model Capabilities
| Model | Audio Support |
| --------------------------- | ------------- |
| `veo-3.1-generate-001` | Yes |
| `veo-3.1-fast-generate-001` | Yes |
| `veo-3.0-generate-001` | Yes |
| `veo-3.0-fast-generate-001` | Yes |
| `veo-2.0-generate-001` | No |
<Note>
The table above lists popular models. You can also pass any available provider
model ID as a string if needed.
</Note>
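The audio column in the table above can be encoded as a small lookup so that `generateAudio` is only requested for models that list audio support. This is a sketch, not provider behavior: `resolveGenerateAudio` is a hypothetical helper, and passing the request through unchanged for model IDs not in the table is an assumption made for this example.

```typescript
// Audio support per model, copied from the table above.
const VEO_AUDIO_SUPPORT = new Map<string, boolean>([
  ['veo-3.1-generate-001', true],
  ['veo-3.1-fast-generate-001', true],
  ['veo-3.0-generate-001', true],
  ['veo-3.0-fast-generate-001', true],
  ['veo-2.0-generate-001', false],
]);

// Hypothetical helper: decides the value to use for `generateAudio`.
function resolveGenerateAudio(modelId: string, requested: boolean): boolean {
  const supported = VEO_AUDIO_SUPPORT.get(modelId);
  // Assumption: for models not listed above, leave the decision to the API.
  if (supported === undefined) return requested;
  return requested && supported;
}

console.log(resolveGenerateAudio('veo-3.1-generate-001', true)); // true
console.log(resolveGenerateAudio('veo-2.0-generate-001', true)); // false
```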
## Google Vertex Anthropic Provider Usage
The Google Vertex Anthropic provider for the [AI SDK](/docs) offers support for Anthropic's Claude models through the Google Vertex AI APIs. This section provides details on how to set up and use the Google Vertex Anthropic provider.
### Provider Instance
You can import the default provider instance `vertexAnthropic` from `@ai-sdk/google-vertex/anthropic`:
```typescript
import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic';
```
If you need a customized setup, you can import `createVertexAnthropic` from `@ai-sdk/google-vertex/anthropic` and create a provider instance with your settings:
```typescript
import { createVertexAnthropic } from '@ai-sdk/google-vertex/anthropic';

const vertexAnthropic = createVertexAnthropic({
  project: 'my-project', // optional
  location: 'us-central1', // optional
});
```
#### Node.js Runtime
For Node.js environments, the Google Vertex Anthropic provider supports all standard Google Cloud authentication options through the `google-auth-library`. You can customize the authentication options by passing them to the `createVertexAnthropic` function:
```typescript
import { createVertexAnthropic } from '@ai-sdk/google-vertex/anthropic';

const vertexAnthropic = createVertexAnthropic({
  googleAuthOptions: {
    credentials: {
      client_email: 'my-email',
      private_key: 'my-private-key',
    },
  },
});
```
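In practice the credential values are usually read from the environment rather than hard-coded. The following is a minimal sketch of that pattern: `credentialsFromEnv` is a hypothetical helper, the variable names `GOOGLE_CLIENT_EMAIL` and `GOOGLE_PRIVATE_KEY` are assumptions for this example, and the `\n` replacement reflects the common case where a private key is stored on a single line with escaped line breaks.

```typescript
interface ServiceAccountCredentials {
  client_email: string;
  private_key: string;
}

// Hypothetical helper: reads credentials from an environment-like record.
// The variable names used here are assumptions for illustration.
function credentialsFromEnv(
  env: Record<string, string | undefined>,
): ServiceAccountCredentials {
  const clientEmail = env.GOOGLE_CLIENT_EMAIL;
  const privateKey = env.GOOGLE_PRIVATE_KEY;
  if (!clientEmail || !privateKey) {
    throw new Error('Missing GOOGLE_CLIENT_EMAIL or GOOGLE_PRIVATE_KEY');
  }
  return {
    client_email: clientEmail,
    // Env files often store line breaks as the two characters "\n"; restore them.
    private_key: privateKey.replace(/\\n/g, '\n'),
  };
}

// Placeholder values for illustration only.
const googleAuthOptions = {
  credentials: credentialsFromEnv({
    GOOGLE_CLIENT_EMAIL: 'svc@my-project.iam.gserviceaccount.com',
    GOOGLE_PRIVATE_KEY:
      '-----BEGIN PRIVATE KEY-----\\nabc\\n-----END PRIVATE KEY-----\\n',
  }),
};
```

In a Node.js application you would typically call `credentialsFromEnv(process.env)` and pass the resulting `googleAuthOptions` to `createVertexAnthropic`.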
##### Optional Provider Settings
You can use the following optional settings to customize the Google Vertex Anthropic provider instance:
- **project** _string_
The Google Cloud project ID that you want to use for the API calls.
It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default.
- **location** _string_
The Google Cloud location that you want to use for the API calls, e.g. `us-central1`.
It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default.
- **googleAuthOptions** _object_
  Optional. The authentication options used by the [Google Auth Library](https://github.com/googleapis/google-auth-library-nodejs/). See also the [GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/08978822e1b7b5961f0e355df51d738e012be392/src/auth/googleauth.ts#L87C18-L87C35) interface.
  - **authClient** _object_
    An `AuthClient` to use.
  - **keyFilename** _string_
    Path to a .json, .pem, or .p12 key file.
  - **keyFile** _string_
    Path to a .json, .pem, or .p12 key file.
  - **credentials** _object_
    Object containing `client_email` and `private_key` properties, or the external account client options.
  - **clientOptions** _object_
    Options object passed to the constructor of the client.
  - **scopes** _string | string[]_
    Required scopes for the desired API request.
  - **projectId** _string_
    Your project ID.
  - **universeDomain** _string_
    The default service domain for a given Cloud universe.
- **headers** _Resolvable\<Record\<string, string | undefined\>\>_
Headers to include in the requests. Can be provided in multiple formats:
- A record of header key-value pairs: `Record<string, string | undefined>`
- A function that returns headers: `() => Record<string, string | undefined>`
- An async function that returns headers: `async () => Record<string, string | undefined>`
- A promise that