ai
Version:
AI SDK by Vercel - build apps like ChatGPT, Claude, Gemini, and more with a single interface for any model using the Vercel AI Gateway or go direct to OpenAI, Anthropic, Google, or any other model provider.
342 lines (268 loc) • 25.6 kB
text/mdx
---
title: Image Generation
description: Learn how to generate images with the AI SDK.
---
# Image Generation
The AI SDK provides the [`generateImage`](/docs/reference/ai-sdk-core/generate-image)
function to generate images based on a given prompt using an image model.
```tsx
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
});
```
You can access the image data using the `base64` or `uint8Array` properties:
```tsx
const base64 = image.base64; // base64 image data
const uint8Array = image.uint8Array; // Uint8Array image data
```
## Settings
### Size and Aspect Ratio
Depending on the model, you can either specify the size or the aspect ratio.
##### Size
The size is specified as a string in the format `{width}x{height}`.
Models only support a few sizes, and the supported sizes are different for each model and provider.
```tsx highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
size: '1024x1024',
});
```
##### Aspect Ratio
The aspect ratio is specified as a string in the format `{width}:{height}`.
Models only support a few aspect ratios, and the supported aspect ratios are different for each model and provider.
```tsx highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
aspectRatio: '16:9',
});
```
### Generating Multiple Images
`generateImage` also supports generating multiple images at once:
```tsx highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { images } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
n: 4, // number of images to generate
});
```
<Note>
`generateImage` will automatically call the model as often as needed (in
parallel) to generate the requested number of images.
</Note>
Each image model has an internal limit on how many images it can generate in a single API call. The AI SDK manages this automatically by batching requests appropriately when you request multiple images using the `n` parameter. By default, the SDK uses provider-documented limits (for example, DALL-E 3 can only generate 1 image per call, while DALL-E 2 supports up to 10).
If needed, you can override this behavior using the `maxImagesPerCall` setting when generating your image. This is particularly useful when working with new or custom models where the default batch size might not be optimal:
```tsx
const { images } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
maxImagesPerCall: 5, // Override the default batch size
n: 10, // Will make 2 calls of 5 images each
});
```
### Providing a Seed
You can provide a seed to the `generateImage` function to control the output of the image generation process.
If supported by the model, the same seed will always produce the same image.
```tsx highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
seed: 1234567890,
});
```
### Provider-specific Settings
Image models often have provider- or even model-specific settings.
You can pass such settings to the `generateImage` function
using the `providerOptions` parameter. The options for the provider
(`openai` in the example below) become request body properties.
```tsx highlight={"9"}
import { generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';
const { image } = await generateImage({
model: openai.image('dall-e-3'),
prompt: 'Santa Claus driving a Cadillac',
size: '1024x1024',
providerOptions: {
openai: { style: 'vivid', quality: 'hd' },
},
});
```
### Abort Signals and Timeouts
`generateImage` accepts an optional `abortSignal` parameter of
type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal)
that you can use to abort the image generation process or set a timeout.
```ts highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
abortSignal: AbortSignal.timeout(1000), // Abort after 1 second
});
```
### Custom Headers
`generateImage` accepts an optional `headers` parameter of type `Record<string, string>`
that you can use to add custom headers to the image generation request.
```ts highlight={"7"}
import { generateImage } from 'ai';
__PROVIDER_IMPORT__;
const { image } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
headers: { 'X-Custom-Header': 'custom-value' },
});
```
### Warnings
If the model returns warnings, e.g. for unsupported parameters, they will be available in the `warnings` property of the response.
```tsx
const { image, warnings } = await generateImage({
model: __IMAGE_MODEL__,
prompt: 'Santa Claus driving a Cadillac',
});
```
### Additional provider-specific meta data
Some providers expose additional meta data for the result overall or per image.
```tsx
const prompt = 'Santa Claus driving a Cadillac';
const { image, providerMetadata } = await generateImage({
model: openai.image('dall-e-3'),
prompt,
});
const revisedPrompt = providerMetadata.openai.images[0]?.revisedPrompt;
console.log({
prompt,
revisedPrompt,
});
```
The outer key of the returned `providerMetadata` is the provider name. The inner values are the metadata. An `images` key is always present in the metadata and is an array with the same length as the top level `images` key.
### Error Handling
When `generateImage` cannot generate a valid image, it throws a [`AI_NoImageGeneratedError`](/docs/reference/ai-sdk-errors/ai-no-image-generated-error).
This error occurs when the AI provider fails to generate an image. It can arise due to the following reasons:
- The model failed to generate a response
- The model generated a response that could not be parsed
The error preserves the following information to help you log the issue:
- `responses`: Metadata about the image model responses, including timestamp, model, and headers.
- `cause`: The cause of the error. You can use this for more detailed error handling
```ts
import { generateImage, NoImageGeneratedError } from 'ai';
try {
await generateImage({ model, prompt });
} catch (error) {
if (NoImageGeneratedError.isInstance(error)) {
console.log('NoImageGeneratedError');
console.log('Cause:', error.cause);
console.log('Responses:', error.responses);
}
}
```
## Image Middleware
You can enhance image models, e.g. to set default values or implement logging, using
`wrapImageModel` and `ImageModelV3Middleware`.
Here is an example that sets a default size when none is provided:
```ts
import { generateImage, wrapImageModel } from 'ai';
__PROVIDER_IMPORT__;
const model = wrapImageModel({
model: __IMAGE_MODEL__,
middleware: {
specificationVersion: 'v3',
transformParams: async ({ params }) => ({
...params,
size: params.size ?? '1024x1024',
}),
},
});
const { image } = await generateImage({
model,
prompt: 'Santa Claus driving a Cadillac',
});
```
## Generating Images with Language Models
Some language models such as Google `gemini-2.5-flash-image` support multi-modal outputs including images.
With such models, you can access the generated images using the `files` property of the response.
```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
const result = await generateText({
model: google('gemini-2.5-flash-image'),
prompt: 'Generate an image of a comic cat',
});
for (const file of result.files) {
if (file.mediaType.startsWith('image/')) {
// The file object provides multiple data formats:
// Access images as base64 string, Uint8Array binary data, or check type
// - file.base64: string (data URL format)
// - file.uint8Array: Uint8Array (binary data)
// - file.mediaType: string (e.g. "image/png")
}
}
```
## Image Models
| Provider | Model | Support sizes (`width x height`) or aspect ratios (`width : height`) |
| ------------------------------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [xAI Grok](/providers/ai-sdk-providers/xai#image-models) | `grok-imagine-image` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `2:1`, `1:2`, `19.5:9`, `9:19.5`, `20:9`, `9:20`, `auto` |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `gpt-image-1` | 1024x1024, 1536x1024, 1024x1536 |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 |
| [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-2` | 256x256, 512x512, 1024x1024 |
| [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#image-models) | `amazon.nova-canvas-v1:0` | 320-4096 (multiples of 16), 1:4 to 4:1, max 4.2M pixels |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/flux/dev` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/flux-lora` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/fast-sdxl` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/flux-pro/v1.1-ultra` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/ideogram/v2` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/recraft-v3` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/stable-diffusion-3.5-large` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/hyper-sdxl` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sd3.5` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1.1-pro` | 256-1440 (multiples of 32) |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1-schnell` | 256-1440 (multiples of 32) |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1-dev` | 256-1440 (multiples of 32) |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-pro` | 256-1440 (multiples of 32) |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sd3.5-medium` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 |
| [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sdxl-turbo` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 |
| [Replicate](/providers/ai-sdk-providers/replicate) | `black-forest-labs/flux-schnell` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
| [Replicate](/providers/ai-sdk-providers/replicate) | `recraft-ai/recraft-v3` | 1024x1024, 1365x1024, 1024x1365, 1536x1024, 1024x1536, 1820x1024, 1024x1820, 1024x2048, 2048x1024, 1434x1024, 1024x1434, 1024x1280, 1280x1024, 1024x1707, 1707x1024 |
| [Google](/providers/ai-sdk-providers/google-generative-ai#image-models) | `imagen-4.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google](/providers/ai-sdk-providers/google-generative-ai#image-models) | `imagen-4.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google](/providers/ai-sdk-providers/google-generative-ai#image-models) | `imagen-4.0-ultra-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-4.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-4.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-4.0-ultra-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-dev-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-schnell-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/playground-v2-5-1024px-aesthetic` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/japanese-stable-diffusion-xl` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/playground-v2-1024px-aesthetic` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/SSD-1B` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 |
| [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/stable-diffusion-xl-1024-v1-0` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 |
| [Luma](/providers/ai-sdk-providers/luma#image-models) | `photon-1` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Luma](/providers/ai-sdk-providers/luma#image-models) | `photon-flash-1` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `stabilityai/stable-diffusion-xl-base-1.0` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-dev` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-dev-lora` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-schnell` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-canny` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-depth` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-redux` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1.1-pro` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-pro` | 512x512, 768x768, 1024x1024 |
| [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-schnell-Free` | 512x512, 768x768, 1024x1024 |
| [Black Forest Labs](/providers/ai-sdk-providers/black-forest-labs#image-models) | `flux-kontext-pro` | From 3:7 (portrait) to 7:3 (landscape) |
| [Black Forest Labs](/providers/ai-sdk-providers/black-forest-labs#image-models) | `flux-kontext-max` | From 3:7 (portrait) to 7:3 (landscape) |
| [Black Forest Labs](/providers/ai-sdk-providers/black-forest-labs#image-models) | `flux-pro-1.1-ultra` | From 3:7 (portrait) to 7:3 (landscape) |
| [Black Forest Labs](/providers/ai-sdk-providers/black-forest-labs#image-models) | `flux-pro-1.1` | From 3:7 (portrait) to 7:3 (landscape) |
| [Black Forest Labs](/providers/ai-sdk-providers/black-forest-labs#image-models) | `flux-pro-1.0-fill` | From 3:7 (portrait) to 7:3 (landscape) |
Above are a small subset of the image models supported by the AI SDK providers. For more, see the respective provider documentation.