---
title: DeepInfra
description: Learn how to use DeepInfra's models with the AI SDK.
---
# DeepInfra Provider
The [DeepInfra](https://deepinfra.com) provider contains support for state-of-the-art models through the DeepInfra API, including Llama 3, Mixtral, Qwen, and many other popular open-source models.
## Setup
The DeepInfra provider is available via the `@ai-sdk/deepinfra` module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
<Tab>
<Snippet text="pnpm add @ai-sdk/deepinfra" dark />
</Tab>
<Tab>
<Snippet text="npm install @ai-sdk/deepinfra" dark />
</Tab>
<Tab>
<Snippet text="yarn add @ai-sdk/deepinfra" dark />
</Tab>
<Tab>
<Snippet text="bun add @ai-sdk/deepinfra" dark />
</Tab>
</Tabs>
## Provider Instance
You can import the default provider instance `deepinfra` from `@ai-sdk/deepinfra`:
```ts
import { deepinfra } from '@ai-sdk/deepinfra';
```
If you need a customized setup, you can import `createDeepInfra` from `@ai-sdk/deepinfra` and create a provider instance with your settings:
```ts
import { createDeepInfra } from '@ai-sdk/deepinfra';

const deepinfra = createDeepInfra({
  apiKey: process.env.DEEPINFRA_API_KEY ?? '',
});
```
You can use the following optional settings to customize the DeepInfra provider instance:
- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://api.deepinfra.com/v1`.

  Note: Language models and embeddings use OpenAI-compatible endpoints at `{baseURL}/openai`,
  while image models use `{baseURL}/inference`.

- **apiKey** _string_

  API key that is sent in the `Authorization` header. It defaults to
  the `DEEPINFRA_API_KEY` environment variable.

- **headers** _Record&lt;string,string&gt;_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise&lt;Response&gt;_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
  Defaults to the global `fetch` function.
  You can use it as a middleware to intercept requests,
  or to provide a custom fetch implementation, e.g. for testing.
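As an illustration of the `fetch` setting used as middleware, the sketch below wraps a fetch implementation and logs each request before delegating to it. `createLoggingFetch` and `FetchFn` are hypothetical names introduced here, not part of `@ai-sdk/deepinfra`; only the `fetch` option itself comes from the settings above.

```typescript
// Hypothetical helper (not part of @ai-sdk/deepinfra): wraps a fetch
// implementation and logs the method and URL of each request.
type FetchFn = typeof globalThis.fetch;

function createLoggingFetch(
  baseFetch: FetchFn,
  log: (message: string) => void,
): FetchFn {
  return async (input, init) => {
    // `input` may be a string, a URL, or a Request object.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    log(`${init?.method ?? 'GET'} ${url}`);
    // Delegate to the wrapped implementation unchanged.
    return baseFetch(input, init);
  };
}

// Usage (assumes `createDeepInfra` is imported from '@ai-sdk/deepinfra'):
// const deepinfra = createDeepInfra({
//   fetch: createLoggingFetch(globalThis.fetch, console.log),
// });
```

The same pattern can be used in tests by passing a stub in place of `globalThis.fetch`, so no request leaves the process.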
## Language Models
You can create language models using a provider instance. The first argument is the model ID, for example:
```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateText } from 'ai';

const { text } = await generateText({
  model: deepinfra('meta-llama/Meta-Llama-3.1-70B-Instruct'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```
DeepInfra language models can also be used in the `streamText` function (see [AI SDK Core](/docs/ai-sdk-core)).
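For streaming, the result of `streamText` exposes a `textStream` that can be consumed as an async iterable. The sketch below shows one way to read such a stream chunk by chunk; `collectText` is a hypothetical helper written for this example, and the `streamText` call is shown only in comments so the block runs without network access.

```typescript
// Hypothetical helper: consumes any async iterable of text chunks, such as
// the `textStream` of a `streamText` result, and returns the full text.
async function collectText(
  chunks: AsyncIterable<string>,
  onChunk?: (chunk: string) => void,
): Promise<string> {
  let full = '';
  for await (const chunk of chunks) {
    onChunk?.(chunk); // e.g. print the chunk as soon as it arrives
    full += chunk;
  }
  return full;
}

// Usage (assumes `streamText` from 'ai' and `deepinfra` from '@ai-sdk/deepinfra'):
// const result = streamText({
//   model: deepinfra('meta-llama/Meta-Llama-3.1-70B-Instruct'),
//   prompt: 'Write a vegetarian lasagna recipe for 4 people.',
// });
// const text = await collectText(result.textStream, (c) => process.stdout.write(c));
```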
## Model Capabilities
| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
| --------------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8` | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `meta-llama/Llama-4-Scout-17B-16E-Instruct` | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `meta-llama/Llama-3.3-70B-Instruct-Turbo` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Llama-3.3-70B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Meta-Llama-3.1-405B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Meta-Llama-3.1-70B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `meta-llama/Meta-Llama-3.1-8B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `meta-llama/Llama-3.2-11B-Vision-Instruct` | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `meta-llama/Llama-3.2-90B-Vision-Instruct` | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `mistralai/Mixtral-8x7B-Instruct-v0.1` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `deepseek-ai/DeepSeek-V3` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `deepseek-ai/DeepSeek-R1` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `deepseek-ai/DeepSeek-R1-Turbo` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `nvidia/Llama-3.1-Nemotron-70B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `Qwen/Qwen2-7B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `Qwen/Qwen2.5-72B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `Qwen/Qwen2.5-Coder-32B-Instruct` | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `Qwen/QwQ-32B-Preview` | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `google/codegemma-7b-it` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `google/gemma-2-9b-it` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| `microsoft/WizardLM-2-8x22B` | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
<Note>
The table above lists popular models. Please see the [DeepInfra
docs](https://deepinfra.com) for a full list of available models. You can also
pass any available provider model ID as a string if needed.
</Note>
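Because capabilities differ between models, an application may want to check them before enabling a feature such as tool calling. The following is a minimal sketch that copies three rows from the table above into a lookup; the `Capabilities` type, the `capabilities` map, and `supports` are hypothetical and are not provided by the SDK.

```typescript
// Hypothetical capability lookup; the entries are copied from the table above.
interface Capabilities {
  imageInput: boolean;
  objectGeneration: boolean;
  toolUsage: boolean;
  toolStreaming: boolean;
}

const capabilities: Record<string, Capabilities> = {
  'meta-llama/Llama-3.3-70B-Instruct': {
    imageInput: false,
    objectGeneration: true,
    toolUsage: true,
    toolStreaming: true,
  },
  'meta-llama/Llama-3.2-11B-Vision-Instruct': {
    imageInput: true,
    objectGeneration: true,
    toolUsage: false,
    toolStreaming: false,
  },
  'deepseek-ai/DeepSeek-R1': {
    imageInput: false,
    objectGeneration: false,
    toolUsage: false,
    toolStreaming: false,
  },
};

// Returns undefined for models that are not in the local lookup, since the
// table lists only a selection of the models DeepInfra offers.
function supports(
  modelId: string,
  feature: keyof Capabilities,
): boolean | undefined {
  return capabilities[modelId]?.[feature];
}
```

Returning `undefined` rather than `false` for unlisted models keeps "not supported" distinct from "not known to this lookup".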
## Image Models
You can create DeepInfra image models using the `.image()` factory method.
For more on image generation with the AI SDK, see [generateImage()](/docs/reference/ai-sdk-core/generate-image).
```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: deepinfra.image('stabilityai/sd3.5'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
});
```
<Note>
Support for the `size` and `aspectRatio` parameters varies by model. Please
check the individual model documentation on [DeepInfra's models
page](https://deepinfra.com/models/text-to-image) for supported options and
additional parameters.
</Note>
### Model-specific options
You can pass model-specific parameters using the `providerOptions.deepinfra` field:
```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: deepinfra.image('stabilityai/sd3.5'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
  providerOptions: {
    deepinfra: {
      num_inference_steps: 30, // Control the number of denoising steps (1-50)
    },
  },
});
```
### Image Editing
DeepInfra supports image editing through models like `Qwen/Qwen-Image-Edit`. Pass input images via `prompt.images` to transform or edit existing images.
#### Basic Image Editing
Transform an existing image using text prompts:
```ts
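import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
import { readFileSync } from 'node:fs';

// Load the source image from disk as a Buffer.
const imageBuffer = readFileSync('./input-image.png');

const { images } = await generateImage({
  model: deepinfra.image('Qwen/Qwen-Image-Edit'),
  prompt: {
    text: 'Turn the cat into a golden retriever dog',
    images: [imageBuffer],
  },
  size: '1024x1024',
});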
```
#### Inpainting with Mask
Edit specific parts of an image using a mask. Transparent areas in the mask indicate where the image should be edited:
```ts
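import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
import { readFileSync } from 'node:fs';

// Load the source image and the mask marking the region to edit.
const image = readFileSync('./input-image.png');
const mask = readFileSync('./mask.png');

const { images } = await generateImage({
  model: deepinfra.image('Qwen/Qwen-Image-Edit'),
  prompt: {
    text: 'A sunlit indoor lounge area with a pool containing a flamingo',
    images: [image],
    mask: mask,
  },
});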
```
#### Multi-Image Combining
Combine multiple reference images into a single output:
```ts
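import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
import { readFileSync } from 'node:fs';

// Load both reference images from disk.
const cat = readFileSync('./cat.png');
const dog = readFileSync('./dog.png');

const { images } = await generateImage({
  model: deepinfra.image('Qwen/Qwen-Image-Edit'),
  prompt: {
    text: 'Create a scene with both animals together, playing as friends',
    images: [cat, dog],
  },
});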
```
<Note>
Input images can be provided as `Buffer`, `ArrayBuffer`, `Uint8Array`, or
base64-encoded strings. DeepInfra uses an OpenAI-compatible image editing API
at `https://api.deepinfra.com/v1/openai/images/edits`.
</Note>
### Model Capabilities
For models supporting aspect ratios, the following ratios are typically supported:
`1:1 (default), 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21`
For models supporting size parameters, dimensions must typically be:
- Multiples of 32
- Width and height between 256 and 1440 pixels
- Default size is 1024x1024
| Model | Dimensions Specification | Notes |
| ---------------------------------- | ------------------------ | -------------------------------------------------------- |
| `stabilityai/sd3.5` | Aspect Ratio | Premium quality base model, 8B parameters |
| `black-forest-labs/FLUX-1.1-pro` | Size | Latest state-of-the-art model with superior prompt following |
| `black-forest-labs/FLUX-1-schnell` | Size | Fast generation in 1-4 steps |
| `black-forest-labs/FLUX-1-dev` | Size | Optimized for anatomical accuracy |
| `black-forest-labs/FLUX-pro` | Size | Flagship Flux model |
| `stabilityai/sd3.5-medium` | Aspect Ratio | Balanced 2.5B parameter model |
| `stabilityai/sdxl-turbo` | Aspect Ratio | Optimized for fast generation |
For more details and pricing information, see the [DeepInfra text-to-image models page](https://deepinfra.com/models/text-to-image).
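The typical size constraints listed above (multiples of 32, each dimension between 256 and 1440 pixels) can be checked before a request is sent. The sketch below is a hypothetical client-side validator, `isTypicalSize`, that encodes only those stated rules; individual models may accept other values, so treat it as a pre-check under those assumptions rather than the provider's own validation.

```typescript
// Hypothetical pre-check for the typical size rules described above:
// "WIDTHxHEIGHT", each dimension a multiple of 32 between 256 and 1440.
function isTypicalSize(size: string): boolean {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return false; // not in WIDTHxHEIGHT form
  const dimensions = [Number(match[1]), Number(match[2])];
  return dimensions.every(
    (value) => value >= 256 && value <= 1440 && value % 32 === 0,
  );
}

// Usage: fall back to the default size when the requested one is out of range.
const requested = '1000x1024';
const size = isTypicalSize(requested) ? requested : '1024x1024';
```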
## Embedding Models
You can create DeepInfra embedding models using the `.embedding()` factory method.
For more on embedding models with the AI SDK, see [embed()](/docs/reference/ai-sdk-core/embed).
```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { embed } from 'ai';

const { embedding } = await embed({
  model: deepinfra.embedding('BAAI/bge-large-en-v1.5'),
  value: 'sunny day at the beach',
});
```
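A common next step is comparing two embeddings. The sketch below implements cosine similarity locally so that it runs without any dependencies; the `cosine` function is written for this example, and the vectors are made-up placeholders rather than real model output. The `ai` package also exports a `cosineSimilarity` helper that can be used instead.

```typescript
// Illustrative cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Placeholder vectors; real embeddings from `embed` have 384 to 1024 dimensions
// depending on the model.
const similarity = cosine([0.1, 0.3, 0.5], [0.2, 0.6, 1.0]);
```

Vectors can only be compared when they come from the same model, since dimensions differ between models.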
### Model Capabilities
| Model | Dimensions | Max Tokens |
| ----------------------------------------------------- | ---------- | ---------- |
| `BAAI/bge-base-en-v1.5` | 768 | 512 |
| `BAAI/bge-large-en-v1.5` | 1024 | 512 |
| `BAAI/bge-m3` | 1024 | 8192 |
| `intfloat/e5-base-v2` | 768 | 512 |
| `intfloat/e5-large-v2` | 1024 | 512 |
| `intfloat/multilingual-e5-large` | 1024 | 512 |
| `sentence-transformers/all-MiniLM-L12-v2` | 384 | 256 |
| `sentence-transformers/all-MiniLM-L6-v2` | 384 | 256 |
| `sentence-transformers/all-mpnet-base-v2` | 768 | 384 |
| `sentence-transformers/clip-ViT-B-32` | 512 | 77 |
| `sentence-transformers/clip-ViT-B-32-multilingual-v1` | 512 | 77 |
| `sentence-transformers/multi-qa-mpnet-base-dot-v1` | 768 | 512 |
| `sentence-transformers/paraphrase-MiniLM-L6-v2` | 384 | 128 |
| `shibing624/text2vec-base-chinese` | 768 | 512 |
| `thenlper/gte-base` | 768 | 512 |
| `thenlper/gte-large` | 1024 | 512 |
<Note>
For a complete list of available embedding models, see the [DeepInfra
embeddings page](https://deepinfra.com/models/embeddings).
</Note>