@copilotkit/runtime
Version:
<img src="https://github.com/user-attachments/assets/0a6b64d9-e193-4940-a3f6-60334ac34084" alt="banner" style="border-radius: 12px; border: 2px solid #d6d4fa;" />
288 lines (224 loc) • 7.74 kB
Markdown
# CopilotKit Transcription
Subclass `TranscriptionService`, pass an instance to `CopilotRuntime({ transcriptionService })`,
and the `POST /transcribe` endpoint lights up. The service has a single method,
`transcribeFile`, that returns the transcript as a plain string.
## Setup
```typescript
import {
CopilotRuntime,
createCopilotRuntimeHandler,
TranscriptionService,
type TranscribeFileOptions,
} from "@copilotkit/runtime/v2";
import OpenAI from "openai";
class OpenAIWhisperTranscription extends TranscriptionService {
private client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async transcribeFile({ audioFile }: TranscribeFileOptions): Promise<string> {
const result = await this.client.audio.transcriptions.create({
file: audioFile,
model: "whisper-1",
});
return result.text;
}
}
const runtime = new CopilotRuntime({
agents: {
/* ... */
} as any,
transcriptionService: new OpenAIWhisperTranscription(),
});
const handler = createCopilotRuntimeHandler({
runtime,
basePath: "/api/copilotkit",
});
export default { fetch: handler };
```
## Core Patterns
### Abstract contract
```typescript
// packages/runtime/src/v2/runtime/transcription-service/transcription-service.ts
export interface TranscribeFileOptions {
audioFile: File;
mimeType?: string;
size?: number;
}
export abstract class TranscriptionService {
abstract transcribeFile(options: TranscribeFileOptions): Promise<string>;
}
```
### Supported request shapes
Multipart (REST mode):
```typescript
const form = new FormData();
form.append("audio", blob, "recording.webm");
await fetch("/api/copilotkit/transcribe", { method: "POST", body: form });
```
JSON (works in both multi-route and single-endpoint modes — dispatch is by
`Content-Type: application/json`; `mimeType` is required in the payload):
```typescript
await fetch("/api/copilotkit/transcribe", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
audio: base64String,
mimeType: "audio/webm",
filename: "recording.webm", // optional
}),
});
```
### Reject oversize audio with a graceful 400
```typescript
class OpenAIWhisperTranscription extends TranscriptionService {
private client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async transcribeFile({
audioFile,
size,
}: TranscribeFileOptions): Promise<string> {
const max = 25 * 1024 * 1024; // 25 MB
if ((size ?? audioFile.size) > max) {
// "too long" keyword → audio_too_long response
throw new Error("Audio duration too long — max 25MB per upload");
}
const result = await this.client.audio.transcriptions.create({
file: audioFile,
model: "whisper-1",
});
return result.text;
}
}
```
### Error auto-categorization
The runtime inspects `String(error).toLowerCase()` thrown by your service and maps keywords
to error codes. Let the provider error bubble up — do not re-categorize inside the service.
| Keyword substrings | Maps to |
| ---------------------------------------- | -------------------------------- |
| `rate`, `429`, `too many` | `rate_limited` (retryable) |
| `auth`, `401`, `api key`, `unauthorized` | `auth_failed` (not retryable) |
| `too long`, `duration`, `length` | `audio_too_long` (not retryable) |
| (anything else) | `provider_error` (retryable) |
Full error-code enum:
```typescript
// packages/shared/src/transcription-errors.ts
export enum TranscriptionErrorCode {
SERVICE_NOT_CONFIGURED = "service_not_configured",
INVALID_AUDIO_FORMAT = "invalid_audio_format",
AUDIO_TOO_LONG = "audio_too_long",
AUDIO_TOO_SHORT = "audio_too_short",
RATE_LIMITED = "rate_limited",
AUTH_FAILED = "auth_failed",
PROVIDER_ERROR = "provider_error",
NETWORK_ERROR = "network_error",
INVALID_REQUEST = "invalid_request",
}
```
## Common Mistakes
### HIGH Calling /transcribe without configuring transcriptionService
Wrong:
```typescript
new CopilotRuntime({ agents });
// client calls /api/copilotkit/transcribe → 503
```
Correct:
```typescript
new CopilotRuntime({
agents,
transcriptionService: new MyWhisperService(),
});
```
Unconfigured runtime returns HTTP 503 with
`{ error: "service_not_configured" }`. The frontend gets no transcript with no obvious
server-side failure.
Source: `packages/runtime/src/v2/runtime/handlers/handle-transcribe.ts:203-207`.
### MEDIUM Form field named "file" instead of "audio"
Wrong:
```typescript
const form = new FormData();
form.append("file", blob, "recording.webm");
await fetch("/api/copilotkit/transcribe", { method: "POST", body: form });
```
Correct:
```typescript
const form = new FormData();
form.append("audio", blob, "recording.webm");
await fetch("/api/copilotkit/transcribe", { method: "POST", body: form });
```
The handler reads `formData.get("audio")` — any other field name yields `null` and returns
`invalid_request`.
Source: `packages/runtime/src/v2/runtime/handlers/handle-transcribe.ts:91-97`.
### MEDIUM Base64 payload missing mimeType
Wrong:
```typescript
await fetch("/api/copilotkit/transcribe", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ audio: b64 }),
});
```
Correct:
```typescript
await fetch("/api/copilotkit/transcribe", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ audio: b64, mimeType: "audio/webm" }),
});
```
JSON mode requires `mimeType` — the handler explicitly rejects payloads missing it with
`invalid_request`.
Source: `packages/runtime/src/v2/runtime/handlers/handle-transcribe.ts:131-136`.
### LOW Re-categorizing errors inside the service
Wrong:
```typescript
class MyService extends TranscriptionService {
async transcribeFile(opts: TranscribeFileOptions): Promise<string> {
try {
return await doTranscribe(opts);
} catch (e) {
// trying to hand-pick error codes
throw new Error("RATE_LIMITED");
}
}
}
```
Correct:
```typescript
class MyService extends TranscriptionService {
async transcribeFile(opts: TranscribeFileOptions): Promise<string> {
return doTranscribe(opts); // let provider errors bubble up verbatim
}
}
```
The runtime scans `String(error).toLowerCase()` for `"rate"`, `"429"`, `"auth"`, `"too long"`
etc. Provider-native messages (`"OpenAI returned 429 rate limited"`) auto-map to the right
code. Hand-crafted codes bypass the keyword matcher and end up as `provider_error`.
Source: `packages/runtime/src/v2/runtime/handlers/handle-transcribe.ts:160-196`.
### MEDIUM Returning a rich object instead of a string
Wrong:
```typescript
class MyService extends TranscriptionService {
async transcribeFile(opts: TranscribeFileOptions): Promise<string> {
// @ts-expect-error returning the wrong shape
return {
text: "hi",
segments: [
/* ... */
],
};
}
}
```
Correct:
```typescript
class MyService extends TranscriptionService {
async transcribeFile(opts: TranscribeFileOptions): Promise<string> {
const result = await provider.transcribe(opts.audioFile);
return result.text;
}
}
```
`transcribeFile` returns `Promise<string>`. The handler sends
`{ transcription: string }` back to the client — any other shape is a TypeScript error and
would be JSON-stringified wrongly at runtime.
Source: `packages/runtime/src/v2/runtime/transcription-service/transcription-service.ts:9-11`.
## See also
- `copilotkit/setup-endpoint` — `/transcribe` is one of the routes the handler mounts
- `copilotkit/debug-and-troubleshoot` — `TranscriptionErrorCode` catalog