@mastra/core
Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.
# CompositeVoice
The CompositeVoice class lets you combine different voice providers for text-to-speech and speech-to-text operations. This is useful when you want the best provider for each operation, for example OpenAI for speech-to-text and PlayAI for text-to-speech.
CompositeVoice supports both Mastra voice providers and AI SDK model providers.
## Constructor parameters
**config** (`object`): Configuration object for the composite voice service
**config.input** (`MastraVoice | TranscriptionModel`): Voice provider or AI SDK transcription model to use for speech-to-text operations. AI SDK models are automatically wrapped.
**config.output** (`MastraVoice | SpeechModel`): Voice provider or AI SDK speech model to use for text-to-speech operations. AI SDK models are automatically wrapped.
**config.realtime** (`MastraVoice`): Voice provider to use for real-time speech-to-speech operations
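To illustrate how the `input` and `output` slots divide the work, here is a minimal sketch of the delegation pattern. It is not Mastra's implementation: `CompositeSketch`, `SttProvider`, and `TtsProvider` are hypothetical names, and the stub providers return fake data instead of calling a real service.

```typescript
// Hypothetical, simplified provider shapes (the real MastraVoice interface is larger)
interface SttProvider {
  listen(audio: Uint8Array): Promise<string>
}
interface TtsProvider {
  speak(text: string): Promise<Uint8Array>
}

// Illustrative stand-in for a composite: each operation is routed to one slot
class CompositeSketch {
  private input?: SttProvider
  private output?: TtsProvider

  constructor(config: { input?: SttProvider; output?: TtsProvider }) {
    this.input = config.input
    this.output = config.output
  }

  // Speech-to-text goes to the provider in the `input` slot
  async listen(audio: Uint8Array): Promise<string> {
    if (!this.input) throw new Error('No listening provider configured')
    return this.input.listen(audio)
  }

  // Text-to-speech goes to the provider in the `output` slot
  async speak(text: string): Promise<Uint8Array> {
    if (!this.output) throw new Error('No speaking provider configured')
    return this.output.speak(text)
  }
}

// Stub providers that return fake data
const stubStt: SttProvider = { listen: async () => 'transcript from provider A' }
const stubTts: TtsProvider = { speak: async (text) => Uint8Array.from([text.length]) }

const composite = new CompositeSketch({ input: stubStt, output: stubTts })
const listenOnly = new CompositeSketch({ input: stubStt })
```

In this sketch, calling `listenOnly.speak()` rejects because no provider fills the `output` slot, which mirrors the behavior described for the methods below.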
## Methods
### `speak()`
Converts text to speech using the configured speaking provider.
**input** (`string | NodeJS.ReadableStream`): Text, or a stream of text, to convert to speech
**options** (`object`): Provider-specific options passed to the speaking provider
Notes:
- If no speaking provider is configured, this method will throw an error
- Options are passed through to the configured speaking provider
- Returns a stream of audio data
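Because `speak()` returns a stream, callers often need to collect it before saving or sending the audio. The helper below is a sketch that assumes the returned value is a Node.js readable stream. `streamToBuffer` is a hypothetical name, and `fakeAudio` stands in for a real provider response.

```typescript
import { Readable } from 'node:stream'

// Collect every chunk of a readable stream into a single Buffer
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
  const chunks: Buffer[] = []
  for await (const chunk of stream) {
    // Chunks arrive as strings or Buffers depending on the stream's encoding
    chunks.push(typeof chunk === 'string' ? Buffer.from(chunk) : chunk)
  }
  return Buffer.concat(chunks)
}

// Stand-in for the stream returned by voice.speak(); real bytes would come from the provider
const fakeAudio = Readable.from([Buffer.from([1, 2]), Buffer.from([3])])
const collected = streamToBuffer(fakeAudio)
```

The resulting Buffer can then be written to disk or passed to an audio player. For long outputs, piping the stream directly avoids holding the whole clip in memory.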
### `listen()`
Converts speech to text using the configured listening provider.
**audioStream** (`NodeJS.ReadableStream`): Audio stream to convert to text
**options** (`object`): Provider-specific options passed to the listening provider
Notes:
- If no listening provider is configured, this method will throw an error
- Options are passed through to the configured listening provider
- Returns either a string or a stream of transcribed text, depending on the provider
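Since the return type depends on the provider, calling code may want to normalize it. The sketch below handles both cases and always resolves to a string. `toText` is a hypothetical helper, not part of Mastra, and it assumes any streamed transcript is UTF-8 text.

```typescript
import { Readable } from 'node:stream'

// Accept either form of listen() result and resolve to a plain string
async function toText(result: string | NodeJS.ReadableStream): Promise<string> {
  if (typeof result === 'string') return result
  const chunks: Buffer[] = []
  for await (const chunk of result) {
    chunks.push(typeof chunk === 'string' ? Buffer.from(chunk, 'utf8') : chunk)
  }
  // Decode once at the end so multi-byte characters split across chunks stay intact
  return Buffer.concat(chunks).toString('utf8')
}

// Both provider styles, simulated with fake data
const fromString = toText('hello')
const fromStream = toText(Readable.from(['hel', 'lo']))
```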
### `getSpeakers()`
Returns a list of available voices from the speaking provider, where each voice object contains:
**voiceId** (`string`): Unique identifier for the voice
**key** (`value`): Additional voice properties that vary by provider (e.g., name, language)
Notes:
- Returns voices from the speaking provider only
- If no speaking provider is configured, returns an empty array
- Each voice object will have at least a voiceId property
- Additional voice properties depend on the speaking provider
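Given that only `voiceId` is guaranteed, code that selects a voice should treat every other property as optional. The following sketch shows one way to do that. The `Speaker` type, the `findSpeaker` helper, and the sample data are hypothetical, and `language` is only an example of a provider-specific property.

```typescript
// voiceId is always present; everything else varies by provider
type Speaker = { voiceId: string; [key: string]: unknown }

// Return the first voice whose property matches, or undefined when none match
function findSpeaker(speakers: Speaker[], key: string, value: unknown): Speaker | undefined {
  return speakers.find((speaker) => speaker[key] === value)
}

// Hypothetical data standing in for the result of `await voice.getSpeakers()`
const speakers: Speaker[] = [
  { voiceId: 'voice-a', language: 'en' },
  { voiceId: 'voice-b', language: 'de' },
]

const german = findSpeaker(speakers, 'language', 'de')
// An empty list (no speaking provider configured) yields undefined
const missing = findSpeaker([], 'language', 'de')
```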
## Usage examples
### Using Mastra Voice Providers
```typescript
import { createReadStream } from 'node:fs'
import { CompositeVoice } from '@mastra/core/voice'
import { OpenAIVoice } from '@mastra/voice-openai'
import { PlayAIVoice } from '@mastra/voice-playai'

// Create voice providers
const openai = new OpenAIVoice()
const playai = new PlayAIVoice()

// Use OpenAI for listening (speech-to-text) and PlayAI for speaking (text-to-speech)
const voice = new CompositeVoice({
  input: openai,
  output: playai,
})

// Audio to transcribe; the file path is a placeholder for your own recording
const audioStream = createReadStream('./audio.m4a')

// Convert speech to text using OpenAI
const text = await voice.listen(audioStream)

// Convert text to speech using PlayAI
const audio = await voice.speak('Hello, world!')
```
### Using AI SDK Model Providers
You can pass AI SDK transcription and speech models directly to CompositeVoice:
```typescript
import { createReadStream } from 'node:fs'
import { CompositeVoice } from '@mastra/core/voice'
import { openai } from '@ai-sdk/openai'
import { elevenlabs } from '@ai-sdk/elevenlabs'

// Use AI SDK models directly - they will be auto-wrapped
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
})

// Audio to transcribe; the file path is a placeholder for your own recording
const audioStream = createReadStream('./audio.m4a')

// Works the same way as with Mastra providers
const text = await voice.listen(audioStream)
const audio = await voice.speak('Hello from AI SDK!')
```
### Mix and Match
You can combine Mastra providers with AI SDK models:
```typescript
import { CompositeVoice } from '@mastra/core/voice'
import { PlayAIVoice } from '@mastra/voice-playai'
import { groq } from '@ai-sdk/groq'

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'), // AI SDK for STT
  output: new PlayAIVoice(), // Mastra for TTS
})
```