@mastra/core

# CompositeVoice

The CompositeVoice class allows you to combine different voice providers for text-to-speech and speech-to-text operations. This is particularly useful when you want to use the best provider for each operation - for example, using OpenAI for speech-to-text and PlayAI for text-to-speech.

CompositeVoice supports both Mastra voice providers and AI SDK model providers.

## Constructor parameters

**config** (`object`): Configuration object for the composite voice service

**config.input** (`MastraVoice | TranscriptionModel`): Voice provider or AI SDK transcription model to use for speech-to-text operations. AI SDK models are automatically wrapped.

**config.output** (`MastraVoice | SpeechModel`): Voice provider or AI SDK speech model to use for text-to-speech operations. AI SDK models are automatically wrapped.

**config.realtime** (`MastraVoice`): Voice provider to use for real-time speech-to-speech operations

## Methods

### `speak()`

Converts text to speech using the configured speaking provider.

**input** (`string | NodeJS.ReadableStream`): Text to convert to speech

**options** (`object`): Provider-specific options passed to the speaking provider

Notes:

- If no speaking provider is configured, this method will throw an error
- Options are passed through to the configured speaking provider
- Returns a stream of audio data

### `listen()`

Converts speech to text using the configured listening provider.


**audioStream** (`NodeJS.ReadableStream`): Audio stream to convert to text

**options** (`object`): Provider-specific options passed to the listening provider

Notes:

- If no listening provider is configured, this method will throw an error
- Options are passed through to the configured listening provider
- Returns either a string or a stream of transcribed text, depending on the provider

### `getSpeakers()`

Returns a list of available voices from the speaking provider, where each entry contains:

**voiceId** (`string`): Unique identifier for the voice

**key** (`value`): Additional voice properties that vary by provider (e.g., name, language)

Notes:

- Returns voices from the speaking provider only
- If no speaking provider is configured, returns an empty array
- Each voice object will have at least a voiceId property
- Additional voice properties depend on the speaking provider

## Usage examples

### Using Mastra Voice Providers

```typescript
import { createReadStream } from 'node:fs'
import { CompositeVoice } from '@mastra/core/voice'
import { OpenAIVoice } from '@mastra/voice-openai'
import { PlayAIVoice } from '@mastra/voice-playai'

// Create voice providers
const openai = new OpenAIVoice()
const playai = new PlayAIVoice()

// Use OpenAI for listening (speech-to-text) and PlayAI for speaking (text-to-speech)
const voice = new CompositeVoice({
  input: openai,
  output: playai,
})

// Any NodeJS.ReadableStream of audio works here; the file path is a placeholder
const audioStream = createReadStream('./audio.m4a')

// Convert speech to text using OpenAI
const text = await voice.listen(audioStream)

// Convert text to speech using PlayAI
const audio = await voice.speak('Hello, world!')
```

### Using AI SDK Model Providers

You can pass AI SDK transcription and speech models directly to CompositeVoice:

```typescript
import { createReadStream } from 'node:fs'
import { CompositeVoice } from '@mastra/core/voice'
import { openai } from '@ai-sdk/openai'
import { elevenlabs } from '@ai-sdk/elevenlabs'

// Use AI SDK models directly - they will be auto-wrapped
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
})

// Any NodeJS.ReadableStream of audio works here; the file path is a placeholder
const audioStream = createReadStream('./audio.m4a')

// Works the same way as with Mastra providers
const text = await voice.listen(audioStream)
const audio = await voice.speak('Hello from AI SDK!')
```

### Mix and Match

You can combine Mastra providers with AI SDK models:

```typescript
import { CompositeVoice } from '@mastra/core/voice'
import { PlayAIVoice } from '@mastra/voice-playai'
import { groq } from '@ai-sdk/groq'

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'), // AI SDK for STT
  output: new PlayAIVoice(), // Mastra for TTS
})
```
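
### Sketch of the delegation behavior

The notes under `speak()`, `listen()`, and `getSpeakers()` describe what happens when a provider is missing. The self-contained sketch below illustrates those rules only; it is not Mastra's implementation. `SketchCompositeVoice`, `SketchProvider`, `makeFakeProvider`, and `demo` are hypothetical names, the error messages are invented, and audio is simplified to a `Uint8Array` instead of a `NodeJS.ReadableStream` so the example runs without any dependencies.

```typescript
// Hypothetical, simplified shapes; the real MastraVoice API is richer.
type Speaker = { voiceId: string; [key: string]: unknown }

interface SketchProvider {
  speak(text: string): Promise<Uint8Array>
  listen(audio: Uint8Array): Promise<string>
  getSpeakers(): Promise<Speaker[]>
}

class SketchCompositeVoice {
  private input?: SketchProvider
  private output?: SketchProvider

  constructor(config: { input?: SketchProvider; output?: SketchProvider }) {
    this.input = config.input
    this.output = config.output
  }

  // Text-to-speech is delegated to the output provider; throws if none is set
  async speak(text: string): Promise<Uint8Array> {
    if (!this.output) throw new Error('No speaking provider configured')
    return this.output.speak(text)
  }

  // Speech-to-text is delegated to the input provider; throws if none is set
  async listen(audio: Uint8Array): Promise<string> {
    if (!this.input) throw new Error('No listening provider configured')
    return this.input.listen(audio)
  }

  // Voices come from the output provider only; an empty array if none is set
  async getSpeakers(): Promise<Speaker[]> {
    if (!this.output) return []
    return this.output.getSpeakers()
  }
}

// Fake provider that stands in for a real speech service
function makeFakeProvider(name: string): SketchProvider {
  return {
    speak: async (text) => new Uint8Array(text.length),
    listen: async (audio) => `${name} heard ${audio.length} bytes`,
    getSpeakers: async () => [{ voiceId: `${name}-default` }],
  }
}

// Configure only a listening provider and observe each rule
async function demo() {
  const voice = new SketchCompositeVoice({ input: makeFakeProvider('stt') })
  const text = await voice.listen(new Uint8Array(3))
  const speakers = await voice.getSpeakers()
  let speakError = ''
  try {
    await voice.speak('hello')
  } catch (e) {
    speakError = e instanceof Error ? e.message : String(e)
  }
  return { text, speakers, speakError }
}
```

Calling `demo()` shows the asymmetry: listening works because an input provider is present, `getSpeakers()` quietly returns an empty list, and `speak()` rejects. When wiring up the real `CompositeVoice`, that suggests guarding `speak()` and `listen()` calls with `try`/`catch` if either provider is optional in your configuration.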