# speech-provider

A unified interface for browser speech synthesis and Eleven Labs voices.

## Installation

```bash
# Using npm
npm install speech-provider

# Using yarn
yarn add speech-provider

# Using bun
bun add speech-provider
```

## Documentation

Full API documentation is available at [https://osteele.github.io/speech-provider/](https://osteele.github.io/speech-provider/).

## Usage

```typescript
import { getVoiceProvider } from 'speech-provider';

// Choose one of the following configurations.

// Use browser voices only
const browserProvider = getVoiceProvider({});

// Use Eleven Labs voices if API key is available
const elevenLabsProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Use Eleven Labs with custom cache duration
const provider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // Cache for 1 day
});

// Get voices for a specific language
const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

// Get default voice for a language
const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

// Create and play an utterance
if (defaultVoice) {
  const utterance = defaultVoice.createUtterance('Hello, world!');
  utterance.onstart = () => console.log('Started speaking');
  utterance.onend = () => console.log('Finished speaking');
  utterance.start();
}
```

## Features

- Unified interface for both browser speech synthesis and Eleven Labs voices
- Automatic fallback to browser voices when Eleven Labs API key is not provided
- Type-safe API with TypeScript support
- Simple voice selection by language
- Event listeners for speech start and end events
- Efficient caching of Eleven Labs API responses using the browser's Cache API
- Configurable cache duration for Eleven Labs responses

## Used In

This package is used in [Mandarin Sentence Practice](https://mandarin-sentence-practice.osteele.com), a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

## Examples

The package includes an interactive example in the `examples` directory that demonstrates both browser and Eleven Labs voice providers. To run it:

1. View the [live demo](https://osteele.github.io/speech-provider/examples/demo.html), or
2. Open `examples/demo.html` directly in a browser, or
3. Run `bunx serve examples` and open http://localhost:3000/demo.html

The example includes:

- API key management for Eleven Labs
- Provider selection (Browser/Eleven Labs)
- Language selection with system language detection
- Voice selection with descriptions
- Example sentences in multiple languages
- Text-to-speech controls

## API

### `getVoiceProvider(options)`

Creates a voice provider based on the available API keys. Falls back to browser speech synthesis if no API keys are provided.

```typescript
function getVoiceProvider(options: {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
}): VoiceProvider;
```

### `createElevenLabsVoiceProvider(apiKey, baseUrl?, options?)`

Creates an Eleven Labs voice provider with optional configuration.

```typescript
function createElevenLabsVoiceProvider(
  apiKey: string,
  baseUrl?: string,
  options?: {
    validateResponses?: boolean;
    printVoiceProperties?: boolean;
    cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
  }
): VoiceProvider;
```

### Caching

The library implements efficient caching for Eleven Labs API responses using the browser's Cache API:

- Browser voices are cached automatically by the browser's speech synthesis engine
- Eleven Labs responses are cached using the browser's Cache API with a default duration of 1 hour
- Cache duration can be configured when creating the provider
- Cached responses are automatically invalidated after the specified duration
- Cache can be disabled by setting `cacheMaxAge: null` in the provider options
- The Cache API provides better performance than IndexedDB for network requests

Examples of cache configuration (each is an independent alternative):

```typescript
// Use default 1-hour cache
const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Cache for 1 day
const dailyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Cache for 1 week
const weeklyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 604800 // 7 days in seconds
});

// Disable caching (preferred approach)
const uncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: null
});

// Alternative way to disable caching
const uncachedProviderAlt = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 0
});
```

### `VoiceProvider` Interface

```typescript
interface VoiceProvider {
  name: string;
  getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
  getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
}
```

### `Voice` Interface

```typescript
interface Voice {
  name: string;
  id: string;
  lang: string;
  provider: VoiceProvider;
  description: string | null;
  createUtterance(text: string): Utterance;
}
```

### `Utterance` Interface

```typescript
interface Utterance {
  start(): void;
  stop(): void;
  set onstart(callback: () => void);
  set onend(callback: () => void);
}
```

## License

Copyright 2025 by Oliver Steele

MIT
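
## Appendix: Cache Expiry Sketch

The `cacheMaxAge` rules described under Caching (a lifetime in seconds, a default of 1 hour, and `null` or `0` to disable caching) can be illustrated with a small pure function. This is only a sketch of those documented semantics, not the library's actual implementation: `isCacheEntryFresh` and `DEFAULT_CACHE_MAX_AGE` are hypothetical names, and the package may decide freshness differently internally.

```typescript
// Hypothetical helper; NOT part of the speech-provider API.
// It mirrors the documented meaning of `cacheMaxAge`:
//   - a positive number is a lifetime in seconds
//   - null (or 0) disables caching
const DEFAULT_CACHE_MAX_AGE = 3600; // 1 hour, the documented default

function isCacheEntryFresh(
  storedAtMs: number, // when the response was cached (ms since epoch)
  nowMs: number, // current time (ms since epoch)
  cacheMaxAge: number | null = DEFAULT_CACHE_MAX_AGE,
): boolean {
  // Caching disabled: no entry is ever considered fresh.
  if (cacheMaxAge === null || cacheMaxAge <= 0) return false;

  const ageSeconds = (nowMs - storedAtMs) / 1000;
  // Entries "from the future" (clock skew) are treated as stale.
  return ageSeconds >= 0 && ageSeconds < cacheMaxAge;
}
```

For example, an entry stored 30 minutes ago is fresh under the 1-hour default, while the same entry is never fresh when `cacheMaxAge` is `null`.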