# Sofya Transcription

**Sofya Transcription** is a JavaScript library that provides a robust and flexible solution for real-time audio transcription. It is designed to transcribe audio streams and can be easily integrated into web applications. The library also includes functionality for capturing audio from media elements.

## Features

- **Real-Time Transcription**: Transcribe audio streams in real time with high accuracy.
- **Flexible Integration**: Integrates with your web applications.
- **Media Element Audio Capture**: Capture audio from media elements such as `<video>` and `<audio>`.
- **Multiple Provider Support**: Support for the Sofya Compliance and Sofya as Service transcription providers.
- **Type-Safe Configuration**: TypeScript definitions for provider-specific configurations.

## Installation

To install **Sofya Transcription**, use npm:

`npm install sofya.transcription`

## Usage

Here's a basic example of how to use **Sofya Transcription** in your project:

1. **Import the Library**:

   `import { MediaElementAudioCapture, SofyaTranscriber } from 'sofya.transcription';`

2. **Create a Transcription Service Instance**:

   ```typescript
   // Option A: API key connection
   const transcriber = new SofyaTranscriber({
     apiKey: 'YOUR_API_KEY',
     config: {
       language: 'en-US'
     }
   });

   // Option B: a specific provider
   // (named differently here so both options can appear in one file)
   const complianceTranscriber = new SofyaTranscriber({
     provider: 'sofya_compliance',
     endpoint: 'YOUR_ENDPOINT',
     config: {
       language: 'en-US',
       token: 'YOUR_TOKEN',
       compartmentId: 'YOUR_COMPARTMENT_ID',
       region: 'YOUR_REGION'
     }
   });
   ```

3. **Initialize and Start Transcription**:

   ```typescript
   // Wait for the transcriber to be ready
   transcriber.on('ready', () => {
     // Get media stream
     navigator.mediaDevices.getUserMedia({ audio: true })
       .then(mediaStream => {
         // Start transcription
         transcriber.startTranscription(mediaStream);
       })
       .catch(error => {
         console.error('Error accessing microphone:', error);
       });
   });
   ```

4. **Handle Transcription Events**:

   ```typescript
   transcriber.on('recognizing', (text) => {
     console.log('Recognizing: ' + text);
   });

   transcriber.on('recognized', (text) => {
     console.log('Recognized: ' + text);
   });

   transcriber.on('error', (error) => {
     console.error('Transcription error:', error);
   });

   transcriber.on('stopped', () => {
     console.log('Transcription stopped');
   });
   ```

5. **Control Transcription**:

   ```typescript
   // Pause transcription
   transcriber.pauseTranscription();

   // Resume transcription
   transcriber.resumeTranscription();

   // Stop transcription
   await transcriber.stopTranscription();
   ```

## API

### `SofyaTranscriber`

- **constructor(connection: Connection)**: Creates a new instance of the transcription service with a connection object.
- **startTranscription(mediaStream: MediaStream): void**: Starts the transcription process with a given `MediaStream`.
- **stopTranscription(): void**: Stops the transcription process.
- **pauseTranscription(): void**: Pauses the transcription process.
- **resumeTranscription(): void**: Resumes the transcription process.
- **on(event: string, callback: Function): this**: Registers an event handler for transcription events. Possible events include:
  - `recognizing`: Fired when transcription is in progress.
  - `recognized`: Fired when transcription is complete.
  - `error`: Fired when an error occurs.
  - `ready`: Fired when the transcription service is ready to start.
  - `stopped`: Fired when the transcription process is stopped.
  - `connected`: Fired when the transcription service is connected to the provider.

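
The `recognizing` and `recognized` events listed above can be combined into a running transcript. The following is a minimal sketch, not part of the library: `TranscriptBuffer` is a hypothetical helper, and it assumes (the README does not say so explicitly) that `recognizing` delivers interim text for the segment currently being spoken while `recognized` delivers the final text for that segment.

```typescript
// Hypothetical helper, NOT part of sofya.transcription.
// Keeps finalized segments plus the most recent interim text.
class TranscriptBuffer {
  private finalized: string[] = [];
  private interim = '';

  // Call from the 'recognizing' handler: replaces the current interim text.
  setInterim(text: string): void {
    this.interim = text.trim();
  }

  // Call from the 'recognized' handler: commits a non-empty segment
  // and clears the interim text.
  commit(text: string): void {
    const segment = text.trim();
    if (segment.length > 0) {
      this.finalized.push(segment);
    }
    this.interim = '';
  }

  // Text to display: finalized segments followed by any interim text.
  render(): string {
    const parts = this.interim
      ? [...this.finalized, this.interim]
      : this.finalized;
    return parts.join(' ');
  }
}

// Wiring sketch, assuming `transcriber` is a SofyaTranscriber instance:
//   const buffer = new TranscriptBuffer();
//   transcriber.on('recognizing', (text: string) => buffer.setInterim(text));
//   transcriber.on('recognized', (text: string) => buffer.commit(text));
```

Keeping interim and finalized text separate means a later `recognizing` event never overwrites text that has already been committed, which may or may not match how you want to display results.
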
### Connection Types

The SDK supports different connection modes based on the provider:

#### API Key Connection

```typescript
{
  apiKey: string;
  config?: BaseConfig;
}
```

#### Sofya Compliance Provider Connection

```typescript
{
  provider: "sofya_compliance";
  endpoint: string;
  config: SofyaComplianceConfig;
}
```

#### Sofya As Service Provider Connection

```typescript
{
  provider: "sofya_as_service";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```

#### STT WVAD Provider Connection

```typescript
{
  provider: "stt_wvad";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```

### Configuration Types

#### BaseConfig

```typescript
interface BaseConfig {
  language: string;
}
```

#### SofyaComplianceConfig

```typescript
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
```

#### SofyaSpeechConfig

```typescript
interface SofyaSpeechConfig extends BaseConfig {}
```

## React Example

```tsx
import React from 'react'
import { SofyaTranscriber } from 'sofya.transcription'

const App = () => {
  const transcriberRef = React.useRef<SofyaTranscriber | null>(null)
  const [transcription, setTranscription] = React.useState('')
  const transcriptionRef = React.useRef('')

  const getMediaStream = async () => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
    return stream
  }

  const startTranscription = async () => {
    try {
      const stream = await getMediaStream()

      // Create transcriber with API key connection
      const transcriber = new SofyaTranscriber({
        apiKey: 'your_api_key',
        config: {
          language: 'en-US'
        }
      })

      transcriberRef.current = transcriber

      // Start only once the service reports it is ready
      transcriber.on("ready", () => {
        transcriber.startTranscription(stream)
      })

      transcriber.on('recognizing', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })

      transcriber.on('recognized', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })

      transcriber.on('error', (error: Error) => {
        console.error('Transcription error:', error)
      })
    } catch (error) {
      console.error('Error starting transcription:', error)
    }
  }

  const stopTranscription = async () => {
    if (transcriberRef.current) {
      await transcriberRef.current.stopTranscription()
    }
  }

  return (
    <div>
      <button onClick={startTranscription}>Start Transcription</button>
      <button onClick={stopTranscription}>Stop Transcription</button>
      <div>
        <h3>Transcription:</h3>
        <p>{transcription}</p>
      </div>
    </div>
  )
}

export default App
```

## License

This project is licensed under the MIT License; see the LICENSE file for details.