# Sofya Transcription
**Sofya Transcription** is a JavaScript library for real-time audio transcription. It transcribes audio streams and is designed to integrate easily into web applications. The library can also capture audio from media elements.
## Features
- **Real-Time Transcription**: Transcribe audio streams in real time with high accuracy.
- **Flexible Integration**: Seamlessly integrates with your web applications.
- **Media Element Audio Capture**: Capture audio from media elements such as `<video>` and `<audio>`.
- **Multiple Provider Support**: Support for Sofya Compliance and Sofya as Service transcription providers.
- **Type-Safe Configuration**: TypeScript definitions for provider-specific configurations.
## Installation
To install **Sofya Transcription**, you can use npm:
`npm install sofya.transcription`
## Usage
Here's a basic example of how to use **Sofya Transcription** in your project:
1. **Import the Library**:
`import { MediaElementAudioCapture, SofyaTranscriber } from 'sofya.transcription';`
2. **Create a Transcription Service Instance**:
```typescript
// Using API key connection
const transcriber = new SofyaTranscriber({
  apiKey: 'YOUR_API_KEY',
  config: {
    language: 'en-US'
  }
});

// Or using a specific provider (given a different name here so that
// both declarations can coexist in the same scope)
const complianceTranscriber = new SofyaTranscriber({
  provider: 'sofya_compliance',
  endpoint: 'YOUR_ENDPOINT',
  config: {
    language: 'en-US',
    token: 'YOUR_TOKEN',
    compartmentId: 'YOUR_COMPARTMENT_ID',
    region: 'YOUR_REGION'
  }
});
```
3. **Initialize and Start Transcription**:
```typescript
// Wait for the transcriber to be ready
transcriber.on('ready', () => {
  // Get media stream
  navigator.mediaDevices.getUserMedia({ audio: true })
    .then(mediaStream => {
      // Start transcription
      transcriber.startTranscription(mediaStream);
    })
    .catch(error => {
      console.error('Error accessing microphone:', error);
    });
});
```
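The `catch` branch in step 3 logs every microphone failure the same way. As a rough sketch, the standard `DOMException` names that `getUserMedia` rejects with can be mapped to clearer messages. The helper `describeMicrophoneError` and its message strings are hypothetical and not part of Sofya Transcription.

```typescript
// Hypothetical helper (not part of the library): maps the `name` of a
// getUserMedia rejection to a user-facing message.
function describeMicrophoneError(error: { name?: string }): string {
  switch (error.name) {
    case 'NotAllowedError':
      // The user or the browser policy denied microphone access.
      return 'Microphone permission was denied.';
    case 'NotFoundError':
      // No audio input device satisfies the request.
      return 'No microphone was found on this device.';
    case 'NotReadableError':
      // The device exists but could not be read, often because it is busy.
      return 'The microphone is already in use or cannot be read.';
    default:
      return 'Could not access the microphone.';
  }
}
```

Inside the `.catch` handler, `console.error(describeMicrophoneError(error), error)` would keep the original error object available for debugging.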
4. **Handle Transcription Events**:
```typescript
transcriber.on('recognizing', (text) => {
  console.log('Recognizing: ' + text);
});

transcriber.on('recognized', (text) => {
  console.log('Recognized: ' + text);
});

transcriber.on('error', (error) => {
  console.error('Transcription error:', error);
});

transcriber.on('stopped', () => {
  console.log('Transcription stopped');
});
```
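Applications usually want to show finalized text together with whatever is still being recognized. The following is a minimal sketch of one way to do that. It assumes that `recognized` delivers one finalized piece of text per call and that `recognizing` delivers the in-progress text; this README does not confirm either behavior, and `TranscriptBuffer` is a hypothetical class, not part of the library.

```typescript
// Hypothetical helper (not part of the library).
// Assumption: each `recognized` event carries one finalized piece of text,
// and each `recognizing` event carries the current in-progress text.
class TranscriptBuffer {
  private finalized: string[] = [];
  private interim = '';

  // Call from the `recognizing` handler.
  onRecognizing(text: string): void {
    this.interim = text;
  }

  // Call from the `recognized` handler; blank results are ignored.
  onRecognized(text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) {
      this.finalized.push(trimmed);
    }
    this.interim = '';
  }

  // Finalized text followed by any interim text still being recognized.
  render(): string {
    const parts = [...this.finalized];
    const pending = this.interim.trim();
    if (pending.length > 0) {
      parts.push(pending);
    }
    return parts.join(' ');
  }
}
```

The buffer would be wired up with `transcriber.on('recognizing', (text) => buffer.onRecognizing(text))` and the equivalent call for `recognized`. If the provider turns out to send cumulative text instead, replacing the array with a single string is enough.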
5. **Control Transcription**:
```typescript
// Pause transcription
transcriber.pauseTranscription();
// Resume transcription
transcriber.resumeTranscription();
// Stop transcription
await transcriber.stopTranscription();
```
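This README does not say what happens when the control methods are called out of order, for example resuming a session that is not paused. A thin wrapper that tracks state locally can make such calls harmless. The sketch below is one possible approach; `TranscriptionSession`, `TranscriberControls`, and `SessionState` are hypothetical names, and the wrapper assumes transcription has already been started.

```typescript
// Minimal shape of the control methods listed in this README.
interface TranscriberControls {
  pauseTranscription(): void;
  resumeTranscription(): void;
  stopTranscription(): void | Promise<void>;
}

type SessionState = 'running' | 'paused' | 'stopped';

// Hypothetical wrapper (not part of the library): ignores repeated or
// out-of-order calls and reports whether the call was forwarded.
class TranscriptionSession {
  private state: SessionState = 'running';
  private readonly controls: TranscriberControls;

  constructor(controls: TranscriberControls) {
    this.controls = controls;
  }

  get current(): SessionState {
    return this.state;
  }

  pause(): boolean {
    if (this.state !== 'running') return false;
    this.controls.pauseTranscription();
    this.state = 'paused';
    return true;
  }

  resume(): boolean {
    if (this.state !== 'paused') return false;
    this.controls.resumeTranscription();
    this.state = 'running';
    return true;
  }

  async stop(): Promise<boolean> {
    if (this.state === 'stopped') return false;
    // Mark as stopped first so that later pause/resume calls are ignored.
    this.state = 'stopped';
    await this.controls.stopTranscription();
    return true;
  }
}
```

A `SofyaTranscriber` instance should satisfy `TranscriberControls` structurally if its methods match the signatures listed under API, so `new TranscriptionSession(transcriber)` would be the expected usage.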
## API
### `SofyaTranscriber`
- **constructor(connection: Connection)**: Creates a new instance of the transcription service with a connection object.
- **startTranscription(mediaStream: MediaStream): void**: Starts the transcription process with a given `MediaStream`.
- **stopTranscription(): void**: Stops the transcription process. The usage examples `await` this call, so it may complete asynchronously.
- **pauseTranscription(): void**: Pauses the transcription process.
- **resumeTranscription(): void**: Resumes the transcription process.
- **on(event: string, callback: Function): this**: Registers an event handler for transcription events. Possible events include:
  - `recognizing`: Fired with the interim text while transcription is in progress.
  - `recognized`: Fired with the recognized text when transcription is complete.
  - `error`: Fired with the error when an error occurs.
  - `ready`: Fired when the transcription service is ready to start.
  - `stopped`: Fired when the transcription process is stopped.
  - `connected`: Fired when the transcription service is connected to the provider.
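Because `on` accepts any string, a misspelled event name would go unnoticed. As an illustration, the event names listed above can be captured in a union type and registered from a single handler map. `TranscriberEvent`, `TranscriberLike`, `HandlerMap`, and `bindHandlers` are hypothetical names introduced for this sketch; the library may already export its own typings.

```typescript
// Event names as listed in this README.
type TranscriberEvent =
  | 'recognizing'
  | 'recognized'
  | 'error'
  | 'ready'
  | 'stopped'
  | 'connected';

// Minimal emitter shape, following the documented `on(event, callback)`.
interface TranscriberLike {
  on(event: string, callback: (...args: any[]) => void): unknown;
}

type HandlerMap = Partial<Record<TranscriberEvent, (...args: any[]) => void>>;

// Hypothetical helper (not part of the library): registers every handler in
// the map and returns the event names that were bound.
function bindHandlers(source: TranscriberLike, handlers: HandlerMap): TranscriberEvent[] {
  const bound: TranscriberEvent[] = [];
  for (const name of Object.keys(handlers) as TranscriberEvent[]) {
    const handler = handlers[name];
    if (handler) {
      source.on(name, handler);
      bound.push(name);
    }
  }
  return bound;
}
```

With this in place, `bindHandlers(transcriber, { recognised: ... })` fails to compile because `recognised` is not a member of `TranscriberEvent`.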
### Connection Types
The SDK supports different connection modes based on the provider:
#### API Key Connection
```typescript
{
  apiKey: string;
  config?: BaseConfig;
}
```
#### Sofya Compliance Provider Connection
```typescript
{
  provider: "sofya_compliance";
  endpoint: string;
  config: SofyaComplianceConfig;
}
```
#### Sofya As Service Provider Connection
```typescript
{
  provider: "sofya_as_service";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```
#### STT WVAD Provider Connection
```typescript
{
  provider: "stt_wvad";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```
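Taken together, the four shapes above form a discriminated union: provider connections carry a `provider` field and the API key connection does not. The sketch below redeclares the types locally, mirroring this README rather than importing the library's own definitions, and shows how the `in` operator narrows between them. `describeConnection` is a hypothetical helper that produces a log-safe summary without the API key or token.

```typescript
// Local mirrors of the types documented in this README (assumed shapes).
interface BaseConfig {
  language: string;
}
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
type SofyaSpeechConfig = BaseConfig;

type Connection =
  | { apiKey: string; config?: BaseConfig }
  | { provider: 'sofya_compliance'; endpoint: string; config: SofyaComplianceConfig }
  | { provider: 'sofya_as_service'; endpoint: string; config: SofyaSpeechConfig }
  | { provider: 'stt_wvad'; endpoint: string; config: SofyaSpeechConfig };

// Hypothetical helper (not part of the library): summarizes a connection
// for logging without exposing `apiKey` or `token`.
function describeConnection(connection: Connection): string {
  if ('provider' in connection) {
    // Narrowed to one of the provider connections.
    return `${connection.provider} @ ${connection.endpoint} (${connection.config.language})`;
  }
  // Narrowed to the API key connection, where `config` is optional.
  const language = connection.config?.language ?? 'language not set';
  return `api-key connection (${language})`;
}
```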
### Configuration Types
#### BaseConfig
```typescript
interface BaseConfig {
  language: string;
}
```
#### SofyaComplianceConfig
```typescript
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
```
#### SofyaSpeechConfig
```typescript
interface SofyaSpeechConfig extends BaseConfig {}
```
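TypeScript checks these interfaces only at compile time, so values loaded from environment variables or a form can still be missing at runtime. The sketch below is a simple pre-flight check under that assumption. `missingComplianceFields` is a hypothetical helper, and the interfaces are redeclared locally to mirror this README.

```typescript
// Local mirrors of the interfaces documented in this README.
interface BaseConfig {
  language: string;
}
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}

// Hypothetical helper (not part of the library): returns the names of the
// required fields that are missing or blank, in declaration order.
function missingComplianceFields(config: Partial<SofyaComplianceConfig>): string[] {
  const required: (keyof SofyaComplianceConfig)[] = [
    'language',
    'token',
    'compartmentId',
    'region',
  ];
  return required.filter((key) => {
    const value = config[key];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

Calling this before constructing the transcriber lets the application report every missing value at once instead of failing later on the first one.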
## React Example
```tsx
import React from 'react'
import { SofyaTranscriber } from 'sofya.transcription'

const App = () => {
  const transcriberRef = React.useRef<SofyaTranscriber | null>(null)
  const [transcription, setTranscription] = React.useState('')
  const transcriptionRef = React.useRef('')

  const getMediaStream = async () => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
    return stream
  }

  const startTranscription = async () => {
    try {
      const stream = await getMediaStream()

      // Create transcriber with API key connection
      const transcriber = new SofyaTranscriber({
        apiKey: 'your_api_key',
        config: {
          language: 'en-US'
        }
      })
      transcriberRef.current = transcriber

      transcriber.on('ready', () => {
        transcriber.startTranscription(stream)
      })

      transcriber.on('recognizing', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })

      transcriber.on('recognized', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })

      transcriber.on('error', (error: Error) => {
        console.error('Transcription error:', error)
      })
    } catch (error) {
      console.error('Error starting transcription:', error)
    }
  }

  const stopTranscription = async () => {
    if (transcriberRef.current) {
      await transcriberRef.current.stopTranscription()
    }
  }

  return (
    <div>
      <button onClick={startTranscription}>Start Transcription</button>
      <button onClick={stopTranscription}>Stop Transcription</button>
      <div>
        <h3>Transcription:</h3>
        <p>{transcription}</p>
      </div>
    </div>
  )
}

export default App
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.