# nodejs-whisper
Node.js bindings for OpenAI's Whisper model.
[MIT License](https://choosealicense.com/licenses/mit/)
## Features
- Automatically converts input audio to 16 kHz WAV, the format the Whisper model requires
- Outputs transcripts as `.txt`, `.srt`, `.vtt`, `.json`, `.wts`, `.lrc`, or `.csv`
- Optimized for CPU (including Apple Silicon ARM)
- Word-level timestamp precision
- Split on word rather than on token (optional)
- Translate from the source language to English (optional)
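
To illustrate the 16 kHz WAV requirement, here is a minimal sketch, independent of nodejs-whisper, that reads the sample rate from a WAV file's header so you can tell whether a file is already in the expected format. `readWavSampleRate` and `isWhisperReady` are hypothetical helpers, not part of the package, and the sketch assumes a standard RIFF/WAVE chunk layout.

```typescript
// Hypothetical helper: reads the sample rate from the "fmt " chunk of a
// RIFF/WAVE file held in memory. Returns null if the bytes are not a WAV file.
function readWavSampleRate(bytes: Uint8Array): number | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)

  // Reads a 4-character ASCII chunk identifier at the given offset.
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3)
    )

  if (bytes.length < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') return null

  // Walk the chunk list: 4-byte id, 4-byte little-endian size, then the payload.
  let offset = 12
  while (offset + 8 <= bytes.length) {
    const size = view.getUint32(offset + 4, true)
    if (tag(offset) === 'fmt ') {
      // Payload: format (2 bytes), channels (2 bytes), sample rate (4 bytes), ...
      if (offset + 16 > bytes.length) return null
      return view.getUint32(offset + 12, true)
    }
    offset += 8 + size + (size % 2) // chunks are padded to an even length
  }
  return null
}

// Hypothetical helper: true when the file already has the 16000 Hz sample rate.
function isWhisperReady(bytes: Uint8Array): boolean {
  return readWavSampleRate(bytes) === 16000
}
```

In Node.js, the bytes could come from `fs.readFileSync(filePath)`, since a `Buffer` is a `Uint8Array`. Files that fail the check are still accepted by nodejs-whisper, which converts them as described above.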
## Installation
1. Install build tools, including `make` (the commands below are for Debian/Ubuntu)
```bash
sudo apt update
sudo apt install build-essential
```
2. Install nodejs-whisper with npm
```bash
npm i nodejs-whisper
```
3. Download whisper model
```bash
npx nodejs-whisper download
```
- Note: If the `make` command is not found, install the build tools from step 1.
### Windows Installation
1. Install MinGW-w64 or MSYS2 (which includes make tools)
- Option 1: Install MSYS2 from https://www.msys2.org/
- Option 2: Install MinGW-w64 from https://www.mingw-w64.org/
2. Install nodejs-whisper with npm
```bash
npm i nodejs-whisper
```
3. Download whisper model
```bash
npx nodejs-whisper download
```
- Note: Make sure mingw32-make or make is available in your system PATH.
## Usage/Examples
See `example/index.ts`, which can be run with `npm run test`.
```javascript
import path from 'path'
import { nodewhisper } from 'nodejs-whisper'

// Provide the exact path to your audio file.
const filePath = path.resolve(__dirname, 'YourAudioFileName')

await nodewhisper(filePath, {
  modelName: 'base.en', // name of a downloaded model
  autoDownloadModelName: 'base.en', // (optional) download this model automatically if it is not present
  removeWavFileAfterTranscription: false, // (optional) remove the WAV file once transcribed
  withCuda: false, // (optional) use CUDA for faster processing
  logger: console, // (optional) logging instance, defaults to console
  whisperOptions: {
    outputInCsv: false, // write the result to a csv file
    outputInJson: false, // write the result to a json file
    outputInJsonFull: false, // write the result to a json file with more information
    outputInLrc: false, // write the result to an lrc file
    outputInSrt: true, // write the result to an srt file
    outputInText: false, // write the result to a txt file
    outputInVtt: false, // write the result to a vtt file
    outputInWords: false, // write the result to a wts file for karaoke
    translateToEnglish: false, // translate from the source language to English
    wordTimestamps: false, // word-level timestamps
    timestamps_length: 20, // amount of dialogue per timestamp pair
    splitOnWord: true, // split on word rather than on token
  },
})

// Model list
const MODELS_LIST = [
  'tiny',
  'tiny.en',
  'base',
  'base.en',
  'small',
  'small.en',
  'medium',
  'medium.en',
  'large-v1',
  'large',
  'large-v3-turbo',
]
```
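
Because `modelName` is a plain string, a typo would only surface once transcription starts. The following is a minimal sketch of a guard that checks a name against the model list before calling `nodewhisper`. `isSupportedModel` and `ModelName` are hypothetical names introduced here for illustration and are not part of the package's documented API.

```typescript
// Model names copied from the model list in this README.
const MODELS_LIST = [
  'tiny',
  'tiny.en',
  'base',
  'base.en',
  'small',
  'small.en',
  'medium',
  'medium.en',
  'large-v1',
  'large',
  'large-v3-turbo',
] as const

type ModelName = (typeof MODELS_LIST)[number]

// Hypothetical helper (not exported by nodejs-whisper):
// narrows an arbitrary string to one of the known model names.
function isSupportedModel(name: string): name is ModelName {
  return (MODELS_LIST as readonly string[]).indexOf(name) !== -1
}

// Fail early with a readable message instead of passing a bad name along.
const requested: string = 'base.en'
if (!isSupportedModel(requested)) {
  throw new Error(`Unknown model "${requested}". Expected one of: ${MODELS_LIST.join(', ')}`)
}
```

After the guard passes, `requested` can be used as the `modelName` option.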
## Types
```typescript
interface IOptions {
  modelName: string
  removeWavFileAfterTranscription?: boolean
  withCuda?: boolean
  autoDownloadModelName?: string
  whisperOptions?: WhisperOptions
  logger?: Console
}

interface WhisperOptions {
  outputInCsv?: boolean
  outputInJson?: boolean
  outputInJsonFull?: boolean
  outputInLrc?: boolean
  outputInSrt?: boolean
  outputInText?: boolean
  outputInVtt?: boolean
  outputInWords?: boolean
  translateToEnglish?: boolean
  timestamps_length?: number
  wordTimestamps?: boolean
  splitOnWord?: boolean
}
```
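
Several `outputIn*` flags can be enabled at once. As a rough sketch, the helper below lists which file extensions a given set of flags asks for, using the flag-to-format pairs from the option comments in the usage example. `OutputFlags`, `EXTENSION_BY_FLAG`, and `requestedExtensions` are hypothetical names for illustration. The sketch does not predict where the files are written or what they are named.

```typescript
// Subset of WhisperOptions that selects output formats.
interface OutputFlags {
  outputInCsv?: boolean
  outputInJson?: boolean
  outputInJsonFull?: boolean
  outputInLrc?: boolean
  outputInSrt?: boolean
  outputInText?: boolean
  outputInVtt?: boolean
  outputInWords?: boolean
}

// Flag-to-extension pairs, taken from the option comments in the usage example.
const EXTENSION_BY_FLAG: Array<[keyof OutputFlags, string]> = [
  ['outputInCsv', '.csv'],
  ['outputInJson', '.json'],
  ['outputInJsonFull', '.json'],
  ['outputInLrc', '.lrc'],
  ['outputInSrt', '.srt'],
  ['outputInText', '.txt'],
  ['outputInVtt', '.vtt'],
  ['outputInWords', '.wts'],
]

// Hypothetical helper: returns the extensions requested by the enabled flags,
// in table order and without duplicates.
function requestedExtensions(flags: OutputFlags): string[] {
  const result: string[] = []
  for (const [flag, extension] of EXTENSION_BY_FLAG) {
    if (flags[flag] === true && result.indexOf(extension) === -1) {
      result.push(extension)
    }
  }
  return result
}

const extensions = requestedExtensions({
  outputInSrt: true,
  outputInJson: true,
  outputInJsonFull: true,
})
// extensions is ['.json', '.srt']
```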
## Run locally
Clone the project
```bash
git clone https://github.com/ChetanXpro/nodejs-whisper
```
Go to the project directory
```bash
cd nodejs-whisper
```
Install dependencies
```bash
npm install
```
Start the server
```bash
npm run dev
```
Build project
```bash
npm run build
```
## Made with
- [Whisper OpenAI (using C++ port by: ggerganov)](https://github.com/ggerganov/whisper.cpp)
## Feedback
If you have any feedback, please reach out to us at chetanbaliyan10.com
## Authors
- [@chetanXpro](https://www.github.com/chetanXpro)