<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>JSDoc: Class: WebAudioL16Stream</title> <script src="scripts/prettify/prettify.js"> </script> <script src="scripts/prettify/lang-css.js"> </script> <!--[if lt IE 9]> <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script> <![endif]--> <link type="text/css" rel="stylesheet" href="styles/prettify-tomorrow.css"> <link type="text/css" rel="stylesheet" href="styles/jsdoc-default.css"> </head> <body> <div id="main"> <h1 class="page-title">Class: WebAudioL16Stream</h1> <section> <header> <h2>WebAudioL16Stream</h2> </header> <article> <div class="container-overview"> <h4 class="name" id="WebAudioL16Stream"><span class="type-signature"></span>new WebAudioL16Stream<span class="signature">()</span><span class="type-signature"></span></h4> <div class="description"> Transforms Buffers or AudioBuffers into a binary stream of l16 (raw WAV) audio, downsampling in the process. The Watson Speech to Text service works at 16 kHz and internally downsamples audio received at higher sample rates. WebAudio usually runs at 48 kHz, so downsampling here reduces bandwidth usage by 2/3. The format event and the stream output can be combined with https://www.npmjs.com/package/wav to generate a WAV file with a proper header. Todo: support multi-channel audio (for use with &lt;audio&gt;/&lt;video&gt; elements); this will require interleaving audio channels. </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line19">line 19</a> </li></ul></dd> </dl> </div> <h3 class="subsection-title">Methods</h3> <h4 class="name" id="downsample"><span class="type-signature"></span>downsample<span class="signature">(buffer)</span><span class="type-signature"> &rarr; {Float32Array}</span></h4> <div class="description"> Downsamples WebAudio to 16 kHz. 
Browsers can downsample WebAudio natively with OfflineAudioContext, but it was designed for non-streaming use and requires a new context for each AudioBuffer. Firefox can handle this, but Chrome (v47) crashes after a few minutes, so we do it in JS for now. This really belongs in its own stream, but there's no way to create new AudioBuffer instances from JS, so it's fairly coupled to the WAV conversion code. </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>buffer</code></td> <td class="type"> <span class="param-type">AudioBuffer</span> </td> <td class="description last">Microphone/MediaElement audio chunk</td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line63">line 63</a> </li></ul></dd> </dl> <h5>Returns:</h5> <div class="param-desc"> 'audio/l16' chunk </div> <dl> <dt> Type </dt> <dd> <span class="param-type">Float32Array</span> </dd> </dl> <h4 class="name" id="floatTo16BitPCM"><span class="type-signature"></span>floatTo16BitPCM<span class="signature">(input)</span><span class="type-signature"> &rarr; {Buffer}</span></h4> <div class="description"> Accepts a Float32Array of audio data and converts it to a Buffer of l16 audio data (raw WAV). Explanation of the math: the raw values captured from the Web Audio API are 32-bit floating point, between -1 and 1 (per the specification). The values for 16-bit PCM range between -32768 and +32767 (16-bit signed integer). Filter and combine samples to reduce frequency, then multiply by 0x7FFF (32767) to convert. Store in little endian. 
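The scaling step described above can be sketched roughly as follows. This is a minimal illustration for Node.js, not the library's source: the function name floatTo16BitPCMSketch is hypothetical, and the clamping to the range -1 to 1 and the rounding are assumptions added so that out-of-range samples cannot overflow a 16-bit integer.

```javascript
// Hypothetical helper illustrating the float-to-l16 scaling step.
// Not the library's implementation; clamping and rounding are assumptions.
function floatTo16BitPCMSketch(input) {
  // Each 16-bit sample occupies 2 bytes in the output Buffer.
  const output = Buffer.alloc(input.length * 2);
  input.forEach((sample, i) => {
    // Clamp to [-1, 1] so the scaled value always fits in a signed int16.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Scale by 0x7FFF (32767) and store in little-endian byte order.
    output.writeInt16LE(Math.round(clamped * 0x7fff), i * 2);
  });
  return output;
}

// Usage: four float samples become eight bytes of l16 audio.
const pcm = floatTo16BitPCMSketch(Float32Array.of(0, 1, -1, 0.5));
console.log(pcm.length);         // 8
console.log(pcm.readInt16LE(2)); // 32767
```

Multiplying by 32767 rather than 32768 keeps a full-scale sample of 1 inside the signed 16-bit range, at the cost of never producing -32768.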
</div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>input</code></td> <td class="type"> <span class="param-type">Float32Array</span> </td> <td class="description last">Audio samples between -1 and 1</td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line126">line 126</a> </li></ul></dd> </dl> <h5>Returns:</h5> <dl> <dt> Type </dt> <dd> <span class="param-type">Buffer</span> </dd> </dl> <h4 class="name" id="handleFirstAudioBuffer"><span class="type-signature"></span>handleFirstAudioBuffer<span class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> Performs one-time setup to read the sampleRate and emit the format event, then sets _transform to the actual audio buffer handler and calls it. 
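The handler-swapping pattern described here might look roughly like the sketch below, written against Node's stream.Transform. The class name FirstChunkSetupStream, the method names, the sourceSampleRate property, and the fields of the format payload are all hypothetical and chosen for illustration; only _transform, sampleRate, and the format event come from the description above. The conversion itself is replaced by a placeholder.

```javascript
const { Transform } = require('stream');

// Hypothetical class for illustration only; it is not the library's API.
class FirstChunkSetupStream extends Transform {
  constructor() {
    // Accept objects on the writable side, emit binary on the readable side.
    super({ writableObjectMode: true });
    // Route the first chunk through the one-time setup handler.
    this._transform = this.handleFirstChunk;
  }

  handleFirstChunk(chunk, encoding, next) {
    // One-time setup: remember the source rate and announce the output format.
    // The payload fields below are assumptions, not a documented contract.
    this.sourceSampleRate = chunk.sampleRate;
    this.emit('format', { channels: 1, bitDepth: 16, sampleRate: 16000 });
    // Swap in the steady-state handler, then let it process this same chunk.
    this._transform = this.transformChunk;
    this._transform(chunk, encoding, next);
  }

  transformChunk(chunk, encoding, next) {
    // Placeholder: a real handler would downsample and convert to 16-bit PCM.
    this.push(Buffer.alloc(chunk.length * 2));
    next();
  }
}
```

Swapping the handler after the first chunk means later chunks skip the setup check entirely, so the steady-state path carries no per-chunk branch.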
</div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>audioBuffer</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line141">line 141</a> </li></ul></dd> </dl> <h4 class="name" id="transformAudioBuffer"><span class="type-signature"></span>transformAudioBuffer<span class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> Accepts an AudioBuffer (for objectMode), then downsamples to 16 kHz and converts to 16-bit PCM. </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>audioBuffer</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line155">line 155</a> </li></ul></dd> </dl> <h4 class="name" id="transformBuffer"><span 
class="type-signature"></span>transformBuffer<span class="signature">(nodebufferok, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> Accepts a Buffer (for binary mode), then downsamples to 16 kHz and converts to 16-bit PCM. </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>nodebufferok</code></td> <td class="type"> <span class="param-type">Buffer</span> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line171">line 171</a> </li></ul></dd> </dl> </article> </section> </div> <nav> <h2><a href="index.html">Home</a></h2><h3>Classes</h3><ul><li><a href="FormatStream.html">FormatStream</a></li><li><a href="MediaElementAudioStream.html">MediaElementAudioStream</a></li><li><a href="RecognizeStream.html">RecognizeStream</a></li><li><a href="TimingStream.html">TimingStream</a></li><li><a href="WebAudioL16Stream.html">WebAudioL16Stream</a></li></ul><h3>Events</h3><ul><li><a href="RecognizeStream.html#event:connection-close">connection-close</a></li><li><a href="RecognizeStream.html#event:data">data</a></li><li><a href="RecognizeStream.html#event:error">error</a></li><li><a href="RecognizeStream.html#event:results">results</a></li></ul><h3>Namespaces</h3><ul><li><a href="WatsonSpeech.html">WatsonSpeech</a></li></ul><h3>Global</h3><ul><li><a href="global.html#SpeechToText">SpeechToText</a></li><li><a 
href="global.html#version">version</a></li></ul> </nav> <br class="clear"> <footer> Documentation generated by <a href="https://github.com/jsdoc3/jsdoc">JSDoc 3.4.0</a> on Mon Feb 15 2016 23:15:22 GMT+0000 (UTC) </footer> <script> prettyPrint(); </script> <script src="scripts/linenumber.js"> </script> </body> </html>