watson-speech

IBM Watson Speech to Text and Text to Speech SDK for web browsers.
<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>JSDoc: Class: WebAudioL16Stream</title> <script src="scripts/prettify/prettify.js"> </script> <script src="scripts/prettify/lang-css.js"> </script> <!--[if lt IE 9]> <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script> <![endif]--> <link type="text/css" rel="stylesheet" href="styles/prettify-tomorrow.css"> <link type="text/css" rel="stylesheet" href="styles/jsdoc-default.css"> </head> <body> <div id="main"> <h1 class="page-title">Class: WebAudioL16Stream</h1> <section> <header> <h2>WebAudioL16Stream</h2> </header> <article> <div class="container-overview"> <h4 class="name" id="WebAudioL16Stream"><span class="type-signature"></span>new WebAudioL16Stream<span class="signature">()</span><span class="type-signature"></span></h4> <div class="description"> <p>Transforms Buffers or AudioBuffers into a binary stream of l16 (raw, headerless WAV) audio, downsampling in the process.</p> <p>The Watson Speech to Text service works at 16 kHz and internally downsamples audio received at higher sample rates.
WebAudio is usually 48 kHz, so downsampling here reduces bandwidth usage by 2/3.</p> <p>The format event and the stream can be combined with https://www.npmjs.com/package/wav to generate a WAV file with a proper header.</p> <p>To do: support multi-channel audio (for use with &lt;audio&gt;/&lt;video&gt; elements); this will require interleaving the audio channels.</p> </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line19">line 19</a> </li></ul></dd> </dl> </div> <h3 class="subsection-title">Methods</h3> <h4 class="name" id="downsample"><span class="type-signature"></span>downsample<span class="signature">(buffer)</span><span class="type-signature"> &rarr; {Float32Array}</span></h4> <div class="description"> <p>Downsamples WebAudio to 16 kHz.</p> <p>Browsers can downsample WebAudio natively with an OfflineAudioContext, but it was designed for non-streaming use and requires a new context for each AudioBuffer. Firefox can handle this, but Chrome (v47) crashes after a few minutes.
So, we'll do it in JS for now.</p> <p>This really belongs in its own stream, but there's no way to create new AudioBuffer instances from JS, so it's fairly coupled to the WAV conversion code.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>buffer</code></td> <td class="type"> <span class="param-type">AudioBuffer</span> </td> <td class="description last"><p>Microphone/MediaElement audio chunk</p></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line63">line 63</a> </li></ul></dd> </dl> <h5>Returns:</h5> <div class="param-desc"> <p>'audio/l16' chunk</p> </div> <dl> <dt> Type </dt> <dd> <span class="param-type">Float32Array</span> </dd> </dl> <h4 class="name" id="floatTo16BitPCM"><span class="type-signature"></span>floatTo16BitPCM<span class="signature">(input)</span><span class="type-signature"> &rarr; {Buffer}</span></h4> <div class="description"> <p>Accepts a Float32Array of audio data and converts it to a Buffer of l16 audio data (raw, headerless WAV).</p> <p>Explanation of the math: the raw values captured from the Web Audio API are 32-bit floating point, between -1 and 1 (per the specification). The values for 16-bit PCM range between -32768 and +32767 (16-bit signed integer). Filter &amp; combine samples to reduce the sample rate, then multiply by 0x7FFF (32767) to convert.
Store the result in little-endian byte order.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>input</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line126">line 126</a> </li></ul></dd> </dl> <h5>Returns:</h5> <dl> <dt> Type </dt> <dd> <span class="param-type">Buffer</span> </dd> </dl> <h4 class="name" id="handleFirstAudioBuffer"><span class="type-signature"></span>handleFirstAudioBuffer<span class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> <p>Does some one-time setup to grab the sampleRate and emit the format event, then sets _transform to the actual audio buffer handler and calls it.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>audioBuffer</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line141">line 141</a> </li></ul></dd> </dl> <h4 class="name" id="transformAudioBuffer"><span class="type-signature"></span>transformAudioBuffer<span
class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> <p>Accepts an AudioBuffer (for objectMode), then downsamples to 16 kHz and converts it to 16-bit PCM.</p> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>audioBuffer</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line155">line 155</a> </li></ul></dd> </dl> <h4 class="name" id="transformBuffer"><span class="type-signature"></span>transformBuffer<span class="signature">(nodebufferok, encoding, next)</span><span class="type-signature"></span></h4> <div class="description"> <p>Accepts a Buffer (for binary mode), then downsamples to 16 kHz and converts it to 16-bit PCM.</p> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>nodebufferok</code></td> <td class="type"> <span class="param-type">Buffer</span> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>encoding</code></td> <td class="type"> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>next</code></td> <td class="type"> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a
href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line171">line 171</a> </li></ul></dd> </dl> </article> </section> </div> <nav> <h2><a href="index.html">Home</a></h2><h3>Modules</h3><ul><li><a href="module-watson-speech.html">watson-speech</a></li><li><a href="module-watson-speech_speech-to-text.html">watson-speech/speech-to-text</a></li><li><a href="module-watson-speech_speech-to-text_recognize-element.html">watson-speech/speech-to-text/recognize-element</a></li><li><a href="module-watson-speech_speech-to-text_recognize-file.html">watson-speech/speech-to-text/recognize-file</a></li><li><a href="module-watson-speech_speech-to-text_recognize-microphone.html">watson-speech/speech-to-text/recognize-microphone</a></li><li><a href="module-watson-speech_text-to-speech.html">watson-speech/text-to-speech</a></li><li><a href="module-watson-speech_text-to-speech_get-voices.html">watson-speech/text-to-speech/get-voices</a></li><li><a href="module-watson-speech_text-to-speech_synthesize.html">watson-speech/text-to-speech/synthesize</a></li></ul><h3>Classes</h3><ul><li><a href="FilePlayer.html">FilePlayer</a></li><li><a href="FormatStream.html">FormatStream</a></li><li><a href="MediaElementAudioStream.html">MediaElementAudioStream</a></li><li><a href="RecognizeStream.html">RecognizeStream</a></li><li><a href="TimingStream.html">TimingStream</a></li><li><a href="WebAudioL16Stream.html">WebAudioL16Stream</a></li><li><a href="WritableElementStream.html">WritableElementStream</a></li></ul><h3>Events</h3><ul><li><a href="RecognizeStream.html#event:close">close</a></li><li><a href="RecognizeStream.html#event:connection-close">connection-close</a></li><li><a href="RecognizeStream.html#event:data">data</a></li><li><a href="RecognizeStream.html#event:error">error</a></li><li><a href="RecognizeStream.html#event:results">results</a></li></ul> </nav> <br class="clear"> <footer> Documentation 
generated by <a href="https://github.com/jsdoc3/jsdoc">JSDoc 3.4.0</a> on Tue Feb 23 2016 22:23:53 GMT+0000 (UTC) </footer> <script> prettyPrint(); </script> <script src="scripts/linenumber.js"> </script> </body> </html>
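The two conversions this class documents, 16 kHz downsampling and float-to-16-bit-PCM encoding, can be sketched as standalone Node.js functions. This is a simplified, hypothetical illustration, not the library's actual implementation: the real `downsample` and `floatTo16BitPCM` are instance methods on the WebAudioL16Stream prototype that operate on streaming chunks, so details may differ.

```javascript
const TARGET_RATE = 16000; // Hz; the rate the Watson Speech to Text service works at

// Downsample single-channel Float32 audio from sourceRate (e.g. 48000) to
// 16 kHz by averaging the source samples that map onto each output sample.
// Assumes sourceRate >= TARGET_RATE.
function downsample(input, sourceRate) {
  if (sourceRate === TARGET_RATE) return input;
  const ratio = sourceRate / TARGET_RATE;
  const output = new Float32Array(Math.floor(input.length / ratio));
  for (let i = 0; i < output.length; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), input.length);
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    output[i] = sum / (end - start); // average of the covered source samples
  }
  return output;
}

// Convert Float32 samples in [-1, 1] to 16-bit signed PCM (audio/l16):
// clamp to the spec range, multiply by 0x7FFF (32767), round, and store
// each sample little-endian as 2 bytes.
function floatTo16BitPCM(input) {
  const output = Buffer.alloc(input.length * 2); // 2 bytes per sample
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i])); // clamp
    output.writeInt16LE(Math.round(s * 0x7fff), i * 2);
  }
  return output;
}
```

A 48 kHz microphone chunk would pass through `downsample()` and then `floatTo16BitPCM()`, yielding one third as many samples at 2 bytes each, which is where the 2/3 bandwidth saving mentioned above comes from.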