<html lang="en">
<head>
<meta charset="utf-8">
<title>JSDoc: Class: WebAudioL16Stream</title>
<script src="scripts/prettify/prettify.js"> </script>
<script src="scripts/prettify/lang-css.js"> </script>
<!--[if lt IE 9]>
<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link type="text/css" rel="stylesheet" href="styles/prettify-tomorrow.css">
<link type="text/css" rel="stylesheet" href="styles/jsdoc-default.css">
</head>
<body>
<div id="main">
<h1 class="page-title">Class: WebAudioL16Stream</h1>
<section>
<header>
<h2>WebAudioL16Stream</h2>
</header>
<article>
<div class="container-overview">
<h4 class="name" id="WebAudioL16Stream"><span class="type-signature"></span>new WebAudioL16Stream<span class="signature">(options)</span><span class="type-signature"></span></h4>
<div class="description">
<p>Transforms Buffers or AudioBuffers into a binary stream of l16 (raw wav) audio, downsampling in the process.</p>
<p>The Watson Speech to Text service works on 16 kHz audio and internally downsamples audio received at higher sample rates.
WebAudio usually runs at 44.1 kHz or 48 kHz, so downsampling here reduces bandwidth usage by roughly two thirds.</p>
<p>The format event and the stream can be combined with https://www.npmjs.com/package/wav to generate a wav file with a proper header.</p>
<p>Todo: support multi-channel audio (for use with &lt;audio&gt;/&lt;video&gt; elements). This will require interleaving audio channels.</p>
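<p>The bandwidth figure above follows from the ratio of the sample rates. A minimal sketch of the arithmetic, where <code>WATSON_RATE_HZ</code> and <code>bandwidthSaved</code> are hypothetical names used only for illustration and are not part of this SDK:</p>

```javascript
// Hypothetical helper, not part of watson-speech: fraction of samples that
// no longer need to be sent once audio is downsampled to 16 kHz.
const WATSON_RATE_HZ = 16000;

function bandwidthSaved(sourceSampleRate) {
  return 1 - WATSON_RATE_HZ / sourceSampleRate;
}

console.log(bandwidthSaved(48000)); // roughly 0.667
console.log(bandwidthSaved(44100)); // roughly 0.637
```

<p>This only accounts for the sample rate. It says nothing about the change in bytes per sample.</p>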
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>options</code></td>
<td class="type">
<span class="param-type">Object</span>
</td>
<td class="description last"></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line20">line 20</a>
</li></ul></dd>
</dl>
</div>
<h3 class="subsection-title">Methods</h3>
<h4 class="name" id="downsample"><span class="type-signature"></span>downsample<span class="signature">(bufferNewSamples)</span><span class="type-signature"> → {Float32Array}</span></h4>
<div class="description">
<p>Downsamples WebAudio to 16 kHz.</p>
<p>Browsers can downsample WebAudio natively with an OfflineAudioContext, but it was designed for non-streaming use and
requires a new context for each AudioBuffer. Firefox can handle this, but Chrome (v47) crashes after a few minutes.
So, we'll do it in JS for now.</p>
<p>This really belongs in its own stream, but there's no way to create new AudioBuffer instances from JS, so it's
fairly coupled to the wav conversion code.</p>
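<p>To illustrate the general idea of downsampling in JS, here is a deliberately simplified sketch that averages each group of input samples into one output sample. It is not the filter this class uses: <code>downsampleByAveraging</code> and <code>TARGET_RATE_HZ</code> are hypothetical names, and the sketch assumes a mono Float32Array of samples plus a known source sample rate.</p>

```javascript
// Simplified illustration only: box-filter (averaging) downsampler.
// Assumption: `samples` is a mono Float32Array captured at `sourceSampleRate` Hz.
const TARGET_RATE_HZ = 16000;

function downsampleByAveraging(samples, sourceSampleRate) {
  if (sourceSampleRate <= TARGET_RATE_HZ) {
    // Nothing to reduce; return a copy so callers can treat the result uniformly.
    return Float32Array.from(samples);
  }
  const ratio = sourceSampleRate / TARGET_RATE_HZ;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    // Average every input sample that falls inside this output sample's window.
    const start = Math.floor(i * ratio);
    const end = Math.min(samples.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += samples[j];
    }
    out[i] = sum / (end - start);
  }
  return out;
}
```

<p>The sketch treats every chunk independently, so any input samples left over at the end of a chunk are dropped. A streaming implementation would need to carry them into the next chunk.</p>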
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>bufferNewSamples</code></td>
<td class="type">
<span class="param-type">AudioBuffer</span>
</td>
<td class="description last"><p>Microphone/MediaElement audio chunk</p></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line62">line 62</a>
</li></ul></dd>
</dl>
<h5>Returns:</h5>
<div class="param-desc">
<p>Downsampled 16 kHz samples, not yet converted to 'audio/l16'</p>
</div>
<dl>
<dt>
Type
</dt>
<dd>
<span class="param-type">Float32Array</span>
</dd>
</dl>
<h4 class="name" id="floatTo16BitPCM"><span class="type-signature"></span>floatTo16BitPCM<span class="signature">(input)</span><span class="type-signature"> → {Buffer}</span></h4>
<div class="description">
<p>Accepts a Float32Array of audio data and converts it to a Buffer of l16 audio data (raw wav).</p>
<p>Explanation for the math: The raw values captured from the Web Audio API are
32-bit floating point, between -1 and 1 (per the specification).
The values for 16-bit PCM range between -32768 and +32767 (16-bit signed integer).
Filter &amp; combine samples to reduce frequency, then multiply by 0x7FFF (32767) to convert.
Store in little endian.</p>
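<p>The conversion described above can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual source: <code>floatTo16BitPCMSketch</code> is a hypothetical name, it returns a plain Uint8Array instead of a Node Buffer, and the clamping and rounding steps are choices made for this sketch.</p>

```javascript
// Illustrative sketch: convert float samples in [-1, 1] to 16-bit signed
// little-endian PCM. Returns a Uint8Array (assumption: a Node Buffer is not
// required for the illustration).
function floatTo16BitPCMSketch(input) {
  const bytes = new Uint8Array(input.length * 2); // 2 bytes per 16-bit sample
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range input cannot wrap around after scaling.
    const clamped = Math.max(-1, Math.min(1, input[i]));
    // Scale by 0x7FFF (32767); the final `true` selects little-endian order.
    view.setInt16(i * 2, Math.round(clamped * 0x7fff), true);
  }
  return bytes;
}

// 1 maps to 0x7FFF, stored as the bytes FF 7F in little-endian order.
console.log(floatTo16BitPCMSketch(new Float32Array([1])));
```

<p>Scaling by 32767 rather than 32768 keeps +1 inside the signed 16-bit range, at the cost of never producing -32768.</p>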
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>input</code></td>
<td class="type">
<span class="param-type">Float32Array</span>
</td>
<td class="description last"></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line150">line 150</a>
</li></ul></dd>
</dl>
<h5>Returns:</h5>
<dl>
<dt>
Type
</dt>
<dd>
<span class="param-type">Buffer</span>
</dd>
</dl>
<h4 class="name" id="handleFirstAudioBuffer"><span class="type-signature"></span>handleFirstAudioBuffer<span class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4>
<div class="description">
<p>Does some one-time setup to grab sampleRate and emit format, then sets _transform to the actual audio buffer handler and calls it.</p>
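<p>The one-time-setup pattern described above can be sketched without a real stream. Everything in this example is hypothetical (<code>FirstChunkSwitcher</code>, <code>onFormat</code>, the shape of the format object, and the plain objects standing in for AudioBuffers); it only shows how the first call can do setup and then replace <code>_transform</code> with the steady-state handler.</p>

```javascript
// Hypothetical stand-in for a transform stream: only the handler swap is shown.
class FirstChunkSwitcher {
  constructor(onFormat) {
    this.onFormat = onFormat; // assumed callback, standing in for a format event
    this.sourceSampleRate = null;
    this._transform = this.handleFirst; // the first chunk goes through setup
  }

  handleFirst(audioBuffer, encoding, next) {
    // One-time setup: remember the source rate and announce the output format.
    this.sourceSampleRate = audioBuffer.sampleRate;
    this.onFormat({ sampleRate: 16000, bitDepth: 16, channels: 1 }); // illustrative fields
    // Swap in the real handler, then let it process this same chunk.
    this._transform = this.handleRest;
    this._transform(audioBuffer, encoding, next);
  }

  handleRest(audioBuffer, encoding, next) {
    // Placeholder for the actual conversion; here it just reports the chunk length.
    next(null, audioBuffer.length);
  }
}
```

<p>After the first call, later chunks go straight to the steady-state handler, so the setup check costs nothing per chunk.</p>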
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>audioBuffer</code></td>
<td class="type">
<span class="param-type">AudioBuffer</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>encoding</code></td>
<td class="type">
<span class="param-type">String</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>next</code></td>
<td class="type">
<span class="param-type">function</span>
</td>
<td class="description last"></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line165">line 165</a>
</li></ul></dd>
</dl>
<h4 class="name" id="transformAudioBuffer"><span class="type-signature"></span>transformAudioBuffer<span class="signature">(audioBuffer, encoding, next)</span><span class="type-signature"></span></h4>
<div class="description">
<p>Accepts an AudioBuffer (for objectMode), then downsamples it to 16000 Hz and converts it to 16-bit PCM.</p>
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>audioBuffer</code></td>
<td class="type">
<span class="param-type">AudioBuffer</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>encoding</code></td>
<td class="type">
<span class="param-type">String</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>next</code></td>
<td class="type">
<span class="param-type">function</span>
</td>
<td class="description last"></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line179">line 179</a>
</li></ul></dd>
</dl>
<h4 class="name" id="transformBuffer"><span class="type-signature"></span>transformBuffer<span class="signature">(nodebuffer, encoding, next)</span><span class="type-signature"></span></h4>
<div class="description">
<p>Accepts a Buffer (for binary mode), then downsamples it to 16000 Hz and converts it to 16-bit PCM.</p>
</div>
<h5>Parameters:</h5>
<table class="params">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th class="last">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="name"><code>nodebuffer</code></td>
<td class="type">
<span class="param-type">Buffer</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>encoding</code></td>
<td class="type">
<span class="param-type">String</span>
</td>
<td class="description last"></td>
</tr>
<tr>
<td class="name"><code>next</code></td>
<td class="type">
<span class="param-type">function</span>
</td>
<td class="description last"></td>
</tr>
</tbody>
</table>
<dl class="details">
<dt class="tag-source">Source:</dt>
<dd class="tag-source"><ul class="dummy"><li>
<a href="speech-to-text_webaudio-l16-stream.js.html">speech-to-text/webaudio-l16-stream.js</a>, <a href="speech-to-text_webaudio-l16-stream.js.html#line195">line 195</a>
</li></ul></dd>
</dl>
</article>
</section>
</div>
<nav>
<h2><a href="index.html">Home</a></h2><h3>Modules</h3><ul><li><a href="module-watson-speech.html">watson-speech</a></li><li><a href="module-watson-speech_speech-to-text.html">watson-speech/speech-to-text</a></li><li><a href="module-watson-speech_speech-to-text_get-models.html">watson-speech/speech-to-text/get-models</a></li><li><a href="module-watson-speech_speech-to-text_recognize-file.html">watson-speech/speech-to-text/recognize-file</a></li><li><a href="module-watson-speech_speech-to-text_recognize-microphone.html">watson-speech/speech-to-text/recognize-microphone</a></li><li><a href="module-watson-speech_text-to-speech.html">watson-speech/text-to-speech</a></li><li><a href="module-watson-speech_text-to-speech_get-voices.html">watson-speech/text-to-speech/get-voices</a></li><li><a href="module-watson-speech_text-to-speech_synthesize.html">watson-speech/text-to-speech/synthesize</a></li></ul><h3>Classes</h3><ul><li><a href="FilePlayer.html">FilePlayer</a></li><li><a href="FormatStream.html">FormatStream</a></li><li><a href="RecognizeStream.html">RecognizeStream</a></li><li><a href="ResultStream.html">ResultStream</a></li><li><a href="SpeakerStream.html">SpeakerStream</a></li><li><a href="TimingStream.html">TimingStream</a></li><li><a href="UrlPlayer.html">UrlPlayer</a></li><li><a href="WebAudioL16Stream.html">WebAudioL16Stream</a></li><li><a href="WritableElementStream.html">WritableElementStream</a></li></ul><h3>Events</h3><ul><li><a href="RecognizeStream.html#event:close">close</a></li><li><a href="RecognizeStream.html#event:data">data</a></li><li><a href="RecognizeStream.html#event:error">error</a></li><li><a href="RecognizeStream.html#event:listening">listening</a></li><li><a href="RecognizeStream.html#event:message">message</a></li><li><a href="RecognizeStream.html#event:open">open</a></li><li><a href="RecognizeStream.html#event:send-data">send-data</a></li><li><a href="RecognizeStream.html#event:send-json">send-json</a></li><li><a 
href="RecognizeStream.html#event:stop">stop</a></li><li><a href="SpeakerStream.html#event:data">data</a></li></ul><h3>Global</h3><ul><li><a href="global.html#getContentTypeFromFile">getContentTypeFromFile</a></li><li><a href="global.html#playFile">playFile</a></li></ul>
</nav>
<br class="clear">
<footer>
Documentation generated by <a href="https://github.com/jsdoc3/jsdoc">JSDoc 3.4.3</a> on Tue Feb 21 2017 17:41:51 GMT+0000 (UTC)
</footer>
<script> prettyPrint(); </script>
<script src="scripts/linenumber.js"> </script>
</body>
</html>