<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>JSDoc: Class: RecognizeStream</title> <script src="scripts/prettify/prettify.js"> </script> <script src="scripts/prettify/lang-css.js"> </script> <!--[if lt IE 9]> <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script> <![endif]--> <link type="text/css" rel="stylesheet" href="styles/prettify-tomorrow.css"> <link type="text/css" rel="stylesheet" href="styles/jsdoc-default.css"> </head> <body> <div id="main"> <h1 class="page-title">Class: RecognizeStream</h1> <section> <header> <h2>RecognizeStream</h2> </header> <article> <div class="container-overview"> <h4 class="name" id="RecognizeStream"><span class="type-signature"></span>new RecognizeStream<span class="signature">(options)</span><span class="type-signature"></span></h4> <div class="description"> <p>A pipe()-able Node.js Duplex stream that accepts binary audio and emits text or objects in its <code>data</code> events.</p> <p>Uses WebSockets under the hood. For audio with no recognizable speech, no <code>data</code> events are emitted.</p> <p>By default, only finalized text is emitted in the <code>data</code> events. However, when <code>objectMode</code>/<code>readableObjectMode</code> and <code>interim_results</code> are enabled, both interim and final result objects are emitted. WritableElementStream uses this, for example, to live-update the DOM with word-by-word transcriptions.</p> <p>Note that the WebSocket connection is not established until the first chunk of data is received. 
This allows for auto-detection of content type (for wav/flac/opus audio).</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>options</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="description last"> <h6>Properties</h6> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th>Attributes</th> <th>Default</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>model</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> 'en-US_BroadbandModel' </td> <td class="description last"><p>voice model to use. Microphone streaming only supports broadband models.</p></td> </tr> <tr> <td class="name"><code>url</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> 'wss://stream.watsonplatform.net/speech-to-text/api' </td> <td class="description last"><p>base URL for service</p></td> </tr> <tr> <td class="name"><code>token</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>Auth token</p></td> </tr> <tr> <td class="name"><code>headers</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>Only works in Node.js, not in browsers. 
Allows for custom headers to be set, including an Authorization header (preventing the need for auth tokens)</p></td> </tr> <tr> <td class="name"><code>content-type</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> 'audio/wav' </td> <td class="description last"><p>content type of audio; can be automatically determined from file header in most cases. only wav, flac, and ogg/opus are supported</p></td> </tr> <tr> <td class="name"><code>interim_results</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> true </td> <td class="description last"><p>Send back non-final previews of each &quot;sentence&quot; as it is being processed. These results are ignored in text mode.</p></td> </tr> <tr> <td class="name"><code>continuous</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> true </td> <td class="description last"><p>set to false to automatically stop the transcription after the first &quot;sentence&quot;</p></td> </tr> <tr> <td class="name"><code>word_confidence</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>include confidence scores with results. Defaults to true when in objectMode.</p></td> </tr> <tr> <td class="name"><code>timestamps</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>include timestamps with results. 
Defaults to true when in objectMode.</p></td> </tr> <tr> <td class="name"><code>max_alternatives</code></td> <td class="type"> <span class="param-type">Number</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> 1 </td> <td class="description last"><p>maximum number of alternative transcriptions to include. Defaults to 3 when in objectMode.</p></td> </tr> <tr> <td class="name"><code>keywords</code></td> <td class="type"> <span class="param-type">Array.&lt;String></span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>a list of keywords to search for in the audio</p></td> </tr> <tr> <td class="name"><code>keywords_threshold</code></td> <td class="type"> <span class="param-type">Number</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>Number between 0 and 1 representing the minimum confidence before including a keyword in the results. Required when options.keywords is set</p></td> </tr> <tr> <td class="name"><code>word_alternatives_threshold</code></td> <td class="type"> <span class="param-type">Number</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>Number between 0 and 1 representing the minimum confidence before including an alternative word in the results. 
Must be set to enable word alternatives.</p></td> </tr> <tr> <td class="name"><code>profanity_filter</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>set to true to filter out profanity and replace the words with asterisks</p></td> </tr> <tr> <td class="name"><code>inactivity_timeout</code></td> <td class="type"> <span class="param-type">Number</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> 30 </td> <td class="description last"><p>how many seconds of silence before automatically closing the stream (even if continuous is true). Use -1 for infinity.</p></td> </tr> <tr> <td class="name"><code>readableObjectMode</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>emit <code>result</code> objects instead of string Buffers for the <code>data</code> events. 
Does not affect input (which must be binary)</p></td> </tr> <tr> <td class="name"><code>objectMode</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>alias for options.readableObjectMode</p></td> </tr> <tr> <td class="name"><code>X-Watson-Learning-Opt-Out</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>set to true to opt out of allowing Watson to use this request to improve its services</p></td> </tr> <tr> <td class="name"><code>smart_formatting</code></td> <td class="type"> <span class="param-type">Boolean</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> false </td> <td class="description last"><p>formats numeric values such as dates, times, currency, etc.</p></td> </tr> <tr> <td class="name"><code>customization_id</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="default"> </td> <td class="description last"><p>not yet supported on the public STT service</p></td> </tr> </tbody> </table> </td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line78">line 78</a> </li></ul></dd> </dl> </div> <h3 class="subsection-title">Methods</h3> <h4 class="name" id="stop"><span class="type-signature"></span>stop<span class="signature">()</span><span class="type-signature"></span></h4> <div class="description"> <p>Prevents any more audio from being sent over the WebSocket and gracefully closes the connection. 
Additional data may still be emitted up until the <code>end</code> event is triggered.</p> </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line363">line 363</a> </li></ul></dd> </dl> <h3 class="subsection-title">Events</h3> <h4 class="name" id="event:close">close</h4> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>reasonCode</code></td> <td class="type"> <span class="param-type">Number</span> </td> <td class="description last"></td> </tr> <tr> <td class="name"><code>description</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line190">line 190</a> </li></ul></dd> </dl> <h4 class="name" id="event:data">data</h4> <div class="description"> <p>Finalized text</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>transcript</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line261">line 261</a> </li></ul></dd> </dl> <h4 class="name" id="event:data">data</h4> <div class="description"> 
<p>Object with interim or final results, possibly including confidence scores, alternatives, and word timing.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>data</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line252">line 252</a> </li></ul></dd> </dl> <h4 class="name" id="event:error">error</h4> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th>Attributes</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>msg</code></td> <td class="type"> <span class="param-type">String</span> </td> <td class="attributes"> </td> <td class="description last"><p>custom error message</p></td> </tr> <tr> <td class="name"><code>frame</code></td> <td class="type"> <span class="param-type">*</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="description last"><p>unprocessed frame (should have a .data property with either string or binary data)</p></td> </tr> <tr> <td class="name"><code>err</code></td> <td class="type"> <span class="param-type">Error</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line198">line 198</a> </li></ul></dd> </dl> <h4 class="name" id="event:listening">listening</h4> <div class="description"> <p>Emitted 
when the Watson Service indicates readiness to transcribe audio. Any audio sent before this point is buffered until then.</p> </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line244">line 244</a> </li></ul></dd> </dl> <h4 class="name" id="event:message">message</h4> <div class="description"> <p>Emits any message received over the wire. Mainly used for debugging.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th>Attributes</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>message</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="attributes"> </td> <td class="description last"><p>frame object with a data attribute that's either a string or a Buffer/TypedArray</p></td> </tr> <tr> <td class="name"><code>data</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="attributes"> &lt;optional><br> </td> <td class="description last"><p>parsed JSON object (if possible)</p></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line226">line 226</a> </li></ul></dd> </dl> <h4 class="name" id="event:open">open</h4> <div class="description"> <p>Emitted once the WebSocket connection has been established.</p> </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line178">line 178</a> </li></ul></dd> </dl> <h4 class="name" 
id="event:send-data">send-data</h4> <div class="description"> <p>Emits any Binary object sent to the service from the client. Mainly used for debugging.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>msg</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line287">line 287</a> </li></ul></dd> </dl> <h4 class="name" id="event:send-json">send-json</h4> <div class="description"> <p>Emits any JSON object sent to the service from the client. Mainly used for debugging.</p> </div> <h5>Parameters:</h5> <table class="params"> <thead> <tr> <th>Name</th> <th>Type</th> <th class="last">Description</th> </tr> </thead> <tbody> <tr> <td class="name"><code>msg</code></td> <td class="type"> <span class="param-type">Object</span> </td> <td class="description last"></td> </tr> </tbody> </table> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line277">line 277</a> </li></ul></dd> </dl> <h4 class="name" id="event:stop">stop</h4> <div class="description"> <p>Event emitted when the stop method is called. 
Mainly for synchronising with file reading and playback.</p> </div> <dl class="details"> <dt class="tag-source">Source:</dt> <dd class="tag-source"><ul class="dummy"><li> <a href="speech-to-text_recognize-stream.js.html">speech-to-text/recognize-stream.js</a>, <a href="speech-to-text_recognize-stream.js.html#line364">line 364</a> </li></ul></dd> </dl> </article> </section> </div> <nav> <h2><a href="index.html">Home</a></h2><h3>Modules</h3><ul><li><a href="module-watson-speech.html">watson-speech</a></li><li><a href="module-watson-speech_speech-to-text.html">watson-speech/speech-to-text</a></li><li><a href="module-watson-speech_speech-to-text_get-models.html">watson-speech/speech-to-text/get-models</a></li><li><a href="module-watson-speech_speech-to-text_recognize-file.html">watson-speech/speech-to-text/recognize-file</a></li><li><a href="module-watson-speech_speech-to-text_recognize-microphone.html">watson-speech/speech-to-text/recognize-microphone</a></li><li><a href="module-watson-speech_text-to-speech.html">watson-speech/text-to-speech</a></li><li><a href="module-watson-speech_text-to-speech_get-voices.html">watson-speech/text-to-speech/get-voices</a></li><li><a href="module-watson-speech_text-to-speech_synthesize.html">watson-speech/text-to-speech/synthesize</a></li></ul><h3>Classes</h3><ul><li><a href="FilePlayer.html">FilePlayer</a></li><li><a href="FormatStream.html">FormatStream</a></li><li><a href="RecognizeStream.html">RecognizeStream</a></li><li><a href="ResultStream.html">ResultStream</a></li><li><a href="SpeakerStream.html">SpeakerStream</a></li><li><a href="TimingStream.html">TimingStream</a></li><li><a href="UrlPlayer.html">UrlPlayer</a></li><li><a href="WebAudioL16Stream.html">WebAudioL16Stream</a></li><li><a href="WritableElementStream.html">WritableElementStream</a></li></ul><h3>Events</h3><ul><li><a href="RecognizeStream.html#event:close">close</a></li><li><a href="RecognizeStream.html#event:data">data</a></li><li><a 
href="RecognizeStream.html#event:error">error</a></li><li><a href="RecognizeStream.html#event:listening">listening</a></li><li><a href="RecognizeStream.html#event:message">message</a></li><li><a href="RecognizeStream.html#event:open">open</a></li><li><a href="RecognizeStream.html#event:send-data">send-data</a></li><li><a href="RecognizeStream.html#event:send-json">send-json</a></li><li><a href="RecognizeStream.html#event:stop">stop</a></li><li><a href="SpeakerStream.html#event:data">data</a></li></ul><h3>Global</h3><ul><li><a href="global.html#getContentTypeFromFile">getContentTypeFromFile</a></li><li><a href="global.html#playFile">playFile</a></li></ul> </nav> <br class="clear"> <footer> Documentation generated by <a href="https://github.com/jsdoc3/jsdoc">JSDoc 3.4.3</a> on Tue Feb 21 2017 17:41:51 GMT+0000 (UTC) </footer> <script> prettyPrint(); </script> <script src="scripts/linenumber.js"> </script> </body> </html>