# Intent tools
Tools for processing intents from RASA JSONs to Facebook fast-text learning sets.
The Rasa JSON format is used as the input because there is a good GUI for editing it: [Rasa NLU Trainer](https://rasahq.github.io/rasa-nlu-trainer/)
## CLI Usage
- **Convert RASA json to fast-text learning set**
```bash
$ intools jsonToText ./testData.json ./testData.txt
```
- **Convert RASA json to fast-text learning set and multiply by entities**
```bash
$ intools jsonToText -m ./testData.json ./testData.txt
```
- **Make word vectors learning set from wiki XML export**
```bash
$ intools wikiToText ./testData.xml ./testData.txt
```
-----------------
# API
## Classes
<dl>
<dt><a href="#MultiplicatorStream">MultiplicatorStream</a></dt>
<dd></dd>
<dt><a href="#Pipeline">Pipeline</a></dt>
<dd></dd>
</dl>
## Functions
<dl>
<dt><a href="#jsonToText">jsonToText(input, output, [pipeline], [mapFn])</a> ⇒ <code>Promise</code></dt>
<dd><p>Create fast-text learning set from Rasa intents json</p>
</dd>
<dt><a href="#wikiToText">wikiToText(input, output, [mapFn])</a> ⇒ <code>Promise</code></dt>
<dd><p>Create a pretrained word vectors learning set from Wikipedia XML dump</p>
</dd>
<dt><a href="#normalize">normalize(str)</a></dt>
<dd><p>Preserves only letters (with or without diacritics) and lowercases everything</p>
</dd>
</dl>
<a name="MultiplicatorStream"></a>
## MultiplicatorStream
**Kind**: global class
* [MultiplicatorStream](#MultiplicatorStream)
* [new MultiplicatorStream()](#new_MultiplicatorStream_new)
* [new MultiplicatorStream(getVariants)](#new_MultiplicatorStream_new)
<a name="new_MultiplicatorStream_new"></a>
### new MultiplicatorStream()
Multiplies the learning set data using the available entity information
**Example**
```javascript
const path = require('path');
const { EntitiesFromJson, MultiplicatorStream, jsonToText } = require('intent-tools');

const from = path.resolve(process.cwd(), 'sample.json');
const to = path.resolve(process.cwd(), 'trainingData.txt');

// entities are read from the same Rasa JSON as the intents
const entities = new EntitiesFromJson(from);

const pipeline = [
    new MultiplicatorStream((cat, word) => entities.getWordList(cat, word))
];

// load the entities first, then run the conversion
entities.loadEntities()
    .then(() => jsonToText(from, to, pipeline))
    .catch(e => console.error(e));
```
<a name="new_MultiplicatorStream_new"></a>
### new MultiplicatorStream(getVariants)
| Param | Type |
| --- | --- |
| getVariants | <code>function</code> |
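
The `getVariants` callback is documented only by its type. Judging from the example above, it receives a category and a word and returns a list of words. The following is a minimal sketch of such a callback backed by a plain object instead of `EntitiesFromJson`. The table contents, the array return type, and the choice to keep the original word in the result are assumptions made for illustration.

```javascript
// Hypothetical in-memory entity table; the category and words are made up.
const variantsByCategory = {
    city: ['prague', 'brno', 'ostrava']
};

// Assumed contract: (category, word) => array of words to substitute.
function getVariants (category, word) {
    const list = variantsByCategory[category] || [];
    // keep the original word first and drop duplicates
    return Array.from(new Set([word, ...list]));
}
```

A function like this could be passed as `new MultiplicatorStream(getVariants)`, provided the stream accepts a plain array as the return value.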
<a name="Pipeline"></a>
## Pipeline
**Kind**: global class
* [Pipeline](#Pipeline)
* [new Pipeline()](#new_Pipeline_new)
* [.add(pipe)](#Pipeline+add) ⇒ <code>this</code>
* [.promise()](#Pipeline+promise) ⇒ <code>promise</code>
<a name="new_Pipeline_new"></a>
### new Pipeline()
A simple tool that creates a Promise from a pipeline of streams
<a name="Pipeline+add"></a>
### pipeline.add(pipe) ⇒ <code>this</code>
Append a stream
**Kind**: instance method of [<code>Pipeline</code>](#Pipeline)
| Param | Type | Description |
| --- | --- | --- |
| pipe | <code>ReadableStream</code> \| <code>Writable</code> | the transform stream |
<a name="Pipeline+promise"></a>
### pipeline.promise() ⇒ <code>promise</code>
Get a promise
**Kind**: instance method of [<code>Pipeline</code>](#Pipeline)
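
To illustrate the idea of turning a chain of streams into a single Promise, here is a minimal sketch that uses only Node.js built-ins (`stream.pipeline` wrapped with `util.promisify`). It is not the `Pipeline` class from this package, and the `upperCase`, `sink`, and `done` names are hypothetical.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');
const { promisify } = require('util');

const pipelineAsync = promisify(pipeline);

// Upper-cases every chunk; stands in for any transform step.
const upperCase = new Transform({
    transform (chunk, encoding, callback) {
        callback(null, chunk.toString().toUpperCase());
    }
});

// Collects everything that reaches the end of the chain.
const collected = [];
const sink = new Writable({
    write (chunk, encoding, callback) {
        collected.push(chunk.toString());
        callback();
    }
});

// Resolves once every stream has finished, rejects on the first error.
const done = pipelineAsync(Readable.from(['a', 'b']), upperCase, sink)
    .then(() => collected.join(''));
```

The resulting promise can be chained with `.then()` and `.catch()` like the ones returned by `jsonToText` and `wikiToText`.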
<a name="jsonToText"></a>
## jsonToText(input, output, [pipeline], [mapFn]) ⇒ <code>Promise</code>
Create fast-text learning set from Rasa intents json
**Kind**: global function
| Param | Type | Default | Description |
| --- | --- | --- | --- |
| input | <code>string</code> \| <code>ReadableStream</code> | | path of Rasa intent set or stream |
| output | <code>string</code> \| <code>Writable</code> | | path or stream to write fast-text learning set |
| [pipeline] | <code>Array</code> | | array of transform streams to modify the learning set |
| [mapFn] | <code>function</code> | <code></code> | text normalizer function |
**Example**
```javascript
const path = require('path');
const { jsonToText } = require('intent-tools');
const from = path.resolve(process.cwd(), 'sample.json');
const to = path.resolve(process.cwd(), 'trainingData.txt');

// jsonToText is already destructured from the module above
jsonToText(from, to)
    .catch(e => console.error(e));
```
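
For orientation, the conversion can be pictured as mapping each Rasa example to one labelled line. The sketch below is not this package's implementation. It assumes the usual Rasa NLU layout (`rasa_nlu_data.common_examples` with `text` and `intent`) and fastText's default `__label__` prefix. The `rasaToFastTextLines` helper and its default `mapFn` are hypothetical.

```javascript
// Assumed input layout: rasa_nlu_data.common_examples[].{ text, intent }
function rasaToFastTextLines (rasaJson, mapFn = (s) => s.toLowerCase()) {
    const data = rasaJson.rasa_nlu_data || {};
    const examples = data.common_examples || [];
    // one "__label__<intent> <text>" line per example
    return examples.map(({ intent, text }) => `__label__${intent} ${mapFn(text)}`);
}

const sample = {
    rasa_nlu_data: {
        common_examples: [
            { text: 'Hello there', intent: 'greeting', entities: [] },
            { text: 'Bye', intent: 'goodbye', entities: [] }
        ]
    }
};

const lines = rasaToFastTextLines(sample);
// lines[0] is '__label__greeting hello there'
```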
<a name="wikiToText"></a>
## wikiToText(input, output, [mapFn]) ⇒ <code>Promise</code>
Create a pretrained word vectors learning set from Wikipedia XML dump
**Kind**: global function
| Param | Type | Default | Description |
| --- | --- | --- | --- |
| input | <code>string</code> \| <code>ReadableStream</code> | | path of Wikipedia XML dump or stream |
| output | <code>string</code> \| <code>Writable</code> | | path or stream to write the word vectors learning set |
| [mapFn] | <code>function</code> | <code></code> | text normalizer function |
<a name="normalize"></a>
## normalize(str)
Preserves only letters (with or without diacritics) and lowercases everything
**Kind**: global function
**Returns**: <code>string</code>
| Param | Type | Description |
| --- | --- | --- |
| str | <code>string</code> | input string |
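
A minimal sketch of a normalizer in the same spirit, which could serve as a custom `mapFn`. It is not the package's `normalize` source. The name `normalizeSketch` is hypothetical, and collapsing non-letter runs into a single space (rather than removing them) is an assumption.

```javascript
// Keeps Unicode letters (including letters with diacritics) and lowercases the result.
// Assumption: runs of non-letters become a single space so words stay separated.
function normalizeSketch (str) {
    return str
        .replace(/[^\p{L}]+/gu, ' ')
        .trim()
        .toLowerCase();
}

const normalized = normalizeSketch('Ahoj, Světe!');
// normalized is 'ahoj světe'
```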