# regexp-stream-tokenizer

[![Version](https://img.shields.io/npm/v/regexp-stream-tokenizer.svg)](https://npmjs.com/package/regexp-stream-tokenizer) [![License](https://img.shields.io/npm/l/regexp-stream-tokenizer.svg)](https://npmjs.com/package/regexp-stream-tokenizer) [![Build Status](https://img.shields.io/travis/jamesramsay/regexp-stream-tokenizer.svg)](https://travis-ci.org/jamesramsay/regexp-stream-tokenizer) [![Coverage Status](https://img.shields.io/codecov/c/github/jamesramsay/regexp-stream-tokenizer.svg)](https://codecov.io/github/jamesramsay/regexp-stream-tokenizer) [![Dependency Status](https://img.shields.io/david/jamesramsay/regexp-stream-tokenizer.svg)](https://david-dm.org/jamesramsay/regexp-stream-tokenizer)

[![NPM](https://nodei.co/npm/regexp-stream-tokenizer.png)](https://nodei.co/npm/regexp-stream-tokenizer/)

This is a simple regular expression based tokenizer for streams.

**IMPORTANT:** If you return `null` from your function, the stream will end there.

**IMPORTANT:** Only supports object mode streams.

```javascript
var tokenizer = require("regexp-stream-tokenizer");

// `sink` stands for any object mode writable stream
var words = tokenizer(/\w+/g);

// Sink receives tokens: 'The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog'
words.write('The quick brown fox jumps over the lazy dog');
words.pipe(sink)

// Separators are excluded by default, but can be included
var wordsAndSeparators = tokenizer({ separator: true }, /\w+/g);

// Sink receives tokens: 'The', ' ', 'quick', ' ', 'brown', ' ', 'fox', ' ', 'jumps', ' ', 'over', ...
wordsAndSeparators.write('The quick brown fox jumps over the lazy dog');
wordsAndSeparators.pipe(sink)
```

## API

```javascript
require("regexp-stream-tokenizer")([options,] regexp)
```

Create a `stream.Transform` instance with `objectMode: true` that will tokenize the input stream using the regexp.

```javascript
var Tx = require("regexp-stream-tokenizer").ctor([options,] regexp)
```

Create a reusable `stream.Transform` TYPE that can be called via `new Tx` or `Tx()` to create an instance.

__Arguments__

- `options`
  - `excludeZBS` (boolean): defaults to `true`.
  - `token` (boolean|string|function): defaults to `true`.
  - `separator` (boolean|string|function): defaults to `false`.
  - `leaveBehind` (string|Array): optionally provides pseudo-lookbehind support.
  - all other through2 options.
- `regexp` (RegExp): The regular expression used to tokenize the stream.
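
To make the token/separator distinction concrete, here is a minimal sketch built only on Node's core `stream.Transform`. It is not this package's implementation: `splitByRegexp` and `sketchTokenizer` are hypothetical names, each written chunk is assumed to be tokenized independently, and empty matches are simply skipped. It only illustrates the idea that text matched by the regexp becomes a token and the text between matches becomes a separator.

```javascript
const { Transform } = require("stream");

// Hypothetical helper (not part of regexp-stream-tokenizer): split `text`
// into regexp matches (tokens) and, optionally, the text between matches
// (separators).
function splitByRegexp(text, regexp, includeSeparators) {
  // matchAll requires a global regexp, so add the "g" flag if it is missing.
  const flags = regexp.flags.includes("g") ? regexp.flags : regexp.flags + "g";
  const re = new RegExp(regexp.source, flags);
  const pieces = [];
  let last = 0;
  for (const match of text.matchAll(re)) {
    if (match[0] === "") continue; // assumption: empty matches are ignored
    if (includeSeparators && match.index > last) {
      pieces.push(text.slice(last, match.index)); // text before this token
    }
    pieces.push(match[0]); // the token itself
    last = match.index + match[0].length;
  }
  if (includeSeparators && last < text.length) {
    pieces.push(text.slice(last)); // trailing separator
  }
  return pieces;
}

// Hypothetical object mode Transform that pushes one piece per object.
function sketchTokenizer(regexp, includeSeparators = false) {
  return new Transform({
    objectMode: true,
    transform(chunk, _encoding, callback) {
      for (const piece of splitByRegexp(String(chunk), regexp, includeSeparators)) {
        this.push(piece);
      }
      callback();
    },
  });
}

// Usage: collect every piece, then print them once the stream ends.
const received = [];
const words = sketchTokenizer(/\w+/g, true);
words.on("data", (piece) => received.push(piece));
words.on("end", () => console.log(received));
words.end("The quick fox");
```

With `includeSeparators` set to `false`, only the matched words are pushed, which mirrors the documented default of `separator: false`.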