Regular Expression Detection & Replacement streams.

<div align="center"><p align="center">

# restream

[![npm version](https://badge.fury.io/js/restream.svg)](https://www.npmjs.com/package/restream)
<a href="https://gitlab.com/artdeco/restream/-/commits/master">
  <img src="https://gitlab.com/artdeco/restream/badges/master/pipeline.svg" alt="Pipeline Badge">
</a>

</p></div>

**Restream**: Regular expression detection implemented as a `Transform` stream;

_and_

**Replaceable**: Regex-based replacement stream that updates incoming data on the fly (possibly with async functions). It transforms data from a stream using a set of regular expressions, and allows building complex pipelines for transforming strings with cut-and-paste rules that prevent certain rules from working on undesired pieces of input;

_including_

- **SyncReplaceable**: The synchronous version of _Replaceable_ that is just a function and not a stream. It returns the result immediately and is deterministic;
- **SerialAsyncReplaceable**: When rules use asynchronous replacements, the `serial-async` instance provides a way to run replacements detected with a global regular expression one by one rather than in parallel.

---

## How _Replaceable_ Works

<img src="https://artdeco.gitlab.io/restream/Replaceable.png" alt="Replaceable diagram">

```
yarn add restream
npm i restream
```

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/0.svg">
</a></p></div>

## Table of Contents

- [How _Replaceable_ Works](#how-replaceable-works)
- [Table of Contents](#table-of-contents)
- [API](#api)
- [`restream(regex: !RegExp): !stream.Transform`](#restreamregex-regexp-streamtransform)
- [`Replaceable` Class](#replaceable-class)
  * [`Replaceable`](#type-replaceable)
  * [`Rule` Type](#rule-type)
    * [<strong><code>re*</code></strong>](#re)
    * [<strong><code>replacement*</code></strong>](#replacement)
    * [`String` Replacement](#string-replacement)
    * [`Function` Replacer](#function-replacer)
    * [`Async Function` Replacer](#async-function-replacer)
- [`constructor(rules: !(Rule|Array<!Rule>), options=: !stream.TransformOptions)`](#constructorrules-rulearrayruleoptions-streamtransformoptions-replaceable)
  * [`Replacer` Context](#replacer-context)
- [`brake(): void`](#brake-void)
- [`async replace(data: !(string|Buffer|stream.Stream), context=: !Object<string, *>): !Promise<string>`](#async-replacedata-stringbufferstreamstreamcontext-objectstring--promise)
  * [`Replacer` Errors](#replacer-errors)
  * [`static replace`](#static-replace)
  * [Collecting Into Catchment](#collecting-into-catchment)
- [`SerialAsyncReplaceable` Class](#serialasyncreplaceable-class)
  * [`SerialAsyncReplaceable`](#type-serialasyncreplaceable)
- [`restream(regex: !RegExp): !stream.Transform`](#restreamregex-regexp-streamtransform)
- [Markers](#markers)
  * [`makeMarkers(matchers, config=): !Object.<string, !Marker>`](#makemarkersmatchers-objectstring-regexpconfig-makemarkersconfig-objectstring-marker)
    * [`MakeMarkersConfig`](#type-makemarkersconfig)
  * [`makeCutRule(marker): !Rule`](#makecutrulemarker-marker-rule)
  * [`makePasteRule(marker, pipeRules=): !Rule`](#makepasterulemarker-markerpiperules-rulearrayrule-rule)
  * [Accessing Replacements](#accessing-replacements)
- [Related Packages](#related-packages)
- [Copyright & License](#copyright--license)

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/1.svg">
</a></p></div>

## API

The package contains the default `restream` function and a family of `Replaceable` classes, as well as functions to create [markers](#markers) and their cut and paste rules. The `replace` function can be used to end a replaceable instance with some data to transform it.

```js
import restream, {
  Replaceable,
  SyncReplaceable,
  SerialAsyncReplaceable,
  makeMarkers,
  makeCutRule,
  makePasteRule,
  replace,
} from 'restream'
```

The types and [externs](externs.js) for _Google Closure Compiler_ via [**_Depack_**](https://github.com/dpck/depack) are defined in the `_restream` namespace.

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/2.svg">
</a></p></div>

## <code><ins>restream</ins>(</code><sub><br/>&nbsp;&nbsp;`regex: !RegExp,`<br/></sub><code>): <i>!stream.Transform</i></code>
Create a _Transform_ stream which will maintain a buffer with data received from a _Readable_ stream and write data when the buffer can be matched against the regex. It will push the whole match object (or objects when the `g` flag is used) returned by `/regex/.exec(buffer)`.

 - <kbd><strong>regex*</strong></kbd> <em>`!RegExp`</em>: The regular expression to execute.

The `Transform` stream will buffer incoming data and push regex results when matches can be made, i.e. when `regex.exec` returns a non-null value. When the `g` flag is added to the regex, multiple matches will be detected.
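The matching step described above can be pictured without any streams. The sketch below is plain JavaScript and not restream's implementation; `collectMatches` is a hypothetical helper that runs `regex.exec` on an already buffered string and gathers every non-null result.

```js
// Hypothetical helper (not part of restream): gather the results of
// `regex.exec` on a buffered string, the way the text above describes.
const collectMatches = (regex, buffer) => {
  const matches = []
  if (!regex.global) {
    // Without the `g` flag, at most one match can be made.
    const single = regex.exec(buffer)
    if (single) matches.push(single)
    return matches
  }
  regex.lastIndex = 0
  let m
  while ((m = regex.exec(buffer)) !== null) {
    matches.push(m)
    // Guard against an endless loop on zero-length matches.
    if (m[0] === '') regex.lastIndex++
  }
  return matches
}

const found = collectMatches(/{(\d+)}/g, 'test-string-{12345}-{67890}')
console.log(found.map(m => [m[0], m[1], m.index]))
```

Each entry is the same kind of match array that `exec` returns, so the captured groups and the `index` property are available on every result.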
```js import restream from 'restream' (async () => { try { const rs = createReadable('test-string-{12345}-{67890}') const stream = restream(/{(\d+)}/g) // create a transform stream rs.pipe(stream) const { data, ws } = createWritable() stream.pipe(ws) ws.once('finish', () => { console.log(data) }) } catch (err) { console.error(err) } })() ``` ```js [ [ '{12345}', '12345', index: 12, input: 'test-string-{12345}-{67890}', groups: undefined ], [ '{67890}', '67890', index: 20, input: 'test-string-{12345}-{67890}', groups: undefined ] ] ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/3.svg"> </a></p></div> ## `Replaceable` Class A _Replaceable_ transform stream can be used to transform data according to a single or multiple rules. <strong><a name="type-replaceable">`Replaceable`</a> extends <a href="https://nodejs.org/api/stream.html#stream_class_stream_transform" title="A duplex stream that receives data as Writable, transforms this data, and pushes it as Readable via the `transform` method implementation."><code><img src=".documentary/type-icons/node.png" alt="Node.JS Docs">stream.Transform</code></a></strong>: An interface for the context accessible via this in replacer functions. 
| Name | Type | Description | | --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | __constructor__ | <em>new (rules: !(Rule \| Array&lt;!Rule&gt;), options?: !stream.TransformOptions) => <a href="#type-replaceable" title="An interface for the context accessible via this in replacer functions.">Replaceable</a></em> | Constructor method. | | __brake__ | <em>() => void</em> | After calling this method, any of the following rules and matches within the same rule won't be able to make any more changes. | | __replace__ | <em>(data: !(string \| Buffer \| <a href="https://nodejs.org/api/stream.html#stream" title="Handles streaming data in Node.JS."><img src=".documentary/type-icons/node-odd.png" alt="Node.JS Docs">stream.Stream</a>), context?: !Object&lt;string, *&gt;) => !Promise&lt;string&gt;</em> | Creates a new replaceable to replace the given string, buffer or stream using the rules of the current stream. Calling `brake` will also set `_broke` on the parent stream. The new _Replaceable_ will copy the rules, and be assigned the context to it before replacing data. 
The `this` won't be shared by parent and child rules, but the context will be updated: `const context = { test: this.test }; content = await this.replace(content, context); this.test = context.test`. | <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/4.svg"> </a></p></div> ### `Rule` Type _Replaceable_ uses rules to determine how to transform data. Below is the description of the `Rule` type. <table> <thead> <tr> <th>Property</th> <th>Type</th> <th>Description</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td><a name="re"><strong><code>re*</code></strong></a></td> <td><em>RegExp</em></td> <td>A regular expression.</td> <td>Detect inline code blocks in markdown: <code>/`(.+?)`/</code>.</td> </tr> <tr> <td><a name="replacement"><strong><code>replacement*</code></strong></a></td> <td><em>string | function | async function</em></td> <td>A replacer either as a <a href="#string-replacement">string</a>, <a href="#function-replacer">function</a>, or <a href="#async-function-replacer">async function</a>. It will be passed to the <code>string.replace(re, replacement)</code> native JavaScript method.</td> <td>As a string: <code>INLINE_CODE</code>.</td> </tr> </tbody> </table> ##### `String` Replacement Replacement as a string. Given a simple string, it will replace a match detected by the rule's regular expression, without consideration for the capturing groups. ##### `Function` Replacer Replacement as a function. See [MDN](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_function_as_a_parameter) for more documentation on how the replacer function should be implemented. The example below allows to replace strings like `%NPM: documentary%` and `%NPM: @rqt/aqt%` into a markdown badge (used in [`documentary`](https://www.npmjs.com/package/documentary)). 
<table> <tr></tr> <tr><td> ```js const syncRule = { re: /^%NPM: ((?:[@\w\d-_]+\/)?[\w\d-_]+)%$/gm, replacement(match, name) { const n = encodeURIComponent(name) const svg = `https://badge.fury.io/js/${n}.svg` const link = `https://npmjs.org/package/${name}` return `[![npm version](${svg})](${link})` }, } ``` </td></tr> </table> ##### `Async Function` Replacer An asynchronous function to get replacements. The stream won't push any data until the replacer's promise is resolved. Due to implementation details, the regex will have to be run against incoming chunks twice, therefore it might be not ideal for heavy-load applications with many matches. This example will replace strings like `%FORK-js: example example/Replaceable.js%` into the output of a forked JavaScript program (used in [`documentary`](https://www.npmjs.com/package/documentary)). <table> <tr></tr> <tr><td> ```js import { fork } from 'spawncommand' const codeSurround = (m, lang = '') => `\`\`\`${lang}\n${m.trim()}\n\`\`\`` const forkRule = { re: /%FORK(?:-(\w+))? (.+)%/mg, async replacement(match, lang, m) { const [mod, ...args] = m.split(' ') const { promise } = fork(mod, args, { execArgv: [], stdio: 'pipe', }) const { stdout } = await promise return codeSurround(stdout, lang) }, } ``` </td></tr> </table> <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/5.svg"> </a></p></div> ## <code><ins>constructor</ins>(</code><sub><br/>&nbsp;&nbsp;`rules: !(Rule|Array<!Rule>),`<br/>&nbsp;&nbsp;`options=: !stream.TransformOptions,`<br/></sub><code>): <i>Replaceable</i></code> Constructor method. - <kbd><strong>rules*</strong></kbd> <em><code>!(Rule \| Array&lt;!Rule&gt;)</code></em>: An array with rules, or a single rule. - <kbd>options</kbd> <em>`!stream.TransformOptions`</em> (optional): Options for the transform stream. 
Create a _Transform_ stream which will make data available when an incoming chunk has been updated according to the specified rule or rules. The second argument will be passed as options to the _Transform_ constructor if specified. Matches can be replaced using a string, function or async function. When multiple rules are passed as an array, the string will be replaced multiple times if the latter rules also modify the data. ```js import { Replaceable } from 'restream' const dateRule = { re: /%DATE%/g, replacement: new Date().toLocaleString(), } const emRule = { re: /__(.+?)__/g, replacement(match, p1) { return `<em>${p1}</em>` }, } const authorRule = { re: /^%AUTHOR_ID: (.+?)%$/mg, async replacement(match, id) { const name = await new Promise(resolve => { // pretend to lookup author name from the database const authors = { 5: 'John' } resolve(authors[id]) }) return `Author: <strong>${name}</strong>` }, } const STRING = ` Hello __Fred__, your username is __fred__. You have __5__ stars. %AUTHOR_ID: 5% on __%DATE%__ ` const replaceable = new Replaceable([ dateRule, emRule, authorRule, ]) const rs = createReadable(STRING) rs .pipe(replaceable) .pipe(process.stdout) ``` Output: ```html Hello <em>Fred</em>, your username is <em>fred</em>. You have <em>5</em> stars. Author: <strong>John</strong> on <em>5/2/2020, 19:55:12</em> ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/6.svg"> </a></p></div> ### `Replacer` Context Replacer functions will be executed with their context set to the _Replaceable_ instance to which they belong. Both `sync` and `async` replacers can use the `this` keyword to access their _Replaceable_ instance and modify its properties and/or emit events. This is done so that there's a mechanism by which replacers can share data between themselves. 
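The sharing mechanism can be pictured with a small sketch in plain JavaScript. It is not how restream is implemented: `applyRules` is a hypothetical synchronous helper, and the `%name: ...%` and `%GREETING%` tokens are made up for illustration. The only point is that every function replacer is called with the same `this`.

```js
// Hypothetical helper (not part of restream): apply rules in order and call
// function replacers with `this` bound to one shared context object.
const applyRules = (input, rules, context = {}) => {
  return rules.reduce((text, { re, replacement }) => {
    if (typeof replacement != 'function') return text.replace(re, replacement)
    return text.replace(re, (...args) => replacement.apply(context, args))
  }, input)
}

const rememberRule = {
  re: /^%name: (.+?)%$/m,
  replacement(match, name) {
    this.name = name // remember the value for the following rules
    return match
  },
}
const greetRule = {
  re: /%GREETING%/g,
  replacement() {
    return `Hello, ${this.name}`
  },
}

const output = applyRules('%name: Fred%\n%GREETING%', [rememberRule, greetRule])
console.log(output)
```

Replacers have to be written as methods or `function` expressions for this to work, since arrow functions do not receive a `this` binding from `apply`.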
For example, we might want to read and parse an external file first, but remember its data for use in following replacers. Given an external file `example/types.json`: ```json { "TypeA": "A new type with certain properties.", "TypeB": "A type to represent the state of the world." } ``` _Replaceable_ can read it in the first `typesRule` rule, and reference its data in the second `paramRule` rule: ```js /** yarn example/context.js */ import { collect } from 'catchment' import { createReadStream } from 'fs' import { Replaceable } from 'restream' import { createReadable } from './lib' const typesRule = { re: /^%types: (.+?)%$/mg, async replacement(match, location) { const rs = createReadStream(location) const d = await collect(rs) const j = JSON.parse(d) this.types = j // remember types for access in following rules return match }, } const paramRule = { re: /^ \* @typedef {(.+?)} (.+)(?: .*)?/mg, replacement(match, type, typeName) { const description = this.types[typeName] if (!description) return match return ` * @typedef {${type}} ${typeName} ${description}` }, } const STRING = ` %types: example/types.json% /** * @typedef {Object} TypeA */ ` const replaceable = new Replaceable([ typesRule, paramRule, ]) const rs = createReadable(STRING) rs .pipe(replaceable) .pipe(process.stdout) ``` ```js %types: example/types.json% /** * @typedef {Object} TypeA A new type with certain properties. */ ``` As can be seen above, the description of the type was automatically updated based on the data read from the file. All methods on the Replaceable instance can be accessed via `this`. <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/7.svg"> </a></p></div> ## <code><ins>brake</ins>(): <i>void</i></code> After calling this method, any of the following rules and matches within the same rule won't be able to make any more changes. 
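The effect of `brake` can be sketched with a flag that is checked before every rule and every match. This is a simplified, synchronous illustration under the assumption that a `_broke` flag is all that is needed; `MiniReplaceable` and its `run` method are hypothetical and are not the restream class.

```js
// Hypothetical class (not part of restream) that mimics the `brake` semantics.
class MiniReplaceable {
  constructor(rules) {
    this.rules = rules
    this._broke = false
  }
  brake() {
    this._broke = true
  }
  run(input) {
    return this.rules.reduce((text, { re, replacement }) => {
      if (this._broke) return text // skip the following rules
      return text.replace(re, (match, ...args) => {
        if (this._broke) return match // skip later matches of the same rule
        return replacement.call(this, match, ...args)
      })
    }, input)
  }
}

const mini = new MiniReplaceable([
  {
    re: /AAA/g,
    replacement() {
      this.brake()
      return 'BBB'
    },
  },
  {
    re: /AAA/g,
    replacement() {
      return 'RRR'
    },
  },
])

const braked = mini.run('AAA AAA AAA AAA')
console.log(braked)
```

Only the first match is replaced: the remaining matches of the first rule are returned unchanged, and the second rule never runs.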
The `brake` method allows to stop further rules from processing incoming chunks. If a replacer function is run with a global regex, the succeeding replacements will also have no effect. ```js import { Replaceable } from 'restream' (async () => { const replaceable = new Replaceable([ { re: /AAA/g, replacement() { this.brake() // prevent further replacements return 'BBB' }, }, { re: /AAA/g, replacement() { return 'RRR' }, }, ]) replaceable.pipe(process.stdout) replaceable.end('AAA AAA AAA AAA') })() ``` ``` BBB AAA AAA AAA ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/8.svg"> </a></p></div> ## <code>async <ins>replace</ins>(</code><sub><br/>&nbsp;&nbsp;`data: !(string|Buffer|stream.Stream),`<br/>&nbsp;&nbsp;`context=: !Object<string, *>,`<br/></sub><code>): <i>!Promise<string></i></code> Creates a new replaceable to replace the given string, buffer or stream using the rules of the current stream. Calling `brake` will also set `_broke` on the parent stream. The new _Replaceable_ will copy the rules, and be assigned the context to it before replacing data. The `this` won't be shared by parent and child rules, but the context will be updated: `const context = { test: this.test }; content = await this.replace(content, context); this.test = context.test`. - <kbd><strong>data*</strong></kbd> <em><code>!(string \| Buffer \| <a href="https://nodejs.org/api/stream.html#stream" title="Handles streaming data in Node.JS."><img src=".documentary/type-icons/node.png" alt="Node.JS Docs">stream.Stream</a>)</code></em>: The input data to replace via forked _Replaceable_. - <kbd>context</kbd> <em><code>!Object&lt;string, *&gt;</code></em> (optional): The context to assign to the new _Replaceable_. The rules can recursively spawn new instances of the _Replaceable_ instance without having to implement them manually. 
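For comparison, implementing such recursion manually might look like the sketch below. It is synchronous plain JavaScript with no restream calls, and `replaceNested` is a hypothetical helper; an async replacer would additionally need to await the nested call.

```js
// Hypothetical helper (not part of restream): recursion written by hand.
const replaceNested = (input) => {
  // The regex literal is evaluated on every call, so nested calls never
  // share `lastIndex` state with the outer one.
  return input.replace(/<(.+?)>([\s\S]+)<\/\1>/gm, (m, tag, content) => {
    const inner = replaceNested(content) // descend into the matched content
    return `<${tag}-replaced>${inner}</${tag}-replaced>`
  })
}

const nested = replaceNested('<div> <span>Hello World</span> </div>')
console.log(nested)
```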
For example, we might detect a match where the content potentially has other matches, but the regex only works on the outer one. In such cases, the async `replace` method can be used. ```js import { Replaceable } from 'restream' const replaceable = new Replaceable({ re: /<(.+?)>([\s\S]+)<\/\1>/gm, async replacement(m, tag, content) { content = await this.replace(content) return `<${tag}-replaced>${content}</${tag}-replaced>` }, }) const html = `<div> <span>Hello World</span> </div>` const naive = html.replace(/<(.+?)>([\s\S]+)<\/\1>/gm, (m, tag, content) => { console.log('Plain regexp detected tag <%s>', tag) // even if the actual match is returned, the inner tag won't be detected return `<${tag}-replaced>${content}</${tag}-replaced>` }) console.log('Only the outer match is detected: %s\n---', naive) ;(async () => { const res = await Replaceable.replace(replaceable, html) console.log('replaceable.replace finds matches in children:', res) })() ``` ```html Plain regexp detected tag <div> Only the outer match is detected: <div-replaced> <span>Hello World</span> </div-replaced> --- replaceable.replace finds matches in children: <div-replaced> <span-replaced>Hello World</span-replaced> </div-replaced> ``` It supports passing of the `context` argument because the child rules don't inherit the `this` property (this might change in the next version). However, since the `replace` method is async, the properties access to which is shared by rules (either siblings, or children/parents) must be accessed via an object, because otherwise it's going to be the values of parallel lane contexts that get modified and not the overall context (as shown by the last detection on the example below). 
```js import { Replaceable } from 'restream' const replaceable = new Replaceable({ re: /<(.+?)>([\s\S]+)<\/\1>/gm, async replacement(m, tag, content) { console.log('Total found: %s, replacer lane: %s [%s]', this.context.found, this.lane, tag) if (this.context.found > 2) { this.brake() return m } this.context.found++ this.lane++ content = await this.replace(content, { context: this.context, lane: this.lane, }) return `<${tag}-replaced>${content}</${tag}-replaced>` }, }) const html = `<div> <details> <summary>Restream</summary> 2019 </details> <span>Hello World</span> <address>London</address> <em>Art Deco</em> </div>` ;(async () => { replaceable.context = { found: 0 } replaceable.lane = 0 const res = await Replaceable.replace(replaceable, html) console.log() console.log(res) })() ``` ```html Total found: 0, replacer lane: 0 [div] Total found: 1, replacer lane: 1 [details] Total found: 2, replacer lane: 2 [span] Total found: 3, replacer lane: 3 [address] Total found: 3, replacer lane: 2 [summary] <div-replaced> <details-replaced> <summary>Restream</summary> 2019 </details-replaced> <span-replaced>Hello World</span-replaced> <address>London</address> <em>Art Deco</em> </div-replaced> ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/9.svg"> </a></p></div> ### `Replacer` Errors If an error happens in a `sync` or `async` replacer function, the `Replaceable` will emit it and close. 
```js
/** yarn example/errors.js */
import { Replaceable } from 'restream'
import { createReadable } from './lib'

const replace = () => {
  throw new Error('An error occurred during a replacement.')
}

;(async () => {
  const rs = createReadable('example-string')

  const replaceable = new Replaceable([
    {
      re: /.*/,
      replacement(match) {
        return replace(match)
      },
    },
  ])

  rs
    .pipe(replaceable)
    .on('error', (error) => {
      console.log(error)
    })
})()
```
```js
Error: An error occurred during a replacement.
    at replace (/Users/anton/artdeco/restream/example/errors.js:6:9)
    at Replaceable.replacement (/Users/anton/artdeco/restream/example/errors.js:16:16)
```

<div align="center"><p align="center"><a href="#table-of-contents">
  <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/10.svg">
</a></p></div>

### `static replace`

The static `.replace` method feeds data into the stream and waits until it finishes execution. This works for strings, buffers and streams.

```js
import { Replaceable } from 'restream'
import { Readable } from 'stream'

const example = {
  // A getter, so that each call below receives a fresh Replaceable.
  get replaceable() {
    const r = new Replaceable({
      re: /hello/,
      replacement: 'hi',
    })
    return r
  },
}

;(async () => {
  const string = await Replaceable.replace(
    example.replaceable, 'hello string world')
  console.log(string)

  // Buffer.from replaces the deprecated `new Buffer(...)` constructor.
  const buffer = await Replaceable.replace(
    example.replaceable, Buffer.from('hello buffer world'))
  console.log(buffer)

  const stream = await Replaceable.replace(
    example.replaceable, new Readable({
      read() {
        this.push('hello stream world')
        this.push(null)
      },
    }))
  console.log(stream)
})()
```
```
hi string world
hi buffer world
hi stream world
```

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/11.svg">
</a></p></div>

### Collecting Into Catchment

> Since Replaceable supports static `.replace`, this is not particularly relevant, but it can help in certain scenarios.
To be able to collect stream data into memory, the [`catchment`](https://github.com/artdecocode/catchment) package can be used. It will create a promise resolved when the stream finishes. ```js import { Replaceable } from 'restream' import Catchment, { collect } from 'catchment' import { equal } from 'assert' //0. SETUP: create a replaceable and readable input streams, // and pipe the input stream into the replaceable. const replaceable = new Replaceable([ { re: /hello/i, replacement() { return 'WORLD' }, }, { re: /world/, replacement() { return 'hello' }, }, ]) const rs = createReadable('HELLO world') rs .pipe(replaceable) // 1. Create a writable catchment using constructor. const catchment = new Catchment() replaceable.pipe(catchment) // OR 1. Create a writable catchment and automatically // pipe into it. const { promise } = new Catchment({ rs: replaceable, }) // OR 1+2. Use the collect method which uses a catchment // internally. const data = await collect(replaceable) // 2. WAIT for the catchment streams to finish. const data2 = await catchment.promise const data3 = await promise // Validate that results are the same. equal(data, data2); equal(data2, data3) console.log(data) ``` ``` WORLD hello ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/12.svg"> </a></p></div> ## `SerialAsyncReplaceable` Class __<a name="type-serialasyncreplaceable">`SerialAsyncReplaceable`</a> extends <a href="#type-replaceable" title="An interface for the context accessible via this in replacer functions.">`Replaceable`</a>__: A class for when serial execution of asynchronous replacements within the same rule are needed. 
| Name | Type | Description | | --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | __constructor__ | <em>new () => <a href="#type-serialasyncreplaceable" title="A class for when serial execution of asynchronous replacements within the same rule are needed.">SerialAsyncReplaceable</a></em> | Constructor method. | | __addItem__ | <em>() => !Promise&lt;string&gt;</em> | &lt;callback async return="string" name="link"&gt;<br/> An async replacer function to be executed when all previous links in the chain have resolved.<br/>&lt;/callback&gt; | The _SerialAsyncReplaceable_ can be used whenever there are multiple detections by the same rule that need to be run asynchronously one after another rather than in parallel. This can be achieved by calling `this.addItem(...)` method on the class and awaiting on the returned promise. Behind the scenes, each replacement will await on the collective promise from previous replacements. ```js let s = new Date().getTime() const replaceable = new SerialAsyncReplaceable([ // 1. Use the `this.addItem` method to set up the await chain. { re: /---/g, async replacement() { const res = await this.addItem(async () => { await new Promise(r => setTimeout(r, 100)) const d = new Date().getTime() const delta = d - s return delta }) return res }, }, // 2. All async replacement without `this.addItem` will run in parallel. 
  {
    re: /___/g,
    async replacement() {
      await new Promise(r => setTimeout(r, 100))
      const d = new Date().getTime()
      const delta = d - s
      return delta
    },
  },
])

replaceable
  .pipe(process.stdout)

replaceable.end(input)
```
```
Test: serial 155ms, parallel 467ms,
Example: serial 256ms, parallel 467ms,
Total: serial 362ms, parallel 467ms,
```

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/13.svg">
</a></p></div>

## <code><ins>restream</ins>(</code><sub><br/>&nbsp;&nbsp;`regex: !RegExp,`<br/></sub><code>): <i>!stream.Transform</i></code>
Create a _Transform_ stream which will maintain a buffer with data received from a _Readable_ stream and write data when the buffer can be matched against the regex. It will push the whole match object (or objects when the `g` flag is used) returned by `/regex/.exec(buffer)`.

 - <kbd><strong>regex*</strong></kbd> <em>`!RegExp`</em>: The regular expression to execute.

The _SyncReplaceable_ can be used when data is already stored in memory (for example, if you're running an Azure function with Node.JS and it doesn't support streaming) and needs to be transformed using the synchronous flow. This implies that the rules cannot contain asynchronous replacers.

```js
/** yarn e example/sync.js */
import { SyncReplaceable } from 'restream'

const n = ['zero', 'one', 'two', 'three', 'four',
  'five', 'six', 'seven', 'eight', 'nine']

const input = `Test String: {12345}
Example Test: {67890}`

const res = SyncReplaceable(input, [
  // The rule to map numbers into their names.
  {
    re: /{(\d+)}/g,
    replacement(match, num) {
      return num.split('').map((nn) => {
        return n[nn]
      }).join(', ')
    },
  },
  // The rule to end every line with a dot.
  {
    re: /^[\s\S]*$/,
    replacement(match) {
      return match
        .split('\n')
        .map(a => `${a}.`)
        .join('\n')
    },
  },
])

// Print the transformed string returned by the function.
console.log(res)
```
```
Test String: one, two, three, four, five.
Example Test: six, seven, eight, nine, zero.
``` <div align="center"><p align="center"><a href="#table-of-contents"> <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/14.svg"> </a></p></div> ## Markers Markers can be used to cut some portion of input text according to a regular expression, run necessary replacement rules on the remaining parts, and then restore the cut chunks. In this way, those chunks do not take part in transformations produced by rules, and can be re-inserted into the stream in their original form. An example use case would be a situation when markdown code blocks need to be transformed into html, however those code blocks don't need to be processed when inside of a comment, such as: ```markdown <!-- The following line should be preserved: **Integrity is the ability to stand by an idea.** --> But the next lines should be transformed into HTML: **Civilization is the process of setting man free from men.** **Every building is like a person. Single and unrepeatable.** ``` When using a naïve transformation with a replacement rule for changing `**` into `<strong>`, both lines will be transformed. ```js import { Replaceable } from 'restream' import { createReadStream } from 'fs' const FILE = 'example/markers/example.md' const strongRule = { re: /\*\*(.+?)\*\*/g, replacement(match, p1) { return `<strong>${p1}</strong>` }, } ;(async () => { const rs = createReadStream(FILE) const replaceable = new Replaceable(strongRule) rs .pipe(replaceable) .pipe(process.stdout) })() ``` ```markdown <!-- The following line should be preserved: <strong>Integrity is the ability to stand by an idea.</strong> --> But the next lines should be transformed into HTML: <strong>Civilization is the process of setting man free from men.</strong> <strong>Every building is like a person. Single and unrepeatable.</strong> ``` In the output above, the `**` in the comment is also transformed using the rule. 
To prevent this, the strategy is to cut comments out first using markers, then perform the transformation using the `strong` rule, and finally place the comments back into the text. ```js const { comments } = makeMarkers({ comments: /<!--([\s\S]+?)-->/g, }) const cutComments = makeCutRule(comments) const pasteComments = makePasteRule(comments) const replaceable = new Replaceable([ cutComments, strongRule, pasteComments, ]) ``` ```markdown <!-- The following line should be preserved: **Integrity is the ability to stand by an idea.** --> But the next lines should be transformed into HTML: <strong>Civilization is the process of setting man free from men.</strong> <strong>Every building is like a person. Single and unrepeatable.</strong> ``` <div align="center"><p align="center"><a href="#table-of-contents"> <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/15.svg"> </a></p></div> ### <code><ins>makeMarkers</ins>(</code><sub><br/>&nbsp;&nbsp;`matchers: !Object<string, !RegExp>,`<br/>&nbsp;&nbsp;`config=: !MakeMarkersConfig,`<br/></sub><code>): <i>!Object.<string, !Marker></i></code> Make markers from a configuration object. Returns an object with markers for each requested type. - <kbd><strong>matchers*</strong></kbd> <em><code>!Object&lt;string, !RegExp&gt;</code></em>: An object with types of markers to create as keys and their detection regexes as values. - <kbd>config</kbd> <em><code><a href="#type-makemarkersconfig" title="Additional configuration.">!MakeMarkersConfig</a></code></em> (optional): Additional configuration. This function will create markers from the hash of passed `matchers` object. The markers are then used to create `cut` and `paste` rules. When a `RegExp` specified for a marker is matched, the chunk will be replaced with a string. By default, the string has the `%%_RESTREAM_MARKER_NAME_REPLACEMENT_INDEX_%%` format. 
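The cut, transform and paste cycle can be sketched in plain JavaScript. The `cut` and `paste` helpers and the `cuts` array below are hypothetical and only illustrate the idea; they are not the functions exported by restream. The placeholder string and its detection regex follow the default format documented for `makeMarkers`.

```js
// Hypothetical helpers (not part of restream) showing the marker idea.
const cut = (text, marker) => {
  return text.replace(marker.re, (match) => {
    const index = marker.cuts.length
    marker.cuts.push(match) // remember the original chunk
    return `%%_RESTREAM_${marker.name.toUpperCase()}_REPLACEMENT_${index}_%%`
  })
}

const paste = (text, marker) => {
  const re = new RegExp(
    `%%_RESTREAM_${marker.name.toUpperCase()}_REPLACEMENT_(\\d+)_%%`, 'g')
  // A function replacer is used so that `$` in the chunk is kept literally.
  return text.replace(re, (match, index) => marker.cuts[index])
}

const comments = { name: 'comments', re: /<!--([\s\S]+?)-->/g, cuts: [] }

const source = '<!-- **keep** -->\n**bold**'
const cutText = cut(source, comments)
const transformed = cutText.replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
const restored = paste(transformed, comments)

console.log(cutText)
console.log(restored)
```

While the comment is replaced by its placeholder, the `strong` rule has nothing to match inside it, so the original comment text comes back untouched after pasting.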
<table>
<thead>
<tr>
<th>
Rules (<a href="example/markers/cut.js">source</a>)
</th>
<th>
Text after cut
</th>
</tr>
</thead>
<tbody>
<tr/>
<tr>
<td>

```js
const { comments, strong } = makeMarkers({
  comments: /<!--([\s\S]+?)-->/g,
  strong: /\*\*(.+?)\*\*/g,
})
const [cutComments, cutStrong] =
  [comments, strong].map(makeCutRule)

const replaceable = new Replaceable([
  cutComments,
  cutStrong,
])
```
</td>
<td>

```markdown
%%_RESTREAM_COMMENTS_REPLACEMENT_0_%%

But the next lines should be transformed into HTML:

%%_RESTREAM_STRONG_REPLACEMENT_0_%%
%%_RESTREAM_STRONG_REPLACEMENT_1_%%
```
</td>
</tr>
</tbody>
</table>

This format can be modified with the additional configuration passed as the second argument by providing a function to generate replacement strings, and their respective regular expressions to replace them back with their original values.

__<a name="type-makemarkersconfig">`MakeMarkersConfig`</a>__: Additional configuration.

| Name           | Type                                             | Description                                                               |
| -------------- | ------------------------------------------------ | ------------------------------------------------------------------------- |
| getReplacement | <em>(name: string, index: number) => string</em> | The function used to create a replacement when some text needs to be cut. |
| getRegex       | <em>(name: string) => !RegExp</em>               | The function used to create a RegExp to detect replaced chunks.           |

By default, `%%_RESTREAM_${name.toUpperCase()}_REPLACEMENT_${index}_%%` replacement is used with <code>new RegExp(&#96;%%&#95;RESTREAM&#95;${name.toUpperCase()}&#95;REPLACEMENT&#95;(\\d+)&#95;%%&#96;, 'g')</code> regex to detect it and restore the original value.

<div align="center"><p align="center"><a href="#table-of-contents">
  <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/16.svg">
</a></p></div>

### <code><ins>makeCutRule</ins>(</code><sub><br/>&nbsp;&nbsp;`marker: !Marker,`<br/></sub><code>): <i>!Rule</i></code>

Make a rule for initial replacement of markers.
 - <kbd><strong>marker*</strong></kbd> <em>`!Marker`</em>: A marker is used to cut and paste portions of text to exclude them from processing by other rules. Markers should be created using the `makeMarkers` factory method that will assign their properties.

Make a rule for the _Replaceable_ to cut out marked chunks so that they don't participate in further transformations.

<div align="center"><p align="center"><a href="#table-of-contents">
  <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/17.svg">
</a></p></div>

### <code><ins>makePasteRule</ins>(</code><sub><br/>&nbsp;&nbsp;`marker: !Marker,`<br/>&nbsp;&nbsp;`pipeRules=: !(Rule|Array<!Rule>),`<br/></sub><code>): <i>!Rule</i></code>

Make a rule for pasting markers back.

 - <kbd><strong>marker*</strong></kbd> <em>`!Marker`</em>: A marker is used to cut and paste portions of text to exclude them from processing by other rules. Markers should be created using the `makeMarkers` factory method that will assign their properties.
 - <kbd>pipeRules</kbd> <em><code>!(Rule \| Array&lt;!Rule&gt;)</code></em> (optional): Any additional rules to replace the value of the marker before pasting it. Must be synchronous.

Make a rule for the _Replaceable_ to paste back chunks replaced earlier. When the `pipeRules` argument is given, the value of the marker will be synchronously processed before it is reinserted.

_For example, given the following input_:

```markdown
<a href="test_hello_world.html">Example</a>
```

_Restream_ can prevent `_` in links from being transformed into `<em>` tags, and then transform the link to prepend the `#` symbol.
```js
const { a } = makeMarkers({
  a: /<a\s+.+?>[\s\S]+?<\/a>/gm,
}, {
  getReplacement(name, index) {
    return `RESTREAM-${name}-${index}`
  },
  getRegex(name) {
    return new RegExp(`RESTREAM-${name}-(\\d+)`, 'g')
  },
})
const replaceable = new Replaceable([
  makeCutRule(a),
  {
    re: /_(.+?)_/g,
    replacement(m, val) {
      return `<em>${val}</em>`
    }
  },
  makePasteRule(a, {
    re: /href="(.+?)"/,
    replacement(m, link) {
      return `href="#${link}"`
    },
  }),
])
```

```html
<a href="#test_hello_world.html">Example</a>
```

<div align="center"><p align="center"><a href="#table-of-contents">
  <img width="25" alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/18.svg">
</a></p></div>

### Accessing Replacements

Sometimes, it might be necessary to access the value replaced by a marker's regular expression. In the example below, all inner code blocks are cut at first to preserve them as they are, then the _LINKS_ rule is applied to generate anchors in a text. However, it is also possible that an inner code block will form part of a link, but because it has been replaced with a marker, the link rule will not work properly.
<table>
<thead>
<tr>
<th>
Rules (<a href="example/markers/links.js">source</a>)
</th>
<th>
Input
</th>
</tr>
</thead>
<tbody>
<tr/>
<tr>
<td>

```js
const getName = (title) => {
  const name = title.toLowerCase()
    .replace(/\s+/g, '-')
    .replace(/[^\w-]/g, '')
  return name
}

const { code } = makeMarkers({
  code: /`(.+?)`/g,
})

const cutCode = makeCutRule(code)
const pasteCode = makePasteRule(code)

const linkRule = {
  re: /\[(.+?)\]\(#LINK\)/g,
  replacement(match, title) {
    const name = getName(title)
    return `<a name="${name}">${title}</a>`
  },
}

const replaceable = new Replaceable([
  cutCode,
  linkRule,
  pasteCode,
])
```
</td>
<td>

```markdown
`a code block`
`[link in a code block](#LINK)`
[just link](#LINK)
[`A code block` in a link](#LINK)
```
</td>
</tr>
<tr>
<td colspan="2" align="center"><strong>Output</strong></td>
</tr>
<tr>
<td colspan="2">

```markdown
`a code block`
`[link in a code block](#LINK)`
<a name="just-link">just link</a>
<a name="_restream_code_replacement_2_-in-a-link">`A code block` in a link</a>
```
</td>
</tr>
</tbody>
</table>

To prevent this from happening, a check must be performed in the _LINKS_ rule replacement function to see if matched text has any inner code blocks in it. If it does, the value can be accessed and placed back for the correct generation of the link name. This is achieved with the `replace` function.
```js
const getName = (title) => {
  const name = title.toLowerCase()
    .replace(/\s+/g, '-')
    .replace(/[^\w-]/g, '')
  return name
}

const { code } = makeMarkers({
  code: /`(.+?)`/g,
})

const cutCode = makeCutRule(code)
const pasteCode = makePasteRule(code)

const linkRule = {
  re: /\[(.+?)\]\(#LINK\)/g,
  replacement(match, title) {
    const realTitle = title.replace(code.regExp, (m, i) => {
      const val = code.map[i]
      return val
    })
    const name = getName(realTitle)
    return `<a name="${name}">${title}</a>`
  },
}

const replaceable = new Replaceable([
  cutCode,
  linkRule,
  pasteCode,
])
```

```markdown
`a code block`
`[link in a code block](#LINK)`
<a name="just-link">just link</a>
<a name="a-code-block-in-a-link">`A code block` in a link</a>
```

Now, the link is generated correctly using the title with the text inside of the code block, and not its replaced marker. Also, because the _code_ marker's regex is used with `.replace`, its `lastIndex` property won't change, so there are no side effects (compared to using the `.exec` method of a regular expression).

This simple example shows how some markers can gain access to replacements made by other markers, which can have more complex applications.

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/19.svg">
</a></p></div>

## Related Packages

The following relevant packages might be of interest.

| Name                                                          | Description                                                                                                       |
| ------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| [`catchment`](https://github.com/artdecocode/catchment)       | Collect all data flowing in from the stream into memory, and provide a promise resolved when the stream finishes. |
| [`pedantry`](https://github.com/artdecocode/pedantry)         | Read a directory as a stream.                                                                                     |
| [`which-stream`](https://github.com/artdecocode/which-stream) | Create or choose source and destination (including `stdout`) streams easily.                                      |
| [`spawncommand`](https://github.com/artdecocode/spawncommand) | Spawn or fork a process and return a promise resolved with `stdout` and `stderr` data when it exits.              |
| [`documentary`](https://github.com/artdecocode/documentary)   | Transforms the markdown files to be able to insert the content of example files and their output asynchronously.  |

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/20.svg">
</a></p></div>

## Copyright & License

[**GNU Affero General Public License v3.0**](LICENSE)

Dual licensed under AGPL-3.0 and [Art Deco License](https://artdeco.legal) for Free Open Source packages. If you require a Paid version of _Restream_ so that you can distribute your software without publishing its source code, please complete [a purchase](https://luds.io/artdeco/restream).

<table>
  <tr>
    <th>
      <a href="https://www.artd.eco">
        <img width="100" src="https://gitlab.com/uploads/-/system/group/avatar/7454762/artdeco.png" alt="Art Deco">
      </a>
    </th>
    <th>© <a href="https://www.artd.eco">Art Deco™</a> 2020</th>
  </tr>
</table>

<div align="center"><p align="center"><a href="#table-of-contents">
  <img alt="section break" src="https://artdeco.gitlab.io/restream/section-breaks/-1.svg">
</a></p></div>