# batch-cluster
**Efficient, concurrent work via batch-mode command-line tools from within Node.js.**
Many command-line tools, like
[ExifTool](https://sno.phy.queensu.ca/~phil/exiftool/),
[PowerShell](https://github.com/powershell/powershell), and
[GraphicsMagick](http://www.graphicsmagick.org/), support a "batch
mode" that accepts a series of discrete commands through stdin and returns
results through stdout. Because these tools can be fairly large, spinning them up can
be expensive (especially on Windows).
This module allows you to run a series of commands, or `Task`s, processed by a
cluster of these processes.
This module manages a queue of pending tasks, feeding them to processes as
they become idle, and monitors the child processes for errors and crashes.
Batch processes are also recycled after processing a set number of tasks or
running for a set length of time, in an effort to minimize the impact of any
potential memory leaks.
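
The queueing and recycling behavior described above can be pictured with a toy model. This is only a sketch under stated assumptions, not how batch-cluster is implemented: `TinyPool`, `SketchWorker`, and every method name are hypothetical, "workers" are plain objects rather than child processes, and only the task-count limit is modeled (not the age limit).

```typescript
// Hypothetical illustration only: none of these names come from batch-cluster.
interface SketchWorker {
  id: number;
  busy: boolean;
  tasksHandled: number;
}

class TinyPool {
  readonly log: string[] = [];
  private readonly pending: string[] = [];
  private readonly workers: SketchWorker[] = [];
  private nextId = 1;
  private readonly maxWorkers: number;
  private readonly maxTasksPerWorker: number;

  constructor(maxWorkers: number, maxTasksPerWorker: number) {
    this.maxWorkers = maxWorkers;
    this.maxTasksPerWorker = maxTasksPerWorker;
  }

  // Queue a command, then try to hand queued work to an idle worker.
  enqueue(command: string): void {
    this.pending.push(command);
    this.dispatch();
  }

  // Call when a worker completes its current command.
  finish(workerId: number): void {
    const worker = this.workers.find((w) => w.id === workerId);
    if (worker === undefined) return;
    worker.busy = false;
    // Recycle a worker once it has handled its quota of commands.
    if (worker.tasksHandled >= this.maxTasksPerWorker) {
      this.workers.splice(this.workers.indexOf(worker), 1);
      this.log.push(`recycle worker ${worker.id}`);
    }
    this.dispatch();
  }

  // Each worker gets at most one command at a time.
  private dispatch(): void {
    while (this.pending.length > 0) {
      let worker = this.workers.find((w) => !w.busy);
      if (worker === undefined) {
        // Everyone is busy: start a new worker if allowed, else wait.
        if (this.workers.length >= this.maxWorkers) return;
        worker = { id: this.nextId++, busy: false, tasksHandled: 0 };
        this.workers.push(worker);
      }
      const command = this.pending.shift();
      if (command === undefined) return;
      worker.busy = true;
      worker.tasksHandled += 1;
      this.log.push(`worker ${worker.id} runs ${command}`);
    }
  }
}
```

With `new TinyPool(1, 2)`, enqueuing three commands and calling `finish(1)` twice runs the first two commands on worker 1, recycles it, and runs the third on a fresh worker 2.
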
As of version 4, retry logic for tasks is a separate concern from this module.
This package powers [exiftool-vendored](https://photostructure.github.io/exiftool-vendored.js/),
whose source you can examine as an example consumer.
## Installation
```bash
$ npm install --save batch-cluster
```
## Changelog
See [CHANGELOG.md](https://github.com/photostructure/batch-cluster.js/blob/main/CHANGELOG.md).
## Usage
The child process must use `stdin` and `stdout` for control/response.
BatchCluster will ensure a given process is only given one task at a time.
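
To make the stdin/stdout contract concrete, here is a minimal sketch of the command handler inside a hypothetical batch-mode child process. Everything in it is an assumption made for illustration: the `version` and `upcase` commands, the `PASS` and `FAIL` terminator lines, and the `handleCommand` name are invented here and are not part of any real tool or of batch-cluster itself.

```typescript
// Hypothetical batch-mode tool: one command arrives per stdin line, and each
// reply ends with a terminator line so the parent knows the command is done.
const PASS = "PASS"; // assumed success terminator
const FAIL = "FAIL"; // assumed failure terminator

function handleCommand(line: string): string {
  const trimmed = line.trim();
  const space = trimmed.indexOf(" ");
  // Split "name arg..." into the command name and the rest of the line.
  const name = space < 0 ? trimmed : trimmed.slice(0, space);
  const arg = space < 0 ? "" : trimmed.slice(space + 1);
  switch (name) {
    case "version":
      return `v1.0.0\n${PASS}\n`;
    case "upcase":
      return `${arg.toUpperCase()}\n${PASS}\n`;
    default:
      return `unknown command: ${name}\n${FAIL}\n`;
  }
}
```

A real child script would read stdin line by line (for example with Node's `readline` module) and write each `handleCommand` result to stdout. Because every reply ends in a terminator, the parent can tell where one command's output stops and can safely send the next command to the same process.
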
1. Create a singleton instance of
   [BatchCluster](https://photostructure.github.io/batch-cluster.js/classes/BatchCluster.html).
   Note that the [constructor
   options](https://photostructure.github.io/batch-cluster.js/classes/BatchCluster.html#constructor)
   take a combination of
   - [ChildProcessFactory](https://photostructure.github.io/batch-cluster.js/interfaces/ChildProcessFactory.html)
     and
   - [BatchProcessOptions](https://photostructure.github.io/batch-cluster.js/interfaces/BatchProcessOptions.html),
     both of which have no defaults, and
   - [BatchClusterOptions](https://photostructure.github.io/batch-cluster.js/classes/BatchClusterOptions.html),
     which has defaults that may or may not be relevant to your application.
1. The [default logger](https://photostructure.github.io/batch-cluster.js/interfaces/Logger.html)
writes warning and error messages to `console.warn` and `console.error`. You
can substitute your own logger by calling
[setLogger](https://photostructure.github.io/batch-cluster.js/modules.html#setLogger) or by providing a logger to the `BatchCluster` constructor.
1. Implement the [Parser](https://photostructure.github.io/batch-cluster.js/interfaces/Parser.html)
interface to parse results from your child process.
1. Construct or extend the
[Task](https://photostructure.github.io/batch-cluster.js/classes/Task.html)
class with the desired command and the parser you built in the previous
step, and submit it to your BatchCluster's
[enqueueTask](https://photostructure.github.io/batch-cluster.js/classes/BatchCluster.html#enqueueTask)
method.
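
As a rough illustration of the parser step, the sketch below defines a standalone parser function. The `ResultParser` type is a local stand-in that reflects an assumed shape (stdout, stderr, and a pass/fail flag in; a parsed value out); confirm the real signature against the Parser documentation linked above. The `Dimensions` type and the "width height" output format belong to a hypothetical tool.

```typescript
// Assumed parser shape; check the linked Parser docs before relying on it.
type ResultParser<T> = (
  stdout: string,
  stderr: string | undefined,
  passed: boolean,
) => T;

// Result of a hypothetical command that prints "<width> <height>".
interface Dimensions {
  width: number;
  height: number;
}

const parseDimensions: ResultParser<Dimensions> = (stdout, stderr, passed) => {
  // Surface tool-reported failures as exceptions.
  if (!passed) {
    throw new Error(`command failed: ${(stderr ?? stdout).trim()}`);
  }
  const match = /^(\d+)\s+(\d+)$/.exec(stdout.trim());
  if (match === null) {
    throw new Error(`unexpected output: ${JSON.stringify(stdout)}`);
  }
  return { width: Number(match[1]), height: Number(match[2]) };
};

// With the real library, the last step would look roughly like this
// (unverified sketch; `cluster` is a BatchCluster instance, the command
// string is hypothetical, and enqueueTask is assumed to return a promise):
//   const task = new Task("dimensions photo.jpg", parseDimensions);
//   const dims = await cluster.enqueueTask(task);
```

Throwing from the parser keeps malformed or failed output from being mistaken for a valid result.
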
See
[src/test.ts](https://github.com/photostructure/batch-cluster.js/blob/main/src/test.ts)
for an example child process. Note that the script is _designed_ to be flaky in
order to test BatchCluster's retry and error handling code.
## Caution
The default `BatchClusterOptions.cleanupChildProcs` value of `true` means that BatchCluster will try to use `ps` to ensure that Node's view of process state is correct and that errant
processes are cleaned up.
If you run this in a Docker image based on Alpine or Debian Slim, **this won't work properly unless you install the `procps` package.**
[See issue #13 for details.](https://github.com/photostructure/batch-cluster.js/issues/13)
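
One way to catch this early is a startup check that `ps` is present and accepts common options. The sketch below is an assumption-laden example, not part of batch-cluster: `psLooksUsable` is a hypothetical helper, and `ps -p <pid>` is only a guess at a representative invocation, since the exact `ps` arguments BatchCluster uses are not stated here. It applies to POSIX-like systems only.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `ps -p <pid>` runs and exits cleanly for this process.
// `-p` is an assumed, representative option (see the note above).
function psLooksUsable(): boolean {
  const result = spawnSync("ps", ["-p", String(process.pid)], {
    encoding: "utf8",
  });
  // `error` is set when the binary cannot be launched (for example, it is
  // not installed); a non-zero status suggests the option was rejected.
  return result.error === undefined && result.status === 0;
}
```

If this returns `false` inside a container, installing `procps` in the image, as described above, is the likely remedy.
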