github.com/jeffbski/cluster-master

# cluster-master-ext

A module for taking advantage of the built-in `cluster` module in node v0.8+. It enables rolling worker restarts, resizing, a REPL, events, configurable timeouts, and a debug method.

Modified from Isaac's original version, `cluster-master`, adding:

- events
- repl config, help, docs
- configurable timeouts
- exports `debug` method for ability to write to all REPLs and console

Note: I had provided these changes as pull requests back to the original author, but after waiting for 10 months, I will now provide this as an alternative module.

Your main `server.js` file uses this module to fire up a cluster of workers. Those workers then do the actual server stuff (using socket.io, express, tako, raw node, whatever; any TCP/TLS/HTTP/HTTPS server would work).

This module provides some basic functionality to keep a server running. As the name implies, it should only be run in the master module, not in any cluster workers.

```javascript
var clusterMaster = require("cluster-master-ext")

// most basic usage: just specify the worker
// Spins up as many workers as you have CPUs
//
// Note that this is VERY WRONG for a lot of multi-tenanted
// VPS environments where you may have 32 CPUs but only a
// 256MB RSS cap or something. i.e. specify the size to
// have what makes sense
clusterMaster("worker.js")

// more advanced usage. Specify configs.
// in real life, you can only actually call clusterMaster() once.
clusterMaster({ exec: "worker.js" // script to run
              , size: 5 // number of workers
              , env: { SOME: "environment_vars" }
              , args: [ "--deep", "doop" ]
              , silent: true
              , signals: false
              , onMessage: function (msg) {
                  console.error("Message from %s %j"
                               , this.uniqueID
                               , msg)
                }
              })

// methods
clusterMaster.resize(10)

// graceful rolling restart
clusterMaster.restart()

// graceful shutdown
clusterMaster.quit()

// not so graceful shutdown
clusterMaster.quitHard()

// listen to events to do additional cleanup or shutdown
clusterMaster.emitter()
  .on('resize', function (clusterSize) { })
  .on('restart', function () { })
  .on('quit', function () { })
  .on('quitHard', function () { });
```

## Install

Use from github or via npm

```bash
npm install cluster-master-ext
```

## Methods

### clusterMaster.resize(n)

Set the cluster size to `n`. This will disconnect extra nodes and/or spin up new nodes, as needed. Done by default on restarts. Fires the `resize` event with the new clusterSize just before performing the resize.

### clusterMaster.restart(cb)

One by one, shut down nodes and spin up new ones. Callback is called when finished. Fires the `restart` event just before performing the restart.

### clusterMaster.quit()

Gracefully shut down the worker nodes and then process.exit(0). Fires the `quit` event just before performing the shutdown.

### clusterMaster.quitHard()

Forcibly shut down the worker nodes and then process.exit(1). Fires the `quitHard` event just before performing the hard shutdown.

### clusterMaster.emitter()

Retrieve the clusterMaster EventEmitter to be able to listen to clusterMaster events. This emitter is also returned from the original clusterMaster() constructor.

### clusterMaster.debug(arg1, arg2, ...)

Arguments passed to debug are formatted with util.format and output to stdout and any REPLs.

```javascript
clusterMaster.debug('The number one is %s', 1);
```

## Configs

The `exec`, `env`, `argv`, and `silent` configs are passed to the `cluster.fork()` call directly, and have the same meaning.

* `exec` - The worker script to run
* `env` - Envs to provide to workers
* `argv` - Additional args to pass to workers.
* `silent` - Boolean, default=false. Do not share stdout/stderr
* `size` - Starting cluster size. Default = CPU count
* `signals` - Boolean, default=true. Set up listeners to:
  * `SIGHUP` - restart
  * `SIGINT` - quit (control-c)
  * `SIGABRT` - quitHard
* `onMessage` - Method that gets called when workers send a message to the parent. Called in the context of the worker, so you can reply by looking at `this`.
* `stopTimeout` - Time in milliseconds to wait for a worker to stop before forcefully killing the process during restart or resize, default 5000 (5 seconds)
* `skepticTimeout` - Time in milliseconds to wait for a new worker to stay alive before shutting the previous worker down during restart, default 2000 (2 seconds)
* `silenceDebug` - if true, silences the normal console debug messages, default false (output will still continue to REPLs regardless)
* `aliveEvent` - the cluster event to wait for to consider the child process to be alive, set to `online` for non-http workers, default `listening`
* `replHelp` - Array of additional text lines to add to the REPL `help` command

  ```javascript
  var config = {
    replHelp: [
      'process - access node.js process',
      '.break - interrupt current command'
    ]
  };
  ```

* `replContext` - Object of additional properties or functions to add to the REPL context

  ```javascript
  var config = {
    replContext: {
      foo: fooObject // adds foo to the REPL which exposes the fooObject
    }
  };
  ```

* `repl` - where to have the REPL listen, defaults to `env.CLUSTER_MASTER_REPL` || 'cluster-master-socket'
  * if `repl` is null or false - REPL is disabled and will not be started
  * if `repl` is a string path - REPL will listen on a unix domain socket at this path
  * if `repl` is an integer port - REPL will listen on TCP 0.0.0.0:port
  * if `repl` is an object with `address` and `port` - REPL will listen on TCP address:port

Examples of configuring `repl`

```javascript
var config = { repl: false } // disable REPL
var config = { repl: '/tmp/cluster-master-sock' } // unix domain socket
var config = { repl: 3001 } // tcp socket 0.0.0.0:3001
var config = { repl: { address: '127.0.0.1', port: 3002 }} // tcp 127.0.0.1:3002
```

Note: be careful when using TCP for your REPL since anyone on the network can connect to your REPL (no security). So either disable the REPL or use a unix domain socket, which requires local access (or ssh access) to the server.

## REPL

Cluster-master provides a REPL into the master process so you can inspect the state of your cluster. By default the REPL is accessible through a socket (`cluster-master-socket`) written to the root of the directory, but you can override the location with the `CLUSTER_MASTER_REPL` environment variable.
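
The `repl` rules listed under Configs can be restated as a small function. This is only an illustrative sketch: `resolveReplTarget` is a hypothetical helper, not part of cluster-master-ext, and it assumes that an unset `repl` falls back to `CLUSTER_MASTER_REPL` and then to `cluster-master-socket`, with the environment value always treated as a socket path. The returned objects use the `path`, `host`, and `port` option names that node's `net.Server#listen` accepts.

```javascript
// Hypothetical helper, NOT part of cluster-master-ext. It only restates
// the documented `repl` rules; the module's real logic may differ.
function resolveReplTarget(repl, env) {
  env = env || process.env;

  // Unset: fall back to the documented default (assumed to be a socket path)
  if (repl === undefined) {
    repl = env.CLUSTER_MASTER_REPL || 'cluster-master-socket';
  }

  // null or false: REPL is disabled
  if (repl === null || repl === false) {
    return null;
  }

  // String: unix domain socket path
  if (typeof repl === 'string') {
    return { path: repl };
  }

  // Integer: TCP port on all interfaces
  if (typeof repl === 'number') {
    return { host: '0.0.0.0', port: repl };
  }

  // Object with `address` and `port`: TCP on a specific interface
  if (typeof repl === 'object' && repl.address && repl.port) {
    return { host: repl.address, port: repl.port };
  }

  throw new TypeError('unsupported repl setting');
}
```

For example, `resolveReplTarget(3001)` yields a target on `0.0.0.0` port `3001`, while `resolveReplTarget(false)` yields `null`, meaning no REPL would be started.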

You can access the REPL with `nc` or [socat](http://www.dest-unreach.org/socat/) like so:

```bash
nc -U ./cluster-master-socket
# OR
socat ./cluster-master-socket stdin
```

The REPL provides you with access to these objects or functions:

* `help` - display these commands
* `repl` - access the REPL
* `resize(n)` - resize the cluster to `n` workers
* `restart(cb)` - gracefully restart workers, cb is optional
* `stop()` - gracefully stop workers and master
* `kill()` - forcefully kill workers and master
* `cluster` - node.js cluster module
* `size` - current cluster size
* `connections` - number of REPL connections to master
* `workers` - current workers
* `select(fld)` - map of id to `fld` (a field from workers)
* `pids` - map of id to pids
* `ages` - map of id to worker ages
* `states` - map of id to worker states
* `debug(a1)` - output `a1` to stdout and all REPLs
* `sock` - this REPL socket
* `.exit` - close this connection to the REPL

## Events

clusterMaster emits events on clusterMaster.emitter() when its methods are called, which allows you to respond and do additional cleanup right before the action is carried out.

* `debug` - fired when debug() is called to output messages, listener ex: `fn(msg, args, ...)`
* `disconnect` - fired before a worker is disconnected, listener ex: `fn(worker)`
* `resize` - fired on clusterMaster.resize(n), listener ex: `fn(clusterSize)`
* `restart` - fired on clusterMaster.restart(), listener ex: `fn(oldWorkers)`
* `restartComplete` - fired when restart is completed
* `quit` - fired on clusterMaster.quit()
* `quitHard` - fired on clusterMaster.quitHard()

## LICENSE

BSD

## Authors and Contributors

- Isaac Z. Schlueter, author of the original project `cluster-master`
- Jeff Barczewski
- Sean McCullough