UNPKG

hyperlog-index

Version:
196 lines (145 loc) 5.41 kB
# hyperlog-index Forking indexes for [hyperlog](https://npmjs.com/package/hyperlog) Built on the map/reduce pattern; hyperlog-index will call a map function on every hyperlog insert, building the index incrementally. # example ## forking key/value store Using hyperlog-index, we can easily build a key/value store backed to a hyperlog that implements a [multi-value register conflict strategy](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#Others): ``` js var level = require('level') var indexer = require('hyperlog-index') var hyperlog = require('hyperlog') var sub = require('subleveldown') var mkdirp = require('mkdirp') var minimist = require('minimist') var argv = minimist(process.argv.slice(2), { default: { d: '/tmp/kv.db' } }) mkdirp.sync(argv.d) var hdb = level(argv.d + '/h') var idb = level(argv.d + '/i') var log = hyperlog(hdb, { valueEncoding: 'json' }) var db = sub(idb, 'x', { valueEncoding: 'json' }) var dex = indexer({ log: log, db: sub(idb, 'i'), map: function (row, next) { // This method reduces our new state. In this example, db is used for the state. db.get(row.value.k, function (err, doc) { if (!doc) doc = {} row.links.forEach(function (link) { delete doc[link] }) doc[row.key] = row.value.v db.put(row.value.k, doc, next) }) } }) if (argv._[0] === 'get') { dex.ready(function () { db.get(argv._[1], function (err, values) { if (err) console.error(err) else console.log(values) }) }) } else if (argv._[0] === 'put') { // Structure `doc` as expected by `map` above var doc = { k: argv._[1], v: argv._[2] } dex.ready(function () { db.get(doc.k, function (err, values) { // Link the new entry to the "parents", from the current index, if any log.add(Object.keys(values || {}), doc, function (err, node) { if (err) console.error(err) }) }) }) } else if (argv._[0] === 'sync') { var r = log.replicate() process.stdin.pipe(r).pipe(process.stdout) r.on('end', function () { process.stdin.pause() }) } ``` Each key maps to an object of hashes to values: ``` $ node kv.js -d /tmp/db1 put A beep $ node kv.js -d /tmp/db1 put A boop $ node kv.js -d /tmp/db1 get A { '06e4130fc5f2392cb8bdb065d18eaa523d716f2c61b4877853340a5cc727fb42': 'boop' } ``` Meanwhile, a second database may have additional edits: ``` $ node kv.js -d /tmp/db2 put A whatever $ node kv.js -d /tmp/db2 put B hey ``` When these two databases are merged together, the key at `A` has two values: ``` $ dupsh 'node kv.js -d /tmp/db1 sync' 'node kv.js -d /tmp/db2 sync' $ node kv.js -d /tmp/db1 get A { '06e4130fc5f2392cb8bdb065d18eaa523d716f2c61b4877853340a5cc727fb42': 'boop', cba756b45e279ae5c3f3ebc8cfe0d50e1f2205e37a4443ce9e0e5a41491c234c: 'whatever' } ``` The `B` key has only a single element: ``` $ node kv.js -d /tmp/db1 get B { '53a374617fb8839b6f19646d6658188a4fc08d19f35c084dab835847532a3468': 'hey' } ``` This is because `put` does the linking of new nodes to old ones, which is not done in merge. New updates that link at both existing keys will merge into a single key: ``` $ node kv.js -d /tmp/db1 put A whatboop $ node kv.js -d /tmp/db1 get A { '85915730b3e7a4f715057e74af79b564a5be2ec14d334d344cb84d1544ec6107': 'whatboop' } ``` and these merges can be communicated over replication: ``` $ dupsh 'node kv.js -d /tmp/db1 sync' 'node kv.js -d /tmp/db2 sync' $ node kv.js -d /tmp/db2 get A { '85915730b3e7a4f715057e74af79b564a5be2ec14d334d344cb84d1544ec6107': 'whatboop' } ``` And the index can be destroyed (and recalculated) at any time: ``` $ rm -rf /tmp/db1/i $ node kv.js -d /tmp/db1 get A { '85915730b3e7a4f715057e74af79b564a5be2ec14d334d344cb84d1544ec6107': 'whatboop' } ``` This is a useful strategy when you need to update the code in your indexes. **Note:** If you run the included example, the value is assumed to be a json object. The command line `put` format will be more like this: ``` $ node example/kv.js -d /tmp/db1 put A '{"baap":"boop"}' ``` **Note:** If you are primarily interested in a key/value index, like in this example - check out [hyperkv](https://www.npmjs.com/package/hyperkv) # api ``` js var indexer = require('hyperlog-index') ``` ## var dex = indexer(opts) Create a new hyperlog index instance `dex` from: * `opts.log` - a hyperlog instance (required) * `opts.db` - a level instance (required) * `opts.map` - an indexing function `function (row, next) {}` You can have as many indexes as you like on the same log, just create more `dex` instances on sublevels. ## opts.map(row, next) The indexing function `fn` runs for each `row`. The indexing function should write its computed indexes to durable storage and call `next(err)` when it is finished. ## dex.ready(fn) Registers the callback `fn()` to fire when the indexes have "caught up" to the latest known change in the hyperlog. The `fn()` function fires exactly once. You may call `dex.ready()` multiple times with different functions. ## dex.pause() Pause calculating the indexes. `dex.ready()` will not fire until the indexes have been resumed. ## dex.resume() Resume calculation of the indexes after `dex.pause()`. ## dex.on('error', function (err) {}) If the underlying system generates an error, you can catch it here. # install With [npm](https://npmjs.org) do: ``` npm install hyperlog-index ``` # license MIT