# Castor Load

Traverse a directory to build a MongoDB collection with the found files. It can then keep the directory and the collection synchronised.

## Contributors

* [Nicolas Thouvenin](https://github.com/touv)
* [Yannick Schurter](https://github.com/nojhamster)

# Installation

With [npm](http://npmjs.org) do:

    $ npm install castor-load

# Tests

Use [mocha](https://github.com/visionmedia/mocha) to run the tests.

    $ npm install mocha
    $ mocha test

# API Documentation

## Constructor Loader(String directory, [Object options])

Create a new object to synchronise **directory** with a MongoDB collection.

### Options

* `connexionURI` - *string* - URI used to connect to MongoDB (see [documentation](http://docs.mongodb.org/manual/reference/connection-string/)). If not specified, the environment variable "MONGO_URL" is looked up : *default : 'mongodb://localhost:27017/test/'*
* `ignore` - *array* - List of paths to ignore (either [minimatch](https://github.com/isaacs/minimatch) patterns or Regex) : *default : empty*
* `include` - *array* - List of node types to handle (directory and/or files) : *default : ['files']*
* `collectionName` - *string* - MongoDB collection name : *default : automatic*
* `concurrency` - *number* - Number of files/documents that can be processed in parallel : *default : 1*
* `maxFileSize` - *string* - Maximum file size; larger files are rejected : *default : 128mb*
* `delay` - *number* - Delay (in milliseconds) applied to file processing when the stack is full : *default : 1000*
* `writeConcern` - *number/string* - Write concern level used for insertions (see [documentation](http://docs.mongodb.org/manual/reference/write-concern/))
* `watch` - *boolean* - Enable tree watching after the initial synchronization : *default : true*
* `dateConfig` - *date* - (Optional) Arbitrary date appended to the files' metadata. Files whose dateConfig has changed will be resynchronized, regardless of their modification date.
* `strictCompare` - *boolean* - Always check file content when a change is detected, regardless of its modification date. Should be used when files can get multiple changes in a short period of time : *default : false*
* `modifier` - *function(baseDoc)* - A function to modify the base document of a file upon its creation.

```javascript
// Assumption: the package's main export is the Loader constructor
var Loader = require('castor-load');

var options = {
  "connexionURI" : "mongodb://localhost:27017/test/",
  "ignore" : [ "**/.*", "*~", "*.sw?", "*.old", "*.bak", "**/node_modules"]
};
var fr = new Loader(__dirname, options);
```

## Loader.use([String pattern,] Function middleware)

Add a middleware to be executed on either all files or those matching the given pattern. The middleware is given the document associated with the file, and a callback that can be called in two ways:

- if the file matches a single document, then it should be called once with a potential error and the final document.
- if the file must be exploded into multiple subdocuments, then it should be called multiple times with either a subdocument or an error, and one last time without any argument when all subdocuments have been submitted.

```javascript
var fr = new Loader(__dirname);

/**
 * Just modify the documents associated with .txt files
 */
fr.use('**/*.txt', function (doc, submit) {
  doc.name = doc.basename.toUpperCase();
  submit(null, doc);
});

/**
 * Explode the documents of .csv files into multiple subdocuments
 */
fr.use('**/*.csv', function (doc, submit) {
  // read as UTF-8 so that content is a string rather than a Buffer
  require('fs').readFile(doc.location, 'utf8', function (err, content) {
    if (err) { return submit(err); } // report the read error
    content.split('\n').forEach(function (line) {
      var clonedDoc = {};
      for (var p in doc) { clonedDoc[p] = doc[p]; } // clone the initial document
      clonedDoc.content = line;                     // add the current line as content
      submit(clonedDoc);                            // submit the subdocument
    });
    submit(); // call when all subdocuments have been submitted
  });
});
```

## Loader.sync(Function callback)

Start synchronization between the directory and the MongoDB collection.
**callback** will be called after a complete analysis. Its argument is the number of files/directories that were either cancelled or (re)synchronized with the database.

```javascript
var fr = new Loader(__dirname);
fr.sync(function(processed) {
  console.log('Synchronization done, %d files were either cancelled or checked', processed);
});
```

## Loader.submit(String filename, Function callback)

Manually submit a file to the synchronization system.
**callback** will be called after the file analysis. It receives two arguments: an error (if any) and an object representing the file in the database.

## Loader.syncr.connect(Function callback)

Open a MongoDB connection, or use the existing one. The callback receives a potential error object and a handle to the working collection.
Use this if you want to perform some actions on the collection before you start synchronizing.

```javascript
var fr = new Loader(__dirname);
fr.syncr.connect(function(err, collection) {
  if (err) { return console.error(err); } // connection failed
  collection.ensureIndex({ 'filename': 1 }, function (err) {
    if (err) { return console.error(err); } // index creation failed
    console.log('Added an index on filename, now starting synchronization');
    fr.sync();
  });
});
```

## Events

<table>
  <thead>
    <tr>
      <th>Name(arguments)</th>
      <th>Description</th>
    </tr>
  </thead>
  <tr>
    <td>browseOver(found)</td>
    <td>emitted when the tree is entirely browsed, with the number of items that should be synchronized.</td>
  </tr>
  <tr>
    <td>watching()</td>
    <td>emitted when the initial synchronization is done and the watcher is ready.</td>
  </tr>
  <tr>
    <td>checked(err, file)</td>
    <td>when a file has been (re)synchronized during the initial synchronization</td>
  </tr>
  <tr>
    <td>cancelled(err, file)</td>
    <td>when an ignored file has been removed from the DB</td>
  </tr>
  <tr>
    <td>added(err, file)</td>
    <td>when a file added in the tree has been synchronized with the DB</td>
  </tr>
  <tr>
    <td>changed(err, file)</td>
    <td>when a modified file has been resynchronized with the DB</td>
  </tr>
  <tr>
    <td>dropped(err, file)</td>
    <td>when a file has been unlinked and its related
documents marked as deleted</td>
  </tr>
  <tr>
    <td>preCheck(file)</td>
    <td>when a file is about to be checked. Can be emitted on initial sync, or when a file has been added or changed.</td>
  </tr>
  <tr>
    <td>preCancel(file)</td>
    <td>when a file is about to be cancelled. Can be emitted on initial sync, or when a file has been added or changed.</td>
  </tr>
  <tr>
    <td>preDrop(file)</td>
    <td>when a file is about to be marked as deleted</td>
  </tr>
  <tr>
    <td>saved(document)</td>
    <td>when a document is saved into the database (either inserted or updated)</td>
  </tr>
  <tr>
    <td>loadError(err, file, linenumber)</td>
    <td>when an error occurs in the loader itself.</td>
  </tr>
</table>
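
The events listed in the table can be consumed with ordinary listeners. The sketch below assumes that a Loader instance exposes the Node.js `EventEmitter` interface (an `on` method), which this README implies but does not state. `createSyncReport` is a hypothetical helper, not part of castor-load, and a plain `EventEmitter` stands in for the loader so that the snippet runs without MongoDB.

```javascript
var EventEmitter = require('events').EventEmitter;

/**
 * Hypothetical helper: tally some of the events documented above.
 * `emitter` is anything exposing on(name, listener); that a Loader
 * instance does so is an assumption.
 */
function createSyncReport(emitter) {
  var report = { found: 0, errors: 0, counts: {} };

  // browseOver(found): number of items that should be synchronized
  emitter.on('browseOver', function (found) { report.found = found; });

  // These events all receive (err, file); only the error is inspected here
  ['checked', 'cancelled', 'added', 'changed', 'dropped'].forEach(function (name) {
    report.counts[name] = 0;
    emitter.on(name, function (err) {
      if (err) { report.errors += 1; } else { report.counts[name] += 1; }
    });
  });

  // loadError(err, file, linenumber): an error in the loader itself
  emitter.on('loadError', function () { report.errors += 1; });

  return report;
}

// Demo: a plain EventEmitter stands in for a Loader instance
var fake = new EventEmitter();
var report = createSyncReport(fake);
fake.emit('browseOver', 2);
fake.emit('checked', null, 'a.txt');
fake.emit('checked', new Error('boom'), 'b.txt');
console.log(report);
```

With a real loader, the same helper would be attached before calling `fr.sync()`, so that no event emitted during the initial synchronization is missed.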