castor-load
Traverse a directory to build a MongoDB collection from the files found in it, then keep the directory and the collection synchronised.
* [Nicolas Thouvenin](https://github.com/touv)
* [Yannick Schurter](https://github.com/nojhamster)
With [npm](http://npmjs.org) do:

    $ npm install castor-load

Use [mocha](https://github.com/visionmedia/mocha) to run the tests.

    $ npm install mocha
    $ mocha test

Create a new object to synchronise a **directory** with a MongoDB collection.
* `connexionURI` - *string* - URL to connect to MongoDB (see [documentation](http://docs.mongodb.org/manual/reference/connection-string/)). If not specified, the environment variable `MONGO_URL` is looked up : *default : 'mongodb://localhost:27017/test/'*
* `ignore` - *array* - List of paths to ignore (either [minimatch](https://github.com/isaacs/minimatch) patterns or Regex) : *default : empty*
* `include` - *array* - List of node types to handle (directory and/or files) : *default : ['files']*
* `collectionName` - *string* - MongoDB collection name : *default : automatic*
* `concurrency` - *number* - Define how many files/documents can be processed in parallel : *default : 1*
* `maxFileSize` - *string* - Maximum size of files, beyond which they will be rejected : *default : 128mb*
* `delay` - *number* - Delay of file processing when the stack is full (milliseconds) : *default : 1000*
* `writeConcern` - *number/string* - Write concern level used for insertions. (see [documentation](http://docs.mongodb.org/manual/reference/write-concern/))
* `watch` - *boolean* - enable tree watching after the initial synchronization. *default: true*
* `dateConfig` - *date* - (Optional) arbitrary date appended to files metadata. Files whose dateConfig has changed will be resynchronized, regardless of their modification date.
* `strictCompare` - *boolean* - always check file content when a change is detected, regardless of its modification date. Should be used when files can get multiple changes in a short period of time. *default : false*
* `modifier` - *function(baseDoc)* - a function to modify the base document of a file upon its creation.
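
The lookup order described for `connexionURI` can be pictured with a small sketch. The helper name `resolveConnexionURI` is hypothetical and is not part of castor-load; it only mirrors the documented precedence (explicit option, then the `MONGO_URL` environment variable, then the default value).

```javascript
// Hypothetical helper: NOT part of castor-load, only an illustration
// of the documented precedence for the connection string.
var DEFAULT_URI = 'mongodb://localhost:27017/test/';

function resolveConnexionURI(options, env) {
  options = options || {};
  env = env || process.env;
  // 1. explicit option, 2. MONGO_URL environment variable, 3. default
  return options.connexionURI || env.MONGO_URL || DEFAULT_URI;
}

// Made-up URIs, used only to exercise the three cases
console.log(resolveConnexionURI({ connexionURI: 'mongodb://db.example:27017/files/' }, {}));
console.log(resolveConnexionURI({}, { MONGO_URL: 'mongodb://localhost:27017/other/' }));
console.log(resolveConnexionURI({}, {}));
```
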
```javascript
// Assumption: the package exports the Loader constructor
var Loader = require('castor-load');

var options = {
  "connexionURI" : "mongodb://localhost:27017/test/",
  "ignore" : [ "**/.*", "*~", "*.sw?", "*.old", "*.bak", "**/node_modules" ]
};
var fr = new Loader(__dirname, options);
```
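A fuller options object might look like the sketch below. All keys come from the list above, but the values are illustrative only, and the behaviour of `modifier` (mutating the base document in place) is an assumption, since the list does not say whether a return value is used.

```javascript
// Illustrative values only; every key is one of the documented options.
var options = {
  include: ['files'],
  concurrency: 4,        // process up to 4 files/documents in parallel
  maxFileSize: '64mb',   // reject files larger than this
  delay: 500,            // wait 500 ms when the stack is full
  watch: false,          // do not watch the tree after the initial synchronization
  strictCompare: true,   // always compare file content when a change is detected
  // Assumption: the base document is modified in place. It is also
  // returned here, in case the return value is what gets used.
  modifier: function (baseDoc) {
    baseDoc.project = 'demo'; // hypothetical extra field
    return baseDoc;
  }
};

// With castor-load installed, the object would be passed to the constructor:
// var fr = new Loader(__dirname, options);
```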
Add a middleware to be executed on either all files or those matching the given pattern. The middleware is given the document associated with the file, and a callback that can be called in two ways:
- if the file matches a single document, then it should be called once with a potential error and the final document.
- if the file must be exploded into multiple subdocuments, then it should be called multiple times with either a subdocument or an error, and one last time without any argument when all subdocuments have been submitted.
```javascript
var fr = new Loader(__dirname);

/**
 * Just modify the documents associated with .txt files
 */
fr.use('**/*.txt', function (doc, submit) {
  doc.name = doc.basename.toUpperCase();
  submit(null, doc);
});

/**
 * Explode the documents of .csv files into multiple subdocuments
 */
fr.use('**/*.csv', function (doc, submit) {
  // read as UTF-8 so that `content` is a string and can be split
  require('fs').readFile(doc.location, 'utf8', function (err, content) {
    content.split('\n').forEach(function (line) {
      var clonedDoc = {};
      for (var p in doc) { clonedDoc[p] = doc[p]; } // clone the initial document
      clonedDoc.content = line;                     // add the current line as content
      submit(clonedDoc);                            // submit the subdocument
    });
    submit(); // call when all subdocuments have been submitted
  });
});
```
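Because a middleware is a plain function of `(doc, submit)`, it can be exercised without a database. The sketch below is one possible way to do so: `runMiddleware`, `upperCaseName`, `explodeLines` and the `doc.lines` field are all hypothetical names introduced for illustration, and the harness only works for middlewares that call `submit` synchronously.

```javascript
// Hypothetical test harness: records the arguments of every `submit` call,
// following the two calling conventions described above.
function runMiddleware(middleware, doc) {
  var calls = [];
  middleware(doc, function submit() {
    calls.push(Array.prototype.slice.call(arguments));
  });
  return calls;
}

// Single-document middleware: one call with (error, document).
function upperCaseName(doc, submit) {
  doc.name = doc.basename.toUpperCase();
  submit(null, doc);
}

// Multi-document middleware: one call per subdocument, then a final empty call.
// `doc.lines` is a made-up field standing in for content read from disk.
function explodeLines(doc, submit) {
  doc.lines.forEach(function (line) {
    submit({ basename: doc.basename, content: line });
  });
  submit();
}

var single = runMiddleware(upperCaseName, { basename: 'notes.txt' });
var multi = runMiddleware(explodeLines, { basename: 'data.csv', lines: ['a,b', 'c,d'] });
console.log(single.length, multi.length);
```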
Start synchronization between the directory and the MongoDB collection.
**callback** will be called after a complete analysis. Its argument is the number of files/directories that were either cancelled or (re)synchronized with the database.
```javascript
var fr = new Loader(__dirname);
fr.sync(function (processed) {
  console.log('Synchronization done, %d files were either cancelled or checked', processed);
});
```
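When the rest of an application is Promise-based, the callback can be wrapped. The helper `syncAsync` below is hypothetical and not part of castor-load; it relies only on the behaviour described above, namely that the callback receives the number of processed files/directories. A stand-in object replaces the real loader so that the sketch runs without MongoDB.

```javascript
// Hypothetical helper, not part of castor-load: wraps `sync` in a Promise.
function syncAsync(loader) {
  return new Promise(function (resolve) {
    loader.sync(function (processed) {
      resolve(processed);
    });
  });
}

// Stand-in for a Loader instance, used only to keep the sketch self-contained.
var fakeLoader = {
  sync: function (callback) { callback(3); }
};

var pending = syncAsync(fakeLoader);
pending.then(function (processed) {
  console.log('Synchronization done, %d files were either cancelled or checked', processed);
});
```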
Manually submit a file to the synchronization system.
**callback** will be called after the file analysis. It takes two arguments: an error (if any) and an object representing the file in the database.
## Loader.syncr.connect(Function callback)
Open a MongoDB connection, or use the existing one. The callback returns a potential error object and a handle to the working collection. Use this if you want to perform some actions on the collection before you start synchronizing.
```javascript
var fr = new Loader(__dirname);
fr.syncr.connect(function (err, collection) {
  collection.ensureIndex({ 'filename': 1 }, function (err) {
    console.log('Added an index on filename, now starting synchronization');
    fr.sync();
  });
});
```
<table>
<thead>
<tr>
<th>Name(arguments)</th>
<th>Description</th>
</tr>
</thead>
<tr>
<td>browseOver(found)</td>
<td>emitted when the tree is entirely browsed, with the number of items that should be synchronized.</td>
</tr>
<tr>
<td>watching()</td>
<td>emitted when the initial synchronization is done and the watcher is ready.</td>
</tr>
<tr>
<td>checked(err, file)</td>
<td>when a file has been (re)synchronized during the initial synchronization</td>
</tr>
<tr>
<td>cancelled(err, file)</td>
<td>when an ignored file has been removed from the DB</td>
</tr>
<tr>
<td>added(err, file)</td>
<td>when a file added in the tree has been synchronized with the DB</td>
</tr>
<tr>
<td>changed(err, file)</td>
<td>when a modified file has been resynchronized with the DB</td>
</tr>
<tr>
<td>dropped(err, file)</td>
<td>when a file has been unlinked and its related documents marked as deleted</td>
</tr>
<tr>
<td>preCheck(file)</td>
<td>when a file is about to be checked. Can be emitted on initial sync, or when a file has been added or changed.</td>
</tr>
<tr>
<td>preCancel(file)</td>
<td>when a file is about to be cancelled. Can be emitted on initial sync, or when a file has been added or changed.</td>
</tr>
<tr>
<td>preDrop(file)</td>
<td>when a file is about to be marked as deleted</td>
</tr>
<tr>
<td>saved(document)</td>
<td>when a document is saved into the database (either inserted or updated)</td>
</tr>
<tr>
<td>loadError(err, file, linenumber)</td>
<td>when an error occurs in the loader itself.</td>
</tr>
</table>
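
The events above could be consumed as in the following sketch. It assumes that a Loader instance exposes the standard Node.js `on` method and that `err` is an `Error` object; neither is stated explicitly here. `attachLogging` is a hypothetical helper, and a plain `EventEmitter` stands in for the loader so that the example runs on its own.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: registers listeners for a few of the events listed
// in the table. Listener arguments follow the table.
function attachLogging(loader, log) {
  loader.on('browseOver', function (found) {
    log('tree browsed, ' + found + ' item(s) to synchronize');
  });
  loader.on('saved', function (document) {
    log('document saved');
  });
  loader.on('loadError', function (err, file, linenumber) {
    log('loader error: ' + err.message);
  });
  return loader;
}

// A plain EventEmitter stands in for a Loader instance.
var messages = [];
var fake = attachLogging(new EventEmitter(), function (message) {
  messages.push(message);
});
fake.emit('browseOver', 2);
fake.emit('loadError', new Error('boom'), 'a.txt', 10);
console.log(messages);
```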