Chrono
======
A natural language date parser in JavaScript, designed for extracting date information from any given text. (A Java version is also available [here](https://github.com/wanasit/chrono-java).)
Chrono supports most date and time formats, such as:
* Today, Tomorrow, Yesterday, Last Friday, etc
* 17 August 2013 - 19 August 2013
* This Friday from 13:00 - 16.00
* 5 days ago
* 2 weeks from now
* Sat Aug 17 2013 18:40:39 GMT+0900 (JST)
* 2014-11-30T08:15:30-05:30
## Install
#### npm (recommended)
Just run:

    $ npm i --save chrono-node

And start using chrono:

    var chrono = require('chrono-node')
    chrono.parseDate('An appointment on Sep 12-13')

#### Bower
Prefer Bower? You can do that, too. Just run:

    $ bower install chrono

And use:
```html
<script src="bower_components/chrono/chrono.min.js"></script>
<script>chrono.parseDate('An appointment on Sep 12-13')</script>
```
#### Other Options:
Doing something else? No worries. Try these:
Platform | Installation
---------|----
CDN | Via [jsDelivr]:<br> `<script src="https://cdn.jsdelivr.net/chrono/VERSION/chrono.min.js"></script>`
Rails | Install from [Rails Assets] by adding this to your Gemfile:<br> `gem 'rails-assets-chrono', source: 'https://rails-assets.org'`
Swift | Try using the community-made [chrono-swift] wrapper.
[Rails Assets]: https://rails-assets.org/
[jsDelivr]: https://www.jsdelivr.com/projects/chrono
[chrono-swift]: https://github.com/neilsardesai/chrono-swift
#### Browserify
Chrono's modules are linked and packaged using [Browserify](http://browserify.org) on `src/chrono.js`. By default, the `chrono.js` file exports the `chrono` object as a window global.
```
browserify src/chrono.js --standalone chrono -o chrono.js
```
## Usage
Simply pass a string to the function `chrono.parseDate` or `chrono.parse`.
```javascript
> var chrono = require('chrono-node')
> chrono.parseDate('An appointment on Sep 12-13')
Fri Sep 12 2014 12:00:00 GMT-0500 (CDT)
> chrono.parse('An appointment on Sep 12-13');
[ { index: 18,
    text: 'Sep 12-13',
    tags: { ENMonthNameMiddleEndianParser: true },
    start:
     { knownValues: [Object],
       impliedValues: [Object] },
    end:
     { knownValues: [Object],
       impliedValues: [Object] } } ]
```
### Reference Date
Today's "Friday" is different from last month's "Friday".
The meaning of the referenced dates depends on when they are mentioned.
Chrono lets you define a reference date using `chrono.parse(text, ref)` and `chrono.parseDate(text, ref)`.
```javascript
> chrono.parseDate('Friday', new Date(2012,7,23));
Fri Aug 24 2012 12:00:00 GMT+0700 (ICT)
> chrono.parseDate('Friday', new Date(2012,7,1));
Fri Aug 03 2012 12:00:00 GMT+0700 (ICT)
```
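To see why the reference date matters, the sketch below resolves a bare weekday name against the same two reference dates. The `closestWeekday` helper and its "pick the nearest occurrence" rule are assumptions made for illustration only; this is not Chrono's actual implementation.

```javascript
// Hypothetical helper (not part of Chrono): resolve a weekday number
// (0 = Sunday ... 6 = Saturday) to the occurrence closest to `ref`.
function closestWeekday(ref, weekday) {
  // Days until the next occurrence, in the range 0..6.
  var forward = (weekday - ref.getDay() + 7) % 7;
  // Use the previous occurrence when it is nearer than the next one.
  var offset = forward <= 3 ? forward : forward - 7;
  // Noon, mirroring the 12:00 shown in the outputs above.
  return new Date(ref.getFullYear(), ref.getMonth(), ref.getDate() + offset, 12, 0, 0);
}

var FRIDAY = 5;
// JavaScript months are zero-based, so 7 means August.
var fromLateAugust = closestWeekday(new Date(2012, 7, 23), FRIDAY);
var fromEarlyAugust = closestWeekday(new Date(2012, 7, 1), FRIDAY);

console.log(fromLateAugust.toDateString());  // Fri Aug 24 2012
console.log(fromEarlyAugust.toDateString()); // Fri Aug 03 2012
```

The same word maps to two different dates only because the reference date changed.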
### Detailed Parsed Results
The function `chrono.parse` returns detailed parsing results as objects of class `chrono.ParsedResult`.
```javascript
var results = chrono.parse('I have an appointment tomorrow from 10 to 11 AM')
results[0].index // 15
results[0].text // 'tomorrow from 10 to 11 AM'
results[0].ref // Sat Dec 13 2014 21:50:14 GMT-0600 (CST)
results[0].start.date() // Sun Dec 14 2014 10:00:00 GMT-0600 (CST)
results[0].end.date() // Sun Dec 14 2014 11:00:00 GMT-0600 (CST)
```
#### ParsedResult
* `start` The parsed date components as a [ParsedComponents](#parsedcomponents) object
* `end` Similar to `start` but can be null.
* `index` The location within the input text of this result
* `text` The text of this result as it appears in the input
* `ref` The [reference date](#reference-date) of this result
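Because `end` can be null, code that reads a result usually needs a guard. The following is a minimal sketch of one way to flatten a result into a plain object; `summarizeResult` is a hypothetical helper, not part of Chrono, and it relies only on the fields listed above.

```javascript
// Hypothetical helper: flatten one ParsedResult into a plain object.
function summarizeResult(result) {
  return {
    text: result.text,
    index: result.index,
    start: result.start.date(),
    // `end` can be null, so guard before calling date() on it.
    end: result.end ? result.end.date() : null
  };
}
```

With Chrono loaded, this could be applied to every result with `chrono.parse(text).map(summarizeResult)`.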
#### ParsedComponents
A group of found date and time components (year, month, hour, etc). ParsedComponents objects consist of `knownValues` and `impliedValues`.
* `assign(component, value)` Sets a known value for the component
* `imply(component, value)` Sets an implied value for the component
* `get(component)` Gets the known or implied value of the component
* `isCertain(component)` Returns true if the value of the component is known
* `date()` Creates a JavaScript `Date`
```javascript
// Remove the timezone offset of a parsed date and then create the Date object
> var results = chrono.parse('2016-03-08T01:16:07+02:00'); // Parse into an array of ParsedResult objects
> results[0].start.assign('timezoneOffset', 0); // Change value in ParsedComponents Object 'start'
> var d = results[0].start.date(); // Create a Date object
> d.toString(); // Display resulting Date object
'Tue Mar 08 2016 01:16:07 GMT+0000 (GMT)'
```
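The distinction between known and implied values can be inspected with `isCertain` and `get`. Below is a small sketch under that assumption; `describeCertainty` is a hypothetical helper, and the component names passed to it are whatever the caller chooses.

```javascript
// Hypothetical helper: split component names into those stated in the text
// (known) and those that had to be inferred (implied).
function describeCertainty(components, names) {
  var report = { known: {}, implied: {} };
  names.forEach(function (name) {
    var bucket = components.isCertain(name) ? report.known : report.implied;
    bucket[name] = components.get(name);
  });
  return report;
}
```

For example, `describeCertainty(results[0].start, ['year', 'month', 'day', 'hour'])` would report which of those four components came from the input text, assuming those component names are the ones in use.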
### Strict vs Casual
Chrono comes with a `strict` mode that parses only formal date patterns.
```javascript
// 'strict' mode
chrono.strict.parseDate('Today'); // null
chrono.strict.parseDate('Friday'); // null
chrono.strict.parseDate('2016-07-01'); // Fri Jul 01 2016 12:00:00 ...
chrono.strict.parseDate('Jul 01 2016'); // Fri Jul 01 2016 12:00:00 ...
// 'casual' mode (default)
chrono.parseDate('Today'); // Thu Jun 30 2016 12:00:00 ...
chrono.casual.parseDate('Friday'); // Fri Jul 01 2016 12:00:00 ...
chrono.casual.parseDate('Jul 01 2016'); // Fri Jul 01 2016 12:00:00 ...
```
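One possible way to combine the two modes is to try `strict` first and fall back to `casual` only when nothing formal is found. The sketch below assumes exactly the `chrono.strict.parseDate` and `chrono.casual.parseDate` calls shown above; `parseStrictFirst` itself is a hypothetical helper, and passing an undefined `ref` is assumed to behave like omitting it.

```javascript
// Hypothetical helper: prefer a strict match, fall back to casual parsing.
// `chrono` is passed in, so any object exposing strict.parseDate and
// casual.parseDate works.
function parseStrictFirst(chrono, text, ref) {
  var strictDate = chrono.strict.parseDate(text, ref);
  if (strictDate) {
    return { date: strictDate, mode: 'strict' };
  }
  var casualDate = chrono.casual.parseDate(text, ref);
  // mode is null when neither parser found a date.
  return { date: casualDate, mode: casualDate ? 'casual' : null };
}
```

The returned `mode` lets the caller treat casual matches with more suspicion, for instance by asking the user to confirm them.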
## Customize Chrono
Chrono's extraction pipeline is separated into two main phases: 'parse' and 'refine'. During parsing, 'parsers' (`Parser`) extract patterns from the input text. The parsed results ([ParsedResult](#parsedresult)) are then combined, sorted, and refined using 'refiners' (`Refiner`). In the refining phase, the results can be combined, filtered out, or annotated with additional information.
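The two phases can be modeled roughly as follows. This is a simplified, hypothetical model and not Chrono's real code: `runPipeline` is an invented name, each parser is reduced to a plain function that returns an array of results, and each refiner is an object with a `refine(text, results, opt)` method.

```javascript
// Simplified, hypothetical model of a parse-then-refine pipeline.
function runPipeline(parsers, refiners, text, ref, opt) {
  var results = [];
  // Parse phase: every parser contributes the results it can extract.
  parsers.forEach(function (parse) {
    results = results.concat(parse(text, ref, opt));
  });
  // Combine and sort by position in the input text.
  results.sort(function (a, b) { return a.index - b.index; });
  // Refine phase: each refiner receives the previous refiner's output.
  refiners.forEach(function (refiner) {
    results = refiner.refine(text, results, opt);
  });
  return results;
}

// Toy parser factory: find one literal word and report where it occurs.
function wordParser(word) {
  return function (text) {
    var index = text.indexOf(word);
    return index < 0 ? [] : [{ index: index, text: word }];
  };
}

// Toy refiner: filter out matches of five characters or fewer.
var dropShort = {
  refine: function (text, results) {
    return results.filter(function (r) { return r.text.length > 5; });
  }
};

var refined = runPipeline(
  [wordParser('tomorrow'), wordParser('today')],
  [dropShort],
  'today or tomorrow',
  new Date()
);
console.log(refined); // [ { index: 9, text: 'tomorrow' } ]
```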
### Parser
A parser is a module for low-level, pattern-based parsing. Ideally, each parser should handle a single specific date format. Users can add new types of parsers to support new date formats or languages.
```javascript
var christmasParser = new chrono.Parser();

// Provide the search pattern
christmasParser.pattern = function () { return /Christmas/i }

// This function is called when a match for the pattern is found
christmasParser.extract = function(text, ref, match, opt) {

    // Return a parsed result for 25 December
    return new chrono.ParsedResult({
        ref: ref,
        text: match[0],
        index: match.index,
        start: {
            day: 25,
            month: 12,
        }
    });
}

// Create a new custom Chrono. The initial pipeline 'option' can also be specified as
// - new chrono.Chrono(chrono.options.strictOption())
// - new chrono.Chrono(chrono.options.casualOption())
var custom = new chrono.Chrono();
custom.parsers.push(christmasParser);

custom.parseDate("I'll arrive at 2.30AM on Christmas night")
// Wed Dec 25 2013 02:30:00 GMT+0900 (JST)
```
To create a custom parser, override the `pattern` and `extract` methods on an object of class `chrono.Parser`.
* The `pattern` method must return a `RegExp` object for the search pattern.
* The `extract` method is called with the [match](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec) object when the pattern is found. It must create and return a [result](#parsedresult) (or null to skip the match).
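The same two overrides can be wrapped in a factory when several fixed-date holidays are needed. The sketch below uses only the `chrono.Parser` and `chrono.ParsedResult` constructors shown in the example above; `makeFixedDateParser` and its parameters are hypothetical names introduced for illustration.

```javascript
// Hypothetical factory: build a parser for a holiday that falls on a fixed day.
// `chrono` is passed in explicitly; `name`, `month` and `day` are illustrative.
function makeFixedDateParser(chrono, name, month, day) {
  var parser = new chrono.Parser();
  // Escape regex metacharacters so the name is matched literally.
  var escaped = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

  parser.pattern = function () { return new RegExp(escaped, 'i'); };

  parser.extract = function (text, ref, match, opt) {
    return new chrono.ParsedResult({
      ref: ref,
      text: match[0],
      index: match.index,
      start: { day: day, month: month }
    });
  };
  return parser;
}
```

A custom instance could then register it with `custom.parsers.push(makeFixedDateParser(chrono, 'Halloween', 10, 31))`.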
### Refiner
A refiner is a higher-level module for improving or manipulating the results. Users can add new types of refiners to customize Chrono's results or to add custom logic to Chrono.
```javascript
var guessPMRefiner = new chrono.Refiner();
guessPMRefiner.refine = function(text, results, opt) {
    // If no AM/PM (meridiem) is specified,
    // treat times from 1:00 up to (but not including) 4:00 as PM (13:00 to 16:00)
    results.forEach(function (result) {
        if (!result.start.isCertain('meridiem')
            && result.start.get('hour') >= 1 && result.start.get('hour') < 4) {

            result.start.assign('meridiem', 1);
            result.start.assign('hour', result.start.get('hour') + 12);
        }
    });
    return results;
}

// Create a new custom Chrono. The initial pipeline 'option' can also be specified as
// - new chrono.Chrono(chrono.options.strictOption())
// - new chrono.Chrono(chrono.options.casualOption())
var custom = new chrono.Chrono();
custom.refiners.push(guessPMRefiner);

// This will be parsed as PM.
// > Tue Dec 16 2014 14:30:00 GMT-0600 (CST)
custom.parseDate("This is at 2.30");

// Unless the 'AM' part is specified
// > Tue Dec 16 2014 02:30:00 GMT-0600 (CST)
custom.parseDate("This is at 2.30 AM");
```
In the example, a custom refiner assigns PM to parsing results with an ambiguous [meridiem](http://en.wikipedia.org/wiki/12-hour_clock). The `refine` method of the refiner is called with the parsing [results](#parsedresult) (from [parsers](#parser) or previous refiners). The method must return an array of results (in this case, the same results, modified in place).
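A refiner can also filter results out rather than modify them. As a minimal sketch, the hypothetical `makeMinLengthRefiner` factory below drops matches whose text is shorter than a chosen length and returns a new array instead of editing the input; it assumes only the `chrono.Refiner` constructor and the `refine(text, results, opt)` signature used above.

```javascript
// Hypothetical factory: a refiner that drops results whose matched text is
// shorter than `minLength` characters.
function makeMinLengthRefiner(chrono, minLength) {
  var refiner = new chrono.Refiner();
  refiner.refine = function (text, results, opt) {
    // filter() returns a new array, leaving the input array untouched.
    return results.filter(function (result) {
      return result.text.length >= minLength;
    });
  };
  return refiner;
}
```

It would be registered like any other refiner, for example `custom.refiners.push(makeMinLengthRefiner(chrono, 4))`.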
## Development Guide
This guide explains how to set up the Chrono project for prospective contributors.
```bash
# Clone and install library
git clone https://github.com/wanasit/chrono.git chrono
cd chrono
npm install
# Try running the test
npm run test
```
Chrono's source files are in the `src` directory. The bundle (`chrono.js` and `chrono.min.js`) is built by [Browserify](http://browserify.org) from `src/chrono.js` using the following command:
```
npm run make
```
Parsing dates from text is complicated. Sometimes a small change has effects in unexpected places, so Chrono is a heavily tested library. Commits that break a test should not be accepted under any circumstances.
Chrono's unit tests are based on [QUnit](https://qunitjs.com/) and [Karma](https://github.com/karma-runner/karma). During development, I recommend running the Karma tests together with watchify.
```
# Start karma
npm run karma
# Start watch (run on a different terminal)
npm run watch
```