# node-darts
Node.js Native Addon for Darts (Double-ARray Trie System)
## Overview
`node-darts` is a Node.js native addon that provides bindings to the C++ Darts (Double-ARray Trie System) library. It allows you to use `.darts` dictionary files in Node.js/TypeScript environments with high performance.
## Features
- Load and use `.darts` dictionary files created by Perl's `Text::Darts`
- Build dictionaries from key-value pairs
- Fast exact match search and common prefix search
- Text replacement using dictionary words
- Traverse the trie structure
- Asynchronous and synchronous APIs
- TypeScript support
- ESModule and CommonJS support
- Class-based interface similar to Perl's `Text::Darts`
## Installation
```bash
npm install node-darts
# or
yarn add node-darts
```
## Requirements
- Node.js v20.0.0 or later
- C++ compiler with C++17 support
### Windows-specific Requirements
If you're on Windows, you'll need:
- Visual Studio Build Tools with C++ workload
- Python 2.7 or 3.x
The package will attempt to install these dependencies automatically during installation if they're missing. If automatic installation fails, you can install them manually:
1. Install [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) with the "Desktop development with C++" workload
2. Install [Python](https://www.python.org/downloads/) (2.7 or 3.x)
3. Set the npm config: `npm config set msvs_version 2019`
## Basic Usage
```javascript
// Using ESM
import { loadDictionary, TextDarts } from 'node-darts';
// Using CommonJS
// const { loadDictionary, TextDarts } = require('node-darts');
// Load an existing dictionary file
const dict = loadDictionary('/path/to/dictionary.darts');
// Text replacement using the loaded dictionary
const text = 'I like apple and pineapple for breakfast.';
const replaced = dict.replaceWords(text, (word) => `<b>${word}</b>`);
console.log(replaced); // "I like <b>apple</b> and <b>pineapple</b> for breakfast."
// You can also use an object mapping for replacement
const mapping = {
  apple: '🍎',
  pineapple: '🍍',
};
const replaced2 = dict.replaceWords(text, mapping);
console.log(replaced2); // "I like 🍎 and 🍍 for breakfast."
// Exact match search
console.log(dict.exactMatchSearch('apple')); // Returns the value if found
console.log(dict.exactMatchSearch('grape')); // -1 (not found)
// Common prefix search
const results = dict.commonPrefixSearch('pineapple');
console.log(results); // Array of found values
// Clean up resources
dict.dispose();
// Alternative: Using TextDarts class
const darts = TextDarts.load('/path/to/dictionary.darts');
const replaced3 = darts.replaceWords(text, mapping);
console.log(replaced3); // "I like 🍎 and 🍍 for breakfast."
darts.dispose();
```
## Creating a Dictionary
If you don't have a dictionary file yet, you can create one:
```javascript
import { buildDictionary, buildAndSaveDictionary } from 'node-darts';
// Create a dictionary from keys and values
const keys = ['apple', 'banana', 'orange', 'pineapple', 'strawberry'];
const values = [100, 200, 300, 400, 500];
// Build in memory
const dict = buildDictionary(keys, values);
// Or build and save to a file
await buildAndSaveDictionary(keys, '/path/to/output.darts', values);
```
## API Reference
### Dictionary Class
- `exactMatchSearch(key: string): number` - Performs an exact match search
- `commonPrefixSearch(key: string): number[]` - Performs a common prefix search
- `replaceWords(text: string, replacer: WordReplacer): string` - Searches for dictionary words in a text and replaces them
- `traverse(key: string, callback: TraverseCallback): void` - Traverses the trie
- `load(filePath: string): Promise<boolean>` - Loads a dictionary file asynchronously
- `loadSync(filePath: string): boolean` - Loads a dictionary file synchronously
- `size(): number` - Gets the size of the dictionary
- `dispose(): void` - Releases resources
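To make the difference between the two lookups concrete, here is a plain-JavaScript stand-in. It is only a sketch: `makeToyDictionary` is a hypothetical helper, it does not use the native double-array trie, and it assumes the usual Darts meaning of a common prefix search (every dictionary key that is a prefix of the query).

```javascript
// Plain-JavaScript stand-in for the two lookups. This is NOT how node-darts
// works internally; it only mirrors the results described above.
// `makeToyDictionary` is a hypothetical helper.
function makeToyDictionary(keys, values) {
  const table = new Map(keys.map((key, i) => [key, values[i]]));
  return {
    // Value stored for `key`, or -1 when the key is absent.
    exactMatchSearch(key) {
      return table.has(key) ? table.get(key) : -1;
    },
    // Values of every dictionary key that is a prefix of `query`,
    // shortest key first.
    commonPrefixSearch(query) {
      const found = [];
      for (let end = 1; end <= query.length; end += 1) {
        const prefix = query.slice(0, end);
        if (table.has(prefix)) found.push(table.get(prefix));
      }
      return found;
    },
  };
}

const toy = makeToyDictionary(['pine', 'pineapple', 'apple'], [10, 20, 30]);
console.log(toy.exactMatchSearch('pine'));  // 10
console.log(toy.exactMatchSearch('grape')); // -1
console.log(toy.commonPrefixSearch('pineapple').join(', ')); // 10, 20
```

Note that `apple` is not reported for the query `pineapple`: it occurs inside the query but is not a prefix of it.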
### Builder Class
- `build(keys: string[], values?: number[], options?: BuildOptions): Dictionary` - Builds a Double-Array
- `buildAndSave(keys: string[], filePath: string, values?: number[], options?: BuildOptions): Promise<boolean>` - Builds and saves asynchronously
- `buildAndSaveSync(keys: string[], filePath: string, values?: number[], options?: BuildOptions): boolean` - Builds and saves synchronously
### TextDarts Class
The `TextDarts` class provides a class-based interface similar to Perl's `Text::Darts` module. It offers a more object-oriented approach to working with dictionaries and releases its resources automatically when the object is garbage collected.
> **Note:** The `TextDarts` class uses the Factory Method pattern and cannot be instantiated directly with `new TextDarts()`. Instead, use the static factory methods (`TextDarts.build()`, `TextDarts.load()`, or `TextDarts.new()`) to create instances. This design encapsulates the complexity of object creation, allows object types to be determined at runtime, centralizes validation logic, and makes future changes to the creation process easier.
#### Static Methods
- `static new(source: string[] | string, values?: number[]): TextDarts` - Creates a new TextDarts object from words or a dictionary file
- `static build(keys: string[], values?: number[], options?: BuildOptions): TextDarts` - Creates a new TextDarts object from a word list
- `static load(filePath: string): TextDarts` - Creates a new TextDarts object from a dictionary file
- `static buildAndSave(keys: string[], filePath: string, values?: number[], options?: BuildOptions): Promise<boolean>` - Builds and saves a dictionary asynchronously
- `static buildAndSaveSync(keys: string[], filePath: string, values?: number[], options?: BuildOptions): boolean` - Builds and saves a dictionary synchronously
#### Instance Methods
- `replaceWords(text: string, replacer: WordReplacer): string` - Searches for dictionary words in a text and replaces them
- `exactMatchSearch(key: string): number` - Performs an exact match search
- `commonPrefixSearch(key: string): number[]` - Performs a common prefix search
- `traverse(key: string, callback: TraverseCallback): void` - Traverses the trie
- `load(filePath: string): Promise<boolean>` - Loads a dictionary file asynchronously
- `loadSync(filePath: string): boolean` - Loads a dictionary file synchronously
- `size(): number` - Gets the size of the dictionary
- `dispose(): void` - Releases resources (optional, resources will be automatically released when the object is garbage collected)
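Although `dispose()` is optional here, garbage-collection timing is not deterministic, so code that creates many short-lived dictionaries may prefer to release them explicitly. One possible pattern is a small `try`/`finally` wrapper. `withDictionary` below is a hypothetical helper, not part of node-darts, and the example uses a stand-in object so it runs without the native addon.

```javascript
// `withDictionary` is a hypothetical helper, not part of node-darts.
// It works with any object that exposes a dispose() method.
function withDictionary(dict, work) {
  try {
    return work(dict);
  } finally {
    dict.dispose(); // runs whether `work` returned or threw
  }
}

// Stand-in object so the sketch runs without the native addon.
const fakeDict = {
  disposed: false,
  exactMatchSearch: (key) => (key === 'apple' ? 1 : -1),
  dispose() {
    this.disposed = true;
  },
};

const value = withDictionary(fakeDict, (d) => d.exactMatchSearch('apple'));
console.log(value, fakeDict.disposed); // 1 true
```

The wrapper is synchronous: if `work` returns a promise, `dispose()` runs before that promise settles, so asynchronous work would need an `async` variant that awaits the result first.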
### Helper Functions
- `createDictionary(): Dictionary` - Creates a new Dictionary object
- `loadDictionary(filePath: string): Dictionary` - Loads a dictionary from a file
- `buildDictionary(keys: string[], values?: number[], options?: BuildOptions): Dictionary` - Builds a dictionary from keys and values
- `buildAndSaveDictionary(keys: string[], filePath: string, values?: number[], options?: BuildOptions): Promise<boolean>` - Builds and saves a dictionary asynchronously
- `buildAndSaveDictionarySync(keys: string[], filePath: string, values?: number[], options?: BuildOptions): boolean` - Builds and saves a dictionary synchronously
### WordReplacer Type
The `WordReplacer` type can be either:
1. A function that takes a matched word and returns a replacement string:
   ```typescript
   (match: string) => string;
   ```
2. An object mapping words to their replacements:
   ```typescript
   Record<string, string>;
   ```
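The two forms can be treated uniformly by converting an object mapping into a function. The sketch below shows one way to do that. `toReplacerFunction` is a hypothetical helper rather than a node-darts export, and it assumes that a matched word missing from the mapping is left unchanged, which the library may or may not do.

```javascript
// `toReplacerFunction` is a hypothetical helper, not a node-darts export.
// Assumption: words missing from an object mapping are left unchanged.
function toReplacerFunction(replacer) {
  if (typeof replacer === 'function') return replacer;
  return (match) =>
    Object.prototype.hasOwnProperty.call(replacer, match) ? replacer[match] : match;
}

const viaFunction = toReplacerFunction((word) => `<b>${word}</b>`);
const viaMapping = toReplacerFunction({ apple: '🍎' });

console.log(viaFunction('apple')); // <b>apple</b>
console.log(viaMapping('apple'));  // 🍎
console.log(viaMapping('banana')); // banana
```

Using `hasOwnProperty` avoids accidental hits on inherited keys such as `toString` when the matched word happens to share a name with an `Object.prototype` member.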
### Build Options
- `progressCallback?: (current: number, total: number) => void` - Callback function for build progress
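As an illustration, a callback with this `(current, total)` shape could throttle its output to whole percentages. `makeProgressReporter` is a hypothetical helper, and the calls at the end are simulated so the sketch runs without the native addon. How often the library invokes the callback, and with which values, is not specified here.

```javascript
// `makeProgressReporter` is a hypothetical helper. It returns a function with
// the (current, total) shape of `progressCallback` and reports each whole
// percentage at most once in a row.
function makeProgressReporter(report = console.log) {
  let lastPercent = -1;
  return (current, total) => {
    const percent = total > 0 ? Math.floor((current / total) * 100) : 100;
    if (percent !== lastPercent) {
      lastPercent = percent;
      report(`build progress: ${percent}%`);
    }
  };
}

// Simulated calls; a real build would invoke the callback itself.
const messages = [];
const progressCallback = makeProgressReporter((line) => messages.push(line));
[[1, 4], [1, 4], [2, 4], [4, 4]].forEach(([current, total]) => {
  progressCallback(current, total);
});
console.log(messages.join(' | '));
// build progress: 25% | build progress: 50% | build progress: 100%
```

With the real library, the returned function would be passed in the options argument, for example `buildDictionary(keys, values, { progressCallback: makeProgressReporter() })`.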
## Examples
See the [examples](./examples) directory for more usage examples:
- [Basic Usage](./examples/basic-usage.js)
- [Dictionary Builder](./examples/dictionary-builder.js)
- [Text Replacement](./examples/text-replacement.js)
- [Auto-Complete](./examples/auto-complete.js)
- [Error Handling](./examples/error-handling.js)
- [Morphological Analysis](./examples/morphological-analysis.js)
### Text Replacement Example
The `replaceWords` method allows you to search for dictionary words in a text and replace them with custom values. This is useful for tasks like:
- Text normalization
- Entity recognition and highlighting
- Content filtering
- Simple morphological analysis
```javascript
import { buildDictionary } from 'node-darts';
// Create a dictionary
const keys = ['apple', 'banana', 'orange', 'pineapple'];
const values = [1, 2, 3, 4];
const dict = buildDictionary(keys, values);
// Replace words in text using a function
const text = 'I like apple and pineapple.';
const replaced = dict.replaceWords(text, (word) => word.toUpperCase());
console.log(replaced); // "I like APPLE and PINEAPPLE."
// Replace words using an object mapping
const mapping = {
  apple: 'red apple',
  pineapple: 'yellow pineapple',
};
const replaced2 = dict.replaceWords(text, mapping);
console.log(replaced2); // "I like red apple and yellow pineapple."
// Clean up resources
dict.dispose();
```
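For intuition about what dictionary-driven replacement involves, the following naive sketch scans the text and replaces the longest dictionary word found at each position. It assumes leftmost-longest matching, which is consistent with the output above but is not confirmed as the library's exact behavior. `naiveReplaceWords` is hypothetical and much slower than a double-array trie.

```javascript
// Naive illustration of dictionary-driven replacement. Assumption: matching
// is leftmost-longest; node-darts itself may differ in details.
// `naiveReplaceWords` is hypothetical and far slower than a double-array trie.
function naiveReplaceWords(text, words, replacer) {
  const dictionary = new Set(words);
  const maxLength = Math.max(0, ...words.map((w) => w.length));
  let out = '';
  let pos = 0;
  while (pos < text.length) {
    let matched = null;
    // Try the longest candidate first so 'pineapple' wins over 'pine'.
    for (let len = Math.min(maxLength, text.length - pos); len > 0; len -= 1) {
      const candidate = text.slice(pos, pos + len);
      if (dictionary.has(candidate)) {
        matched = candidate;
        break;
      }
    }
    if (matched) {
      out += replacer(matched);
      pos += matched.length;
    } else {
      out += text[pos];
      pos += 1;
    }
  }
  return out;
}

const result = naiveReplaceWords(
  'I like apple and pineapple.',
  ['pine', 'apple', 'pineapple'],
  (word) => word.toUpperCase(),
);
console.log(result); // I like APPLE and PINEAPPLE.
```

A trie avoids the repeated substring checks: it walks the text once per start position and learns about every matching prefix during that single walk.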
### TextDarts Class Example
The TextDarts class provides a more object-oriented approach and includes automatic resource management:
```javascript
import { TextDarts } from 'node-darts';
// Create a TextDarts object from a word list
const keys = ['apple', 'banana', 'orange', 'pineapple'];
const values = [1, 2, 3, 4];
const darts = TextDarts.build(keys, values);
// Perform searches
console.log(darts.exactMatchSearch('apple')); // 1
console.log(darts.commonPrefixSearch('pineapple')); // [4] ('apple' is not a prefix of 'pineapple')
// Replace words in text
const text = 'I like apple and pineapple.';
const replaced = darts.replaceWords(text, (word) => word.toUpperCase());
console.log(replaced); // "I like APPLE and PINEAPPLE."
// Resources will be automatically released when the object is garbage collected
// But you can explicitly release them if needed
darts.dispose();
// Load an existing dictionary file
const loadedDarts = TextDarts.load('/path/to/dictionary.darts');
console.log(loadedDarts.exactMatchSearch('apple')); // Returns the value if found
```
### Advanced Usage: Text Replacement with Context
You can use the `replaceWords` method for more advanced text processing:
```javascript
import { TextDarts } from 'node-darts';
// Create a dictionary with terms to highlight
const terms = ['JavaScript', 'TypeScript', 'Node.js', 'Darts', 'Trie'];
const darts = TextDarts.build(terms);
// Text to process
const article = `
Node.js applications can be written in JavaScript or TypeScript.
This library uses the Darts algorithm to implement a Trie structure.
`;
// Replace with HTML tags for highlighting
const highlighted = darts.replaceWords(article, (term) => {
  return `<span class="highlight">${term}</span>`;
});
console.log(highlighted);
// Output:
// <span class="highlight">Node.js</span> applications can be written in <span class="highlight">JavaScript</span> or <span class="highlight">TypeScript</span>.
// This library uses the <span class="highlight">Darts</span> algorithm to implement a <span class="highlight">Trie</span> structure.
// Clean up resources (optional)
darts.dispose();
```
## Error Handling
The library provides the following custom error classes:
- `DartsError` - Base error class
- `FileNotFoundError` - Thrown when a file is not found
- `InvalidDictionaryError` - Thrown when an invalid dictionary file is encountered
- `BuildError` - Thrown when dictionary building fails
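When handling these errors, it usually makes sense to test for the specific classes before the base class. The sketch below assumes the three specific classes extend `DartsError`, as "base error class" suggests. `describeDartsFailure` is a hypothetical helper that receives the classes as an argument, and the classes defined in the example are local stand-ins so that it runs without the native addon.

```javascript
// `describeDartsFailure` is a hypothetical helper. It takes the error classes
// as an argument; in real code you would pass the classes imported from
// 'node-darts'. Assumption: the specific classes extend DartsError.
function describeDartsFailure(error, classes) {
  const { DartsError, FileNotFoundError, InvalidDictionaryError, BuildError } = classes;
  // Check specific classes before the base class.
  if (error instanceof FileNotFoundError) return 'dictionary file is missing';
  if (error instanceof InvalidDictionaryError) return 'file is not a valid dictionary';
  if (error instanceof BuildError) return 'dictionary build failed';
  if (error instanceof DartsError) return 'other node-darts error';
  return 'unrelated error';
}

// Local stand-in classes that mirror the names listed above.
class DartsError extends Error {}
class FileNotFoundError extends DartsError {}
class InvalidDictionaryError extends DartsError {}
class BuildError extends DartsError {}
const standIns = { DartsError, FileNotFoundError, InvalidDictionaryError, BuildError };

console.log(describeDartsFailure(new FileNotFoundError('nope'), standIns));
// dictionary file is missing
console.log(describeDartsFailure(new DartsError('x'), standIns));
// other node-darts error
console.log(describeDartsFailure(new TypeError('x'), standIns));
// unrelated error
```

In application code, the helper would be called from the `catch` block around a call such as `loadDictionary(path)`, with unrelated errors rethrown.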
## License
MIT
## Acknowledgements
This project uses the Darts (Double-ARray Trie System) library, which is distributed under both the BSD license and the LGPL.
## Implementation Notes
The original Darts library has been modified for C++17 compatibility:
- Removed the `register` keyword, which was deprecated in C++11 and removed in C++17
- No functional changes were made to the library
- The original copyright and license notices have been preserved
These modifications keep the original code otherwise intact while allowing it to compile under modern C++ standards.