# modern-ahocorasick
> Forked from `https://github.com/BrunoRB/ahocorasick` and modernized. Thanks to `BrunoRB`, the author of `ahocorasick`.
Implementation of the Aho-Corasick string searching algorithm, as described in the paper "Efficient string matching: an aid to bibliographic search".
This package ships in both `cjs` and `esm` formats and includes a `.d.ts` type declaration file.
## Install
```sh
npm i modern-ahocorasick
yarn add modern-ahocorasick
pnpm i modern-ahocorasick
```
## Usage
```ts
// esm
import AhoCorasick from 'modern-ahocorasick'
// cjs (use this instead of the import above, not together with it)
// const AhoCorasick = require('modern-ahocorasick')

const ac = new AhoCorasick(['keyword1', 'keyword2', 'etc'])
const results = ac.search('should find keyword1 at position 19 and keyword2 at position 47.')
// [ [ 19, [ 'keyword1' ] ], [ 47, [ 'keyword2' ] ] ]
```
## Visualization
See <https://brunorb.github.io/ahocorasick/visualization.html> for an interactive visualization of the algorithm.
## API
### Constructor
#### `constructor(keywords: string[])`
Initializes the Aho-Corasick state machine with the provided `keywords`.
**Parameters**:
- `keywords`: An array of strings representing the keywords to search for.
**Example**:
```typescript
const keywords = ['he', 'she', 'his', 'hers']
const ac = new AhoCorasick(keywords)
```
---
### Methods
#### `search(str: string): [number, string[]][]`
Searches the input string `str` for occurrences of any of the keywords and returns a list of matches.
**Parameters**:
- `str`: The input string to search.
**Returns**:
- An array of tuples. Each tuple contains:
  - The ending index of the match in the input string (the zero-based position of the match's last character).
  - An array of the keywords that end at that position.
**Example**:
```typescript
const ac = new AhoCorasick(['keyword1', 'keyword2', 'etc'])
const results = ac.search('should find keyword1 at position 19 and keyword2 at position 47.')
// [ [ 19, [ 'keyword1' ] ], [ 47, [ 'keyword2' ] ] ]
```
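
Because `search` reports the index where each match ends, callers that need the start of a match have to derive it from the keyword length. The following is a minimal sketch of such a conversion. The `toSpans` helper and the `KeywordSpan` type are hypothetical names introduced here for illustration, not part of the package, and the sample input is the result shown above rather than a live call.

```typescript
// Shape of the value returned by `search`, as documented above.
type SearchTuples = [number, string[]][]

// Hypothetical helper type: one entry per matched keyword.
interface KeywordSpan {
  keyword: string
  start: number // index of the first character (inclusive)
  end: number // index of the last character (inclusive)
}

// Expand [endIndex, keywords[]] tuples into explicit start/end spans.
function toSpans(results: SearchTuples): KeywordSpan[] {
  const spans: KeywordSpan[] = []
  for (const [end, keywords] of results) {
    for (const keyword of keywords) {
      spans.push({ keyword, start: end - keyword.length + 1, end })
    }
  }
  return spans
}

// Sample data copied from the example output above.
const sampleResults: SearchTuples = [[19, ['keyword1']], [47, ['keyword2']]]
console.log(toSpans(sampleResults))
// [ { keyword: 'keyword1', start: 12, end: 19 },
//   { keyword: 'keyword2', start: 40, end: 47 } ]
```

With both indices inclusive, `text.slice(span.start, span.end + 1)` recovers the matched keyword from the original text.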
---
#### `match(str: string): boolean`
Checks if any keyword exists in the input string `str`.
**Parameters**:
- `str`: The input string to search.
**Returns**:
- `true` if any keyword is found.
- `false` otherwise.
**Example**:
```typescript
const ac = new AhoCorasick(['he', 'she', 'his', 'hers'])
console.log(ac.match('ushers')) // Output: true
console.log(ac.match('xyz')) // Output: false
```
---
### Internal Functionality
While the `_buildTables` method is not part of the public API, it is responsible for building the transition (`gotoFn`), output, and failure functions used by the Aho-Corasick algorithm.
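
To make the roles of the three tables concrete, here is a self-contained sketch of a textbook Aho-Corasick automaton. It is an illustration written for this README under the assumption that the tables follow the classic construction. `TinyAhoCorasick` and its fields are hypothetical names, and the code is not the package's actual source.

```typescript
// Illustrative only: a compact textbook Aho-Corasick automaton.
class TinyAhoCorasick {
  // gotoFn[state] maps a character to the next state; state 0 is the root.
  private gotoFn: Map<string, number>[] = [new Map<string, number>()]
  // output[state] lists the keywords recognised on reaching that state.
  private output: string[][] = [[]]
  // failure[state] is the state to fall back to when no transition exists.
  private failure: number[] = [0]

  constructor(keywords: string[]) {
    // Phase 1: build the trie (goto function) and seed the outputs.
    for (const word of keywords) {
      let state = 0
      for (let i = 0; i < word.length; i++) {
        const ch = word.charAt(i)
        let next = this.gotoFn[state].get(ch)
        if (next === undefined) {
          next = this.gotoFn.length
          this.gotoFn.push(new Map<string, number>())
          this.output.push([])
          this.failure.push(0)
          this.gotoFn[state].set(ch, next)
        }
        state = next
      }
      this.output[state].push(word)
    }
    // Phase 2: breadth-first pass to compute failure links and merge outputs.
    const queue: number[] = Array.from(this.gotoFn[0].values())
    while (queue.length > 0) {
      const r = queue.shift()!
      this.gotoFn[r].forEach((s, ch) => {
        queue.push(s)
        let f = this.failure[r]
        while (f !== 0 && !this.gotoFn[f].has(ch)) f = this.failure[f]
        this.failure[s] = this.gotoFn[f].get(ch) ?? 0
        // Keywords ending at the failure state also end at s.
        this.output[s].push(...this.output[this.failure[s]])
      })
    }
  }

  // Scan the text once, following failure links on mismatches.
  search(text: string): [number, string[]][] {
    const results: [number, string[]][] = []
    let state = 0
    for (let i = 0; i < text.length; i++) {
      const ch = text.charAt(i)
      while (state !== 0 && !this.gotoFn[state].has(ch)) state = this.failure[state]
      state = this.gotoFn[state].get(ch) ?? 0
      if (this.output[state].length > 0) results.push([i, this.output[state].slice()])
    }
    return results
  }
}

const demo = new TinyAhoCorasick(['he', 'she', 'his', 'hers'])
console.log(JSON.stringify(demo.search('ushers')))
// [[3,["she","he"]],[5,["hers"]]]
```

The failure links are what let the scan report `he` at the same position as `she` without re-reading any input character: the state for `she` falls back to the state for `he`, so their outputs are merged when the tables are built.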
---
## Examples
### Example 1: Basic Search
```typescript
const keywords = ['cat', 'bat', 'rat']
const ac = new AhoCorasick(keywords)
const text = 'the cat chased the rat while a bat flew by'
const matches = ac.search(text)
console.log(matches)
// Output: [ [ 6, [ 'cat' ] ], [ 21, [ 'rat' ] ], [ 33, [ 'bat' ] ] ]
```
### Example 2: Check Match Presence
```typescript
const ac = new AhoCorasick(['abc', '123'])
console.log(ac.match('hello abc world')) // Output: true
console.log(ac.match('hello world')) // Output: false
```
---
### Example 3: Case-Insensitive Search
```typescript
const keywords = ['hello', 'world']
const ac = new AhoCorasick(keywords.map(k => k.toLowerCase()))
const text = 'Hello World'
const matches = ac.search(text.toLowerCase())
console.log(matches)
// Output: [
// [4, ["hello"]],
// [10, ["world"]]
// ]
```
This document serves as a complete guide for using the `AhoCorasick` class for multi-pattern string matching.
## License
[The MIT License](LICENSE)