# modern-ahocorasick

> Forked from `https://github.com/BrunoRB/ahocorasick` and modernized. Thanks to `BrunoRB`, the author of `ahocorasick`.

Implementation of the Aho-Corasick string searching algorithm, as described in the paper "Efficient string matching: an aid to bibliographic search".

This package ships in both `cjs` and `esm` formats and includes a `.d.ts` type declaration file.

## Install

```sh
npm i modern-ahocorasick
yarn add modern-ahocorasick
pnpm i modern-ahocorasick
```

## Usage

```ts
// esm
import AhoCorasick from 'modern-ahocorasick'
// cjs (use one form or the other, not both in the same file)
const AhoCorasick = require('modern-ahocorasick')

const ac = new AhoCorasick(['keyword1', 'keyword2', 'etc'])
const results = ac.search('should find keyword1 at position 19 and keyword2 at position 47.')
// [ [ 19, [ 'keyword1' ] ], [ 47, [ 'keyword2' ] ] ]
```

## Visualization

See <https://brunorb.github.io/ahocorasick/visualization.html> for an interactive visualization of the algorithm.

## API

### Constructor

#### `constructor(keywords: string[])`

Initializes the Aho-Corasick state machine with the provided `keywords`.

**Parameters**:

- `keywords`: An array of strings representing the keywords to search for.

**Example**:

```typescript
const keywords = ['he', 'she', 'his', 'hers']
const ac = new AhoCorasick(keywords)
```

---

### Methods

#### `search(str: string): [number, string[]][]`

Searches the input string `str` for occurrences of any of the keywords and returns a list of matches.

**Parameters**:

- `str`: The input string to search.

**Returns**:

- An array of tuples. Each tuple contains:
  - The ending index of the match in the input string (the index of the last matched character).
  - An array of the keywords that end at that position.

**Example**:

```typescript
const ac = new AhoCorasick(['keyword1', 'keyword2', 'etc'])
const results = ac.search('should find keyword1 at position 19 and keyword2 at position 47.')
// [ [ 19, [ 'keyword1' ] ], [ 47, [ 'keyword2' ] ] ]
```

---

#### `match(str: string): boolean`

Checks whether any keyword occurs in the input string `str`.

**Parameters**:

- `str`: The input string to search.

**Returns**:

- `true` if any keyword is found.
- `false` otherwise.

**Example**:

```typescript
const ac = new AhoCorasick(['he', 'she', 'his', 'hers'])
console.log(ac.match('ushers')) // Output: true
console.log(ac.match('xyz')) // Output: false
```

---

### Internal Functionality

The `_buildTables` method is not part of the public API. It builds the transition (`gotoFn`), output, and failure functions used by the Aho-Corasick algorithm.

---

## Examples

### Example 1: Basic Search

```typescript
const keywords = ['cat', 'bat', 'rat']
const ac = new AhoCorasick(keywords)
const text = 'the cat chased the rat while a bat flew by'
const matches = ac.search(text)
```

### Example 2: Check Match Presence

```typescript
const ac = new AhoCorasick(['abc', '123'])
console.log(ac.match('hello abc world')) // Output: true
console.log(ac.match('hello world')) // Output: false
```

---

### Example 3: Case-Insensitive Search

Matching is case-sensitive, so lowercase both the keywords and the text before searching.

```typescript
const keywords = ['hello', 'world']
const ac = new AhoCorasick(keywords.map(k => k.toLowerCase()))
const text = 'Hello World'
const matches = ac.search(text.toLowerCase())
console.log(matches)
// Output: [
//   [4, ["hello"]],
//   [10, ["world"]]
// ]
```

This document serves as a guide to using the `AhoCorasick` class for multi-pattern string matching.

## License

[The MIT License](LICENSE)
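
## Appendix: Start Positions from `search` Results

Because `search` reports the ending index of each match, callers that need the full span have to derive the start themselves. The sketch below shows one way to do that. The `SearchResult` type, the `Span` interface, and the `toSpans` helper are hypothetical names introduced here for illustration and are not part of the package. The input is the result array copied from the Usage example rather than a live call, so the snippet runs without the library installed.

```typescript
// Shape of the value returned by `search`, as documented in the API section.
type SearchResult = [number, string[]][]

// Hypothetical helper type: one entry per matched keyword, inclusive indices.
interface Span {
  keyword: string
  start: number
  end: number
}

// Hypothetical helper: expand each [endIndex, keywords] tuple into spans.
// The start index is derived from the ending index and the keyword length.
function toSpans(results: SearchResult): Span[] {
  const spans: Span[] = []
  for (const [endIndex, keywords] of results) {
    for (const keyword of keywords) {
      spans.push({ keyword, start: endIndex - keyword.length + 1, end: endIndex })
    }
  }
  return spans
}

// Text and result copied from the Usage example.
const text = 'should find keyword1 at position 19 and keyword2 at position 47.'
const results: SearchResult = [[19, ['keyword1']], [47, ['keyword2']]]

const spans = toSpans(results)
// `end` is inclusive, so add 1 when slicing the matched text back out.
const slices = spans.map(span => text.slice(span.start, span.end + 1))
console.log(spans, slices)
```

Keeping `end` inclusive mirrors what `search` returns, so the only adjustment needed is the `+ 1` when passing it to `String.prototype.slice`.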