UNPKG

@kamilmielnik/trie

Version:

Trie data structure implementation in TypeScript. Highly performant. No dependencies. Built for a Scrabble Solver.

192 lines (140 loc) β€’ 8.04 kB
# @kamilmielnik/trie ![Version](https://img.shields.io/github/package-json/v/kamilmielnik/trie) ![License](https://img.shields.io/npm/l/@kamilmielnik/trie) ![Node version](https://img.shields.io/node/v/@kamilmielnik/trie) ![Test](https://github.com/kamilmielnik/trie/workflows/Test/badge.svg) ![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen.svg) ![Prettier](https://github.com/kamilmielnik/trie/workflows/Prettier/badge.svg) [Trie](https://en.wikipedia.org/wiki/Trie) data structure implemented in TypeScript: - [Highly performant](#performance) - No dependencies - Lightweight - [Well-documented](docs/README.md) - Built for [Scrabble Solver](https://github.com/kamilmielnik/scrabble-solver) # Table of contents - [Installation](#installation) - [API](#api) - [Object-oriented API](#object-oriented-api) - [Functional API](#functional-api) - [Examples](#examples) - [Load dictionary from file](#load-dictionary-from-file) - [Serialize `Node` to a file](#serialize-node-to-a-file) - [Load serialized `Node` from a file](#load-serialized-node-from-a-file) - [Find all words with given prefix](#find-all-words-with-given-prefix) - [Serialization and compression](#serialization-and-compression) - [Performance](#performance) # Installation ## npm ```Shell npm install @kamilmielnik/trie --save ``` ## yarn ```Shell yarn add @kamilmielnik/trie ``` # [API](docs/README.md) See full [API Docs](docs/README.md) - generated by [typedoc](http://typedoc.org/). Good to know: - all objects are mutable - every class, interface, type, constant and function is exported - all exports are named (there is no default export) There are 2 ways to use the API. ## [Object-oriented API](docs/README.md#classes) Create [`Trie`](docs/classes/Trie.md) instance and use its methods. ### Example ```ts import { Trie } from '@kamilmielnik/trie'; const trie = new Trie(); trie.add('master'); trie.add('mask'); trie.hasPrefix('man'); // false trie.hasPrefix('mas'); // true trie.has('mas'); // false trie.remove('mas'); // false trie.has('master'); // true trie.serialize(); // "(m(a(s(t(e(r))k))))" trie.remove('master'); // true trie.serialize(); // "(m(a(s(k))))" ``` ## [Functional API](docs/README.md#functions) Manipulate existing instances of [`Node`](docs/interfaces/Node.md) with [functions](docs/README.md#functions). ### Example The following example works identically to the object-oriented example above. ```ts import { add, has, hasPrefix, Node, remove, serialize } from '@kamilmielnik/trie'; const root: Node = {}; add(root, 'master'); add(root, 'mask'); hasPrefix(root, 'man'); // false hasPrefix(root, 'mas'); // true has(root, 'mas'); // false remove(root, 'mas'); // false has(root, 'master'); // true serialize(root); // "(m(a(s(t(e(r))k))))" remove(root, 'master'); // true serialize(root); // "(m(a(s(k))))" ``` # Examples - [Load dictionary from file](#load-dictionary-from-file) - [Serialize `Node` to a file](#serialize-node-to-a-file) - [Load serialized `Node` from a file](#load-serialized-node-from-a-file) - [Find all words with given prefix](#find-all-words-with-given-prefix) ## Load dictionary from file ```ts import { fromArray, Node } from '@kamilmielnik/trie'; import fs from 'fs'; const fromFile = (filepath: string): Node => { const file = fs.readFileSync(filepath, 'utf-8'); // Assuming file contains 1 word per line const words = file.split('\n').filter(Boolean); const node = fromArray(words); return node; }; ``` ## Serialize [`Node`](docs/interfaces/Node.md) to a file ```ts import { Trie } from '@kamilmielnik/trie'; import fs from 'fs'; const toFile = (filepath: string, trie: Trie): void => { const serialized = trie.serialize(); fs.writeFileSync(filepath, serialized); }; ``` ## Load serialized [`Node`](docs/interfaces/Node.md) from a file ```ts import { deserialize, Node } from '@kamilmielnik/trie'; import fs from 'fs'; const fromFile = (filepath: string): Node => { const serialized = fs.readFileSync(filepath, 'utf-8'); const node = deserialize(serialized); return node; }; ``` ## Find all words with given prefix ```ts import { find, Node, toArray } from '@kamilmielnik/trie'; const findWordsWithPrefix = (node: Node, prefix: string): string[] => { const prefixNode = find(node, prefix) || {}; const descendants = toArray(prefixNode, { prefix, sort: true, wordsOnly: true }); const words = descendants.map(({ prefix: word }) => word); return words; }; ``` # Serialization and compression This package can be used to efficiently [serialize](docs/functions/serialize.md) and compress dictionaries. It reaches 54.79 [compression ratio](https://en.wikipedia.org/wiki/Data_compression_ratio) (98.17% space saving) for Polish dictionary when combined with [7-Zip](https://en.wikipedia.org/wiki/7z) at ultra compression level. | Language | πŸ‡ΊπŸ‡Έ en-US | πŸ‡¬πŸ‡§ en-GB | πŸ‡΅πŸ‡± pl-PL | | ------------------------------------------------------------------- | ----------------------------------------------------------------------- | --------------------------------------------------------------------------- | ----------------------------------------- | | Name | [TWL06](https://en.wikipedia.org/wiki/NASPA_Word_List) | [SOWPODS](https://en.wikipedia.org/wiki/Collins_Scrabble_Words) | [SJP.PL](https://sjp.pl/slownik/dp.phtml) | | Source | [Download](https://www.wordgamedictionary.com/twl06/download/twl06.txt) | [Download](https://www.wordgamedictionary.com/sowpods/download/sowpods.txt) | [Download](https://sjp.pl/slownik/growy/) | | Words count | 178,691 | 267,753 | 3,045,878 | | File size [B] | 1,941,856 | 2,974,773 | 42,838,508 | | File size (serialized) [B] | (-29.75%) 1,364,242 | (-31.57%) 2,035,697 | (-56.33%) 18,705,990 | | File size ([7z](https://en.wikipedia.org/wiki/7z)) [B] | (-80.46%) 379,483 | (-81.04%) 563,913 | (-87.58%) 5,320,397 | | File size (serialized + [7z](https://en.wikipedia.org/wiki/7z)) [B] | (-89.94%) 195,299 | (-90.40%) 285,430 | (-98.17%) 781,875 | # Performance [`add`](docs/functions/add.md), [`find`](docs/functions/find.md), [`has`](docs/functions/has.md), [`hasPrefix`](docs/functions/hasPrefix.md), [`remove`](docs/functions/remove.md) are very fast - $O(\log_2 n)$ (millions of operations per second). ![image](https://user-images.githubusercontent.com/6830683/100113381-d5b63580-2ea2-11eb-8ed6-0e03515b5d9a.png) --- [`deserialize`](docs/functions/deserialize.md), [`fromArray`](docs/functions/fromArray.md), [`serialize`](docs/functions/serialize.md), [`toArray`](docs/functions/toArray.md) are slow - $O(n)$ (few operations per second). ![image](https://user-images.githubusercontent.com/6830683/100113526-fe3e2f80-2ea2-11eb-9b1e-22ae954e297a.png)