# APG Unicode Parser

Parsers created with [`apg-js`](https://github.com/ldthomas/apg-js) and [`apg-lite`](https://github.com/ldthomas/apg-lite) operate on arrays of positive integers, typically representing character codes. The `apg-unicode` variant extends this by supporting **typed arrays**, enabling more memory-efficient parsing workflows for modern JavaScript environments.

> **Note:** `apg-unicode` does not natively parse Unicode. Instead, Unicode handling must be implemented via SABNF grammar and application logic. Typed arrays and conversion utilities simplify this process. See `./examples/unicode` for an illustration of UTF-8 and UTF-16 parsing without prior transformation.

## Key Features

### Typed Array Support

`apg-unicode` accepts the following input types:

- `Array`
- `Buffer`
- `Uint8Array`
- `Uint16Array`
- `Uint32Array`
- `String` (converted internally to `Uint32Array` of code points)

Using typed arrays, especially `Uint8Array`, can reduce memory usage by up to **75%** for large UTF-8 files, compared with holding the same data as a `Uint32Array` of code points.

### Substring Parsing

Efficiently parse substrings within large strings without slicing or reallocating. Ideal for partial parsing scenarios. See `./examples/substrings` for usage patterns.

## Parser Generation

Like `apg-lite`, `apg-unicode` does **not** include a parser generator.
To generate a grammar object, use the `apg` npm script, for example:

```bash
npm run apg -- -i ./examples/stats/sip.bnf -o ./examples/stats/sip
```

## GitHub Usage

Clone the repo and run the user application and examples from the root directory:

```bash
git clone https://github.com/ldthomas/apg-unicode.git
cd apg-unicode
```

Include the modules in an application with:

```js
import { Parser } from './src/parser.js';
import { Ast } from './src/ast.js';
import { Trace } from './src/tracer.js';
import { Stats } from './src/stats.js';
import { utilities } from './src/utilities.js';
import { identifiers } from './src/identifiers.js';
```

To run the examples, use:

| Command | Description |
| --- | --- |
| `node examples/ast/main` | Demonstrates Abstract Syntax Tree (AST) usage |
| `node examples/trace/main` | Traces the parser through the parse tree |
| `node examples/stats/main` | Collects and displays node hit statistics |
| `node examples/substrings/main` | Parses substrings within a full input string |
| `node examples/unicode/main` | Parses UTF-8 and UTF-16 directly without prior transformation to code points |
| open `examples/web/web.html` in any browser | Illustrates running a parser in a web page. Note that `web-app.js` is created with [esbuild](https://github.com/evanw/esbuild) from `app.js`. Use the script `npm run esbuild`. |

## npm Usage

Install the package from the npm registry. In the application root directory:

```bash
npm install apg-unicode
```

To access the modules in the application:

```js
import { Parser, Ast, Trace, Stats, utilities, identifiers } from 'apg-unicode';
```

## Documentation

The documentation is in the code in [docco](https://davidwalsh.name/javascript-documentation) format.
To generate it, use:

```bash
npm run docco
```

The documentation will then be at `./docs/index.html`. Alternatively, view it [here](https://sabnf.com/docs/apg-unicode/index.html) on the APG website.

## License

`apg-unicode` is licensed under the permissive [MIT](https://github.com/ldthomas/apg-unicode?tab=License-1-ov-file) license.
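
## Appendix: Comparing Input Array Sizes

The memory saving mentioned under Typed Array Support can be illustrated with plain JavaScript, without calling the parser at all. The sketch below is only an illustration: `toCodePoints` is a hypothetical helper written for this example, not a function exported by `apg-unicode`, and the library's own string conversion may be implemented differently. It builds a `Uint32Array` of code points and a `Uint8Array` of UTF-8 bytes from the same text and compares their sizes.

```javascript
// Hypothetical helper (not part of apg-unicode): converts a string to a
// Uint32Array holding one Unicode code point per element.
function toCodePoints(str) {
  // Array.from iterates a string by code point, so surrogate pairs stay together.
  return Uint32Array.from(Array.from(str), (ch) => ch.codePointAt(0));
}

const text = 'APG parses arrays of integers, not strings.';

// One 32-bit element per code point.
const codePoints = toCodePoints(text);

// UTF-8 encoding of the same text: one 8-bit element per byte.
const utf8 = new TextEncoder().encode(text);

// For ASCII-only text, the UTF-8 array needs a quarter of the bytes.
const saving = 1 - utf8.byteLength / codePoints.byteLength;
console.log(`code points: ${codePoints.byteLength} bytes`);
console.log(`UTF-8:       ${utf8.byteLength} bytes`);
console.log(`saving:      ${(saving * 100).toFixed(0)}%`);
```

The 75% figure is the best case, reached when every character is a single UTF-8 byte. Text with many multi-byte characters saves less, and a grammar that consumes the `Uint8Array` directly has to describe UTF-8 byte sequences itself, as `./examples/unicode` illustrates.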