UNPKG

@haz3y0ne/parsexl

Version:

Parses Excel formulas into a clean, well-typed abstract syntax tree you can analyse or evaluate in TypeScript.

167 lines (123 loc) 5.8 kB
# parsexl A lightweight, dependency-free TypeScript library that parses raw Microsoft Excel formulas into well‑structured, strongly‑typed Abstract Syntax Trees (ASTs). Perfect for analysis, transformation, or formula tooling—without needing Excel or Office integration. > Inspired by [psalaets/excel-formula-ast](https://github.com/psalaets/excel-formula-ast), this parser reimagines the approach using a Pratt (TDOP) parser instead of a Shunting Yard algorithm, improving readability, control, and extensibility. ![TypeScript](https://img.shields.io/badge/Language-TypeScript-3178C6?style=flat&logo=typescript&logoColor=white) ![Parser](<https://img.shields.io/badge/Parser-Pratt%20(TDOP)-blueviolet>) ![Runtime](https://img.shields.io/badge/Runtime-0%20Dependencies-lightgrey) ![Excel](https://img.shields.io/badge/Excel-Compatible-yellowgreen) --- ## ✨ Highlights | Feature | Description | | ----------------------------- | -------------------------------------------------------------------------------------------------------------------- | | Fully lossless token stream | Lexer preserves every character—sheet names, errors, quotes, and all. | | Pratt (TDOP) parser | Operator precedence, prefix/postfix disambiguation, ranges, spills (`%`) all handled via a precedence-driven design. | | Structured transform pipeline | Clean post-processing via optional AST passes like `collapseBinary` and `normalizeFilters`. | | Type-safe & dependency-free | Published as both ESM and CJS, with full `.d.ts` support and no runtime dependencies. | | Diagnostic-friendly AST | Every node and token includes source `loc` offsets to enable mapping back to original input. | --- ## 📦 Installation ```bash npm i @haz3y0ne/parsexl ``` --- ## 🚀 Quick Start ```ts import { parseFormula } from "excel-formula-ast"; const ast = parseFormula('=SUM(FILTER(A1:C10,(B:B="West")*(C:C>1000)))'); console.dir(ast, { depth: null }); ``` Each node in the tree includes precise `loc.start` and `loc.end` positions, ideal for linting, diagnostics, or source mapping. --- ## 🧐 Example: Formula → AST ### Input ```txt =IF( AND(ISNUMBER(B2), B2>100), TEXT(TODAY(), "mm/dd/yyyy"), IF(ISBLANK(C2), "Missing", "Check") ) ``` ### Output (JSON) ```jsonc { "type": "IF", "args": [ { "type": "AND", "args": [ { "type": "ISNUMBER", "args": [{ "type": "CELL", "value": "B2" }] }, { "type": ">", "args": [ { "type": "CELL", "value": "B2" }, { "type": "NUMBER", "value": 100 } ] } ] }, { "type": "TEXT", "args": [ { "type": "TODAY" }, { "type": "LITERAL", "value": "mm/dd/yyyy" } ] }, { "type": "IF", "args": [ { "type": "ISBLANK", "args": [{ "type": "CELL", "value": "C2" }] }, { "type": "LITERAL", "value": "Missing" }, { "type": "LITERAL", "value": "Check" } ] } ] } ``` --- ## 📊 Supported Excel Functions Supports over 50 Excel worksheet functions across logical, lookup, math, text, and dynamic array categories. | Category | Examples | | ------------------------ | --------------------------------------------------------------------------------- | | Logical / Error Handling | `IF`, `AND`, `OR`, `NOT`, `IFERROR`, `IFNA`, `ISERROR`, `ISBLANK`, `ISNUMBER` | | Lookup & Reference | `INDEX`, `MATCH`, `VLOOKUP`, `XLOOKUP`, `CHOOSE` | | Math & Trigonometry | `SUM`, `ROUND`, `ABS`, `FLOOR`, `CEILING`, `MIN`, `MAX`, `RANK`, `ROUNDUP`, etc. | | Text | `TEXT`, `LEFT`, `RIGHT`, `SUBSTITUTE`, `TEXTJOIN`, `VALUE`, `LEN`, `CONCAT`, etc. | | Statistical | `COUNT`, `COUNTA`, `AVERAGE`, `SUMIF`, `SUMIFS`, `COUNTIF`, `COUNTBLANK` | | Dynamic Arrays | `LET`, `LAMBDA`, `FILTER`, `UNIQUE`, `SORT`, `SEQUENCE`, `RANDARRAY`, `SORTBY` | --- ## ⚙️ Architecture Overview ```bash string tokenize guessToken prattParse collapseBinary normalizeFilters AST ``` ### Components - `lexer/` Regex-based tokenizer (longest-match) - `parser/` Pratt parser with binding power table - `transforms/` Optional AST passes - `types/` All type definitions and function metadata - `parseFormula.ts` High-level orchestrator --- ## 📅 Testing - Lexer snapshots - Round‑trip validation (AST string Excel) - Complex transform edge cases ```bash npm test # Full test suite npm run test:lex # Lexer-only tests ``` --- ## ⚠️ Limitations - Array constants `{}` currently opaque - No full handling of intersection (space) or union (comma) yet - Spill ranges (`#`) are not parsed --- ## 🛠️ Extend It | Task | Where to Modify | | --------------------- | ------------------------------------------------ | | Add new function | `types/functions.ts` | | New token type | `lexer/patterns.ts` | | Custom transform pass | `transforms/` + export via `transforms/index.ts` | | Modify precedence | `parser/pratt.ts` | --- ## 📄 License MIT © 2025 Chris Moran _This parser is a standalone tool built atop the structure and semantics of Microsoft Excel formulas. It is not affiliated with or endorsed by Microsoft._