@haz3y0ne/parsexl
Version:
Parses Excel formulas into a clean, well-typed abstract syntax tree you can analyse or evaluate in TypeScript.
167 lines (123 loc) • 5.8 kB
Markdown
# parsexl
A lightweight, dependency-free TypeScript library that parses raw Microsoft Excel formulas into well‑structured, strongly‑typed Abstract Syntax Trees (ASTs). Perfect for analysis, transformation, or formula tooling—without needing Excel or Office integration.
> Inspired by [psalaets/excel-formula-ast](https://github.com/psalaets/excel-formula-ast), this parser reimagines the approach using a Pratt (TDOP) parser instead of a Shunting Yard algorithm, improving readability, control, and extensibility.

-blueviolet>)


## ✨ Highlights
| Feature | Description |
| ----------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| Fully lossless token stream | Lexer preserves every character—sheet names, errors, quotes, and all. |
| Pratt (TDOP) parser | Operator precedence, prefix/postfix disambiguation, ranges, spills (`%`) all handled via a precedence-driven design. |
| Structured transform pipeline | Clean post-processing via optional AST passes like `collapseBinary` and `normalizeFilters`. |
| Type-safe & dependency-free | Published as both ESM and CJS, with full `.d.ts` support and no runtime dependencies. |
| Diagnostic-friendly AST | Every node and token includes source `loc` offsets to enable mapping back to original input. |
## 📦 Installation
```bash
npm i @haz3y0ne/parsexl
```
## 🚀 Quick Start
```ts
import { parseFormula } from "excel-formula-ast";
const ast = parseFormula('=SUM(FILTER(A1:C10,(B:B="West")*(C:C>1000)))');
console.dir(ast, { depth: null });
```
Each node in the tree includes precise `loc.start` and `loc.end` positions, ideal for linting, diagnostics, or source mapping.
## 🧐 Example: Formula → AST
### Input
```txt
=IF(
AND(ISNUMBER(B2), B2>100),
TEXT(TODAY(), "mm/dd/yyyy"),
IF(ISBLANK(C2), "Missing", "Check")
)
```
### Output (JSON)
```jsonc
{
"type": "IF",
"args": [
{
"type": "AND",
"args": [
{ "type": "ISNUMBER", "args": [{ "type": "CELL", "value": "B2" }] },
{
"type": ">",
"args": [
{ "type": "CELL", "value": "B2" },
{ "type": "NUMBER", "value": 100 }
]
}
]
},
{
"type": "TEXT",
"args": [
{ "type": "TODAY" },
{ "type": "LITERAL", "value": "mm/dd/yyyy" }
]
},
{
"type": "IF",
"args": [
{ "type": "ISBLANK", "args": [{ "type": "CELL", "value": "C2" }] },
{ "type": "LITERAL", "value": "Missing" },
{ "type": "LITERAL", "value": "Check" }
]
}
]
}
```
## 📊 Supported Excel Functions
Supports over 50 Excel worksheet functions across logical, lookup, math, text, and dynamic array categories.
| Category | Examples |
| ------------------------ | --------------------------------------------------------------------------------- |
| Logical / Error Handling | `IF`, `AND`, `OR`, `NOT`, `IFERROR`, `IFNA`, `ISERROR`, `ISBLANK`, `ISNUMBER` |
| Lookup & Reference | `INDEX`, `MATCH`, `VLOOKUP`, `XLOOKUP`, `CHOOSE` |
| Math & Trigonometry | `SUM`, `ROUND`, `ABS`, `FLOOR`, `CEILING`, `MIN`, `MAX`, `RANK`, `ROUNDUP`, etc. |
| Text | `TEXT`, `LEFT`, `RIGHT`, `SUBSTITUTE`, `TEXTJOIN`, `VALUE`, `LEN`, `CONCAT`, etc. |
| Statistical | `COUNT`, `COUNTA`, `AVERAGE`, `SUMIF`, `SUMIFS`, `COUNTIF`, `COUNTBLANK` |
| Dynamic Arrays | `LET`, `LAMBDA`, `FILTER`, `UNIQUE`, `SORT`, `SEQUENCE`, `RANDARRAY`, `SORTBY` |
## ⚙️ Architecture Overview
```bash
string → tokenize → guessToken → prattParse → collapseBinary → normalizeFilters → AST
```
### Components
- `lexer/` — Regex-based tokenizer (longest-match)
- `parser/` — Pratt parser with binding power table
- `transforms/` — Optional AST passes
- `types/` — All type definitions and function metadata
- `parseFormula.ts` — High-level orchestrator
## 📅 Testing
- Lexer snapshots
- Round‑trip validation (AST → string → Excel)
- Complex transform edge cases
```bash
npm test # Full test suite
npm run test:lex # Lexer-only tests
```
## ⚠️ Limitations
- Array constants `{}` currently opaque
- No full handling of intersection (space) or union (comma) yet
- Spill ranges (`#`) are not parsed
## 🛠️ Extend It
| Task | Where to Modify |
| --------------------- | ------------------------------------------------ |
| Add new function | `types/functions.ts` |
| New token type | `lexer/patterns.ts` |
| Custom transform pass | `transforms/` + export via `transforms/index.ts` |
| Modify precedence | `parser/pratt.ts` |
## 📄 License
MIT © 2025 Chris Moran
_This parser is a standalone tool built atop the structure and semantics of Microsoft Excel formulas. It is not affiliated with or endorsed by Microsoft._