# StAX-XML

A high-performance, pull-based XML parser for JavaScript/TypeScript inspired by Java's StAX (Streaming API for XML). It offers both **fully asynchronous, stream-based parsing** for large files and **synchronous parsing** for smaller, in-memory XML documents. Unlike traditional XML-to-JSON mappers, StAX-XML lets you map XML data to any custom structure while handling XML efficiently through streaming or direct string processing.

### 🚀 Features

- **Fully Asynchronous (Stream-based)**: Memory-efficient processing of large XML files.
- **Synchronous (String-based)**: High-performance parsing of smaller, in-memory XML strings.
- **Pull-based Parsing**: The consumer requests events one at a time instead of receiving callbacks, so large files can be processed without building the whole document in memory.
- **Custom Mapping**: Map XML data to any structure you want, not just plain JSON objects.
- **High Performance**: Optimized for speed and low memory usage.
- **Universal Compatibility**: Works in Node.js, Bun, Deno, and web browsers using only Web Standard APIs.
- **Namespace Support**: Basic XML namespace handling.
- **Entity Support**: Built-in entity decoding with custom entity support.
- **TypeScript Ready**: Full TypeScript support with comprehensive type definitions.

### 📦 Installation

```bash
# npm
npm install stax-xml

# yarn
yarn add stax-xml

# pnpm
pnpm add stax-xml

# bun
bun add stax-xml

# deno
deno add npm:stax-xml
```

### 🔧 Quick Start

The examples in this section cover basic usage. For detailed usage and API reference, see the dedicated documentation files:

- [**StaxXmlParser (Asynchronous)**](docs/StaxXmlParser.md): For parsing XML from a `ReadableStream`.
- [**StaxXmlParserSync (Synchronous)**](docs/StaxXmlParserSync.md): For parsing XML from a `string`.
- [**StaxXmlWriter**](docs/StaxXmlWriter.md): For writing XML to a `string`.
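
#### Custom Mapping Sketch

With a pull-based parser, mapping XML to a custom structure comes down to iterating over events and filling in your own types as you go. The sketch below illustrates that idea only. It assumes a hypothetical `SketchEvent` shape and uses hand-written events in place of real parser output; it does not use stax-xml's actual event types, which are described in the documentation files listed above.

```typescript
// Hypothetical event shape, for illustration only.
// stax-xml's real event types may differ; see the linked docs.
type SketchEvent =
  | { kind: 'start'; name: string }
  | { kind: 'text'; value: string }
  | { kind: 'end'; name: string };

interface Book {
  title: string;
  price: number;
}

// Pull events one at a time and build domain objects directly,
// without materializing an intermediate JSON tree.
function mapBooks(events: Iterable<SketchEvent>): Book[] {
  const books: Book[] = [];
  let current: Partial<Book> | null = null;
  let text = '';

  for (const event of events) {
    switch (event.kind) {
      case 'start':
        if (event.name === 'book') current = {};
        text = ''; // reset the text buffer for each new element
        break;
      case 'text':
        text += event.value;
        break;
      case 'end':
        if (current && event.name === 'title') current.title = text.trim();
        if (current && event.name === 'price') current.price = Number(text);
        if (current && event.name === 'book') {
          books.push({ title: current.title ?? '', price: current.price ?? 0 });
          current = null;
        }
        text = '';
        break;
    }
  }
  return books;
}

// Hand-written events standing in for parser output for:
// <books><book><title>XML Basics</title><price>9.5</price></book></books>
const sampleEvents: SketchEvent[] = [
  { kind: 'start', name: 'books' },
  { kind: 'start', name: 'book' },
  { kind: 'start', name: 'title' },
  { kind: 'text', value: 'XML Basics' },
  { kind: 'end', name: 'title' },
  { kind: 'start', name: 'price' },
  { kind: 'text', value: '9.5' },
  { kind: 'end', name: 'price' },
  { kind: 'end', name: 'book' },
  { kind: 'end', name: 'books' },
];

const books = mapBooks(sampleEvents);
console.log(books);
```

Because `mapBooks` accepts any `Iterable`, the same loop shape would work over a synchronous parser's events once the hypothetical event fields are replaced with the library's real ones.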

#### Basic Asynchronous Parsing (StaxXmlParser)

```typescript
import { StaxXmlParser } from 'stax-xml';

const xmlContent = '<root><item>Hello</item></root>';

// Wrap the string in a ReadableStream of UTF-8 bytes.
const stream = new ReadableStream({
  start(controller) {
    controller.enqueue(new TextEncoder().encode(xmlContent));
    controller.close();
  }
});

async function parseXml() {
  const parser = new StaxXmlParser(stream);
  // Events are pulled one at a time as the stream is consumed.
  for await (const event of parser) {
    console.log(event);
  }
}

// Surface parse errors instead of leaving the promise unhandled.
parseXml().catch(console.error);
```

#### Basic Synchronous Parsing (StaxXmlParserSync)

```typescript
import { StaxXmlParserSync } from 'stax-xml';

const xmlContent = '<data><value>123</value></data>';

// The synchronous parser works directly on an in-memory string.
const parser = new StaxXmlParserSync(xmlContent);
for (const event of parser) {
  console.log(event);
}
```

### 🌐 Platform Compatibility

StAX-XML uses only Web Standard APIs, making it compatible with:

- **Node.js** (v18+)
- **Bun** (any version)
- **Deno** (any version)
- **Web Browsers** (modern browsers)
- **Edge Runtime** (Vercel, Cloudflare Workers, etc.)

### 🧪 Testing

```bash
bun test
```

#### Benchmark Results

**Disclaimer:** These benchmarks were run on one specific system (`cpu: 13th Gen Intel(R) Core(TM) i5-13600K`, `runtime: node 22.17.0 (x64-win32)`); results may vary on other hardware and environments.

**large.xml (97MB) parsing**

| Benchmark           | avg             | p75             | Memory (avg) |
| :------------------ | :-------------- | :-------------- | :----------- |
| stax-xml to object  | 4.36 s/iter     | 4.42 s          | 2.66 MB      |
| stax-xml consume    | 3.61 s/iter     | 3.65 s          | 3.13 MB      |
| xml2js              | 6.00 s/iter     | 6.00 s          | 1.80 MB      |
| fast-xml-parser     | 4.25 s/iter     | 4.26 s          | 151.81 MB    |
| txml                | 1.05 s/iter     | 1.06 s          | 179.81 MB    |

**midsize.xml (13MB) parsing**

| Benchmark           | avg             | p75             | Memory (avg) |
| :------------------ | :-------------- | :-------------- | :----------- |
| stax-xml to object  | 492.06 ms/iter  | 493.28 ms       | 326.28 KB    |
| stax-xml consume    | 469.66 ms/iter  | 471.54 ms       | 174.51 KB    |
| xml2js              | 163.26 µs/iter  | 161.20 µs       | 89.89 KB     |
| fast-xml-parser     | 529.99 ms/iter  | 531.12 ms       | 1.92 MB      |
| txml                | 112.81 ms/iter  | 113.26 ms       | 1.00 MB      |

**complex.xml (2KB) parsing**

| Benchmark           | avg             | p75             | Memory (avg) |
| :------------------ | :-------------- | :-------------- | :----------- |
| stax-xml to object  | 85.79 µs/iter   | 75.60 µs        | 105.11 KB    |
| stax-xml consume    | 50.38 µs/iter   | 49.43 µs        | 271.12 B     |
| xml2js              | 147.45 µs/iter  | 153.50 µs       | 89.42 KB     |
| fast-xml-parser     | 101.11 µs/iter  | 102.20 µs       | 92.92 KB     |
| txml                | 9.40 µs/iter    | 9.41 µs         | 125.89 B     |

**books.xml (4KB) parsing**

| Benchmark           | avg             | p75             | Memory (avg) |
| :------------------ | :-------------- | :-------------- | :----------- |
| stax-xml to object  | 166.73 µs/iter  | 156.20 µs       | 221.40 KB    |
| stax-xml consume    | 176.45 µs/iter  | 151.70 µs       | 202.08 KB    |
| xml2js              | 259.90 µs/iter  | 254.50 µs       | 161.25 KB    |
| fast-xml-parser     | 239.57 µs/iter  | 203.30 µs       | 226.17 KB    |
| txml                | 19.18 µs/iter   | 19.26 µs        | 303.13 B     |

### 📁 Sample File Sources

Sources of the sample XML files used in testing:

- `books.xml`: [Microsoft XML Document Examples](https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85))
- `simple-namespace.xml`: [W3Schools XML Namespaces Guide](https://www.w3schools.com/xml/xml_namespaces.asp)
- `treebank_e.xml`: [University of Washington XML Data Repository](https://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/www/repository.html)

### 📄 License

MIT

### 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.