# streaming-json

This package implements streaming versions of [`JSON.parse`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse) and [`JSON.stringify`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify) functionality.

Read [the full API documentation](https://jswalden.github.io/streaming-json/) or a high-level package overview below.

The operations in this package behave consistently with ECMAScript semantics, but modifications to various standard-library functionality can interfere with these semantics. (And of course user code between `stringify` iteration or `add(fragment)` operations can perform actions that alter the intermediate states dictated by ECMAScript semantics.)

## Stringification

This package implements a `stringify` function that returns an iterable iterator over the fragments that constitute the JSON stringification of a value:

```js
import { stringify } from "@jswalden/streaming-json";

// Write each fragment as it is produced, so the full JSON text
// never has to exist in memory as a single string.
async function writeAsJSONToFileAsync(value, file) {
  for (const frag of stringify(value, null, " ")) {
    await file.write(frag);
  }
}
```

`stringify` implements JSON stringification for cases where it's undesirable (or impossible, because the entire stringification is too large to represent as a JS string or in memory) to compute the entire JSON string at once. It accepts the same arguments as `JSON.stringify` (albeit with narrower types, for clearer code). It returns an iterable iterator that yields successive fragments of the overall JSON stringification.[^between-emits]

[^between-emits]: If the object graph being stringified is modified between calls to the iterator's `next()` function, stringification behavior will change in potentially unexpected ways. You should take care to protect the value being stringified from modification during the stringification process to prevent confusing behavior.

Where fragment boundaries are placed is explicitly not defined.
Thus for example `stringify(true, null, "")` might successively yield `"t"`, `"ru"`, `"e"`, or instead simply `"true"`. Don't make semantically visible distinctions based on where these boundaries occur!

If any operation during iteration throws (e.g. property gets, `toJSON` invocations, stray `bigint` values in the graph), the `next()` call that triggers that operation will throw the same value.

As long as type signatures are respected, the stringification performed by `stringify` is the same as that performed by `JSON.stringify(value, replacer, space)`. However, one special case must be noted: if `JSON.stringify` would return the literal value `undefined` and not a string value[^stringify-not-string], the iterator returned by `stringify` will produce no fragments:

```js
import assert from "node:assert"; // Node.js built-in; any assertion helper works
import { stringify } from "@jswalden/streaming-json";

const value = () => 42;

const res = JSON.stringify(value, null, 2);
assert(res === undefined); // not a string value!

const frags = [...stringify(value, null, 2)];
assert(frags.length === 0);
```

[^stringify-not-string]: `JSON.stringify` returns `undefined` if the `value` passed to it is `undefined`, a [symbol](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol), a callable object (i.e. `typeof value === "function"`), or an object whose `toJSON` property is a function that returns one of these values. It also returns `undefined` if a `replacer` function is supplied and `replacer`, when invoked for `value`, returns `undefined`, a symbol, or a callable object.

It's incumbent upon users who stringify sufficiently broad values or use insufficiently cautious `replacer` functions to handle the case where no fragments are iterated.

## Parsing

This package exports a `StreamingJSONParser` class that can be used to incrementally parse fragments of a full JSON text.
Create a `StreamingJSONParser`, feed it JSON fragments using `add(fragment)`, and then finish parsing and retrieve the result using `finish()`. If desired, pass `finish()` a `reviver` that behaves as the optional `reviver` argument to `JSON.parse` would:

```js
import assert from "node:assert"; // Node.js built-in; any assertion helper works
import { StreamingJSONParser } from "@jswalden/streaming-json";

const parser = new StreamingJSONParser();

// Fragments may split the JSON text anywhere, even inside a string or number.
parser.add("{");
parser.add('"property');
parser.add('Name": 1');
parser.add('7, "complex": {');
parser.add("}}");

const result = parser.finish();
assert(typeof result === "object" && result !== null);
assert(result.propertyName === 17);
assert(typeof result.complex === "object" && result.complex !== null);
assert(Object.keys(result.complex).length === 0);

const withReviver = new StreamingJSONParser();
withReviver.add("true");

const resultWithReviver = withReviver.finish(function(_name, _value) {
  // throws away `this[_name] === _value` where `_value === true`
  return 42;
});
assert(resultWithReviver === 42);
```

If the fragments can't be the prefix of valid JSON, the `add(fragment)` that creates this condition will throw a `SyntaxError`. If the fragments aren't valid JSON at the time `finish()` is called, `finish()` will throw a `SyntaxError`. `add(fragment)` and `finish()` may only be called while parsing is incomplete and has not fallen into error: once parsing has finished or failed, the parser is no longer usable.

## Known issues

### `stringify` misinterprets boxed primitives from other globals as records

`JSON.stringify` treats boxed primitives, e.g. `new Boolean(false)`, as if they were the primitive value. This happens even for boxed primitives from other global objects/realms, e.g. `new (window.open("about:blank").Boolean)(false)`.
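
As a small illustration of the unboxing behavior just described, the following sketch calls the built-in `JSON.stringify` on boxed primitives. It only uses boxed primitives from the current global, because creating an object in another global needs an environment-specific facility such as `window.open`. The variable names are illustrative only.

```js
// Boxed primitives are serialized as the primitives they wrap,
// not as objects with properties.
const boxedFalse = JSON.stringify(new Boolean(false));
const boxedInObject = JSON.stringify({ n: new Number(17), s: new String("x") });
const boxedInArray = JSON.stringify([new Boolean(true), new String("")]);

console.log(boxedFalse);    // false
console.log(boxedInObject); // {"n":17,"s":"x"}
console.log(boxedInArray);  // [true,""]
```
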
It's not possible to detect cross-global boxed primitives without substantially slowing down the stringification of objects that aren't boxed primitives.[^inefficient] Therefore this package's `stringify` function [doesn't recognize cross-global boxed primitives as such](https://github.com/jswalden/streaming-json/issues/1) and instead interprets them as records of property/value pairs.

[^inefficient]: Boxed primitives can be detected and unboxed, regardless of global/realm, using [`Number.prototype.valueOf`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/valueOf) and similar: if the object is that kind of boxed primitive, the function call returns the primitive value, and if not it throws a `TypeError`. But an object that isn't a boxed primitive would incur exception creation/throwing/catching overhead *four times*, once each for `Number`, `String`, `Boolean`, and `BigInt`: unacceptable overhead when cross-global objects are likely never used.
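
The `valueOf`-based detection described in the footnote above can be sketched roughly as follows. This is only an illustration of the technique and of where its cost comes from. `unboxIfBoxed` and `UNBOXERS` are hypothetical names, not part of this package's API, and the example exercises only same-global objects.

```js
// Hypothetical helper, not part of this package: returns the primitive wrapped
// by a boxed Number/String/Boolean/BigInt, or `value` itself otherwise.
const UNBOXERS = [
  Number.prototype.valueOf,
  String.prototype.valueOf,
  Boolean.prototype.valueOf,
  BigInt.prototype.valueOf,
];

function unboxIfBoxed(value) {
  // Only objects can be boxed primitives.
  if (typeof value !== "object" || value === null) {
    return value;
  }
  for (const unbox of UNBOXERS) {
    try {
      // Succeeds only if `value` is this kind of boxed primitive.
      return unbox.call(value);
    } catch {
      // TypeError: not this kind of boxed primitive, so try the next kind.
    }
  }
  // An ordinary object reaches this point only after four thrown exceptions.
  return value;
}

const unboxedNumber = unboxIfBoxed(new Number(17));
const unboxedString = unboxIfBoxed(new String("hi"));
const unboxedBoolean = unboxIfBoxed(new Boolean(false));
const unboxedBigInt = unboxIfBoxed(Object(1n));
const record = { a: 1 };
const untouched = unboxIfBoxed(record);
```

Every object that is not a boxed primitive falls through all four `try` blocks, which is the per-object overhead the footnote calls unacceptable.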