# schemind
Read and write to messages serialized as arrays (aka indexed keys messages) by defining a schema. Protocol‑agnostic.
## What?
In formats like JSON, a message normally looks something like this:
```json
{
  "id": 1,
  "fullName": "John Doe",
  "email": "johndoe@example.com",
  "birthDate": "1973-01-22",
  "address": {
    "street": "123 Main Street",
    "city": "Anytown",
    "zipcode": "12345-6789",
    "geo": {
      "lat": 42.1234,
      "lng": -71.2345
    }
  },
  "website": "www.johndoe.com"
}
```
*I'm using JSON as an example here, but schemind is essentially protocol-agnostic. I use it with MessagePack.*
If you desperately need to make this message more compact, you could alternatively serialize it like so:
```json
[
  1,
  "John Doe",
  "johndoe@example.com",
  "1973-01-22",
  [
    "123 Main Street",
    "Anytown",
    "12345-6789",
    [
      42.1234,
      -71.2345
    ]
  ],
  "www.johndoe.com"
]
```
This is sometimes referred to as a message with *indexed keys*.
**Schemind** helps you create and read such messages, if your (de)serializer doesn't support this technique.
*Note that this format obviously has some drawbacks: [recommended reading about the pros and cons](https://github.com/MessagePack-CSharp/MessagePack-CSharp#use-indexed-keys-instead-of-string-keys-contractless).*
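To put a rough number on the difference, the sketch below serializes the same record in both layouts with plain `JSON.stringify` and compares the lengths. It does not use schemind at all, and the variable names are only illustrative. Actual savings depend on your key names and on the protocol you use.

```typescript
// The same record in both layouts, with values taken from the example above.
const keyed = {
  id: 1,
  fullName: "John Doe",
  email: "johndoe@example.com",
  birthDate: "1973-01-22",
  address: {
    street: "123 Main Street",
    city: "Anytown",
    zipcode: "12345-6789",
    geo: { lat: 42.1234, lng: -71.2345 }
  },
  website: "www.johndoe.com"
};

const indexed = [
  1,
  "John Doe",
  "johndoe@example.com",
  "1973-01-22",
  ["123 Main Street", "Anytown", "12345-6789", [42.1234, -71.2345]],
  "www.johndoe.com"
];

// Everything here is ASCII, so string length equals the size in bytes.
const keyedSize = JSON.stringify(keyed).length;
const indexedSize = JSON.stringify(indexed).length;

console.log({ keyedSize, indexedSize, saved: keyedSize - indexedSize });
```

The saving comes entirely from dropping the key names, so records with long or deeply nested keys benefit the most.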
## Installation
```shell
npm install schemind
```
## Usage
### Defining a schema
```typescript
import { buildSchema, withIndex as i } from "schemind";
const personSchema = buildSchema({
  id: i(0)<number>(),
  fullName: i(1)<string>(),
  email: i(2)<string>(),
  birthDate: i(3)<Date>(),
  address: i(4)({
    street: i(0)<string>(),
    city: i(1)<string>(),
    zipcode: i(2)<string>(),
    geo: i(3)({
      lat: i(0)<number>(),
      lng: i(1)<number>()
    })
  }),
  website: i(5)<string>()
});
```
Every field needs to have its index in the message specified using `withIndex`.
Note that this also goes for nested objects, such as `address`.
* If you accidentally pass the same index twice, or you forget to call `withIndex` on any nested object, `buildSchema` will throw an `InvalidSchemaError`.
* If you forget to call `buildSchema` on your object, you'll get a type error when trying to use your schema.
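To illustrate what "the same index twice" means, here is a minimal sketch of a duplicate-index check. It is not schemind's implementation: the `SketchField` type and the `findDuplicateIndexes` function are hypothetical and exist only for this example. The point it shows is that indexes are scoped to their parent object, so `0` can be reused inside `address` but not twice among siblings.

```typescript
// Hypothetical shape for illustration only; schemind's real schema types differ.
type SketchField = { index: number; children?: Record<string, SketchField> };

// Returns the dotted paths of fields whose index repeats an earlier sibling's index.
function findDuplicateIndexes(
  fields: Record<string, SketchField>,
  prefix = ""
): string[] {
  const seen = new Set<number>();
  const duplicates: string[] = [];
  for (const [fieldName, field] of Object.entries(fields)) {
    const path = prefix + fieldName;
    if (seen.has(field.index)) {
      duplicates.push(path);
    }
    seen.add(field.index);
    // Each nested object maps to its own array, so it gets a fresh check.
    if (field.children) {
      duplicates.push(...findDuplicateIndexes(field.children, path + "."));
    }
  }
  return duplicates;
}

const sketchSchema: Record<string, SketchField> = {
  id: { index: 0 },
  fullName: { index: 1 },
  email: { index: 1 }, // clashes with fullName
  address: {
    index: 2,
    children: {
      street: { index: 0 }, // fine: a different array than the one holding id
      city: { index: 0 }    // clashes with street
    }
  }
};

const duplicatePaths = findDuplicateIndexes(sketchSchema);
console.log(duplicatePaths);
```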
### Reading from a message
Say you have an incoming message (from network/storage/whatever) like this:
```typescript
const incomingMessage = JSON.parse(`
[
  1,
  "John Doe",
  "johndoe@example.com",
  "1973-01-22",
  [
    "123 Main Street",
    "Anytown",
    "12345-6789",
    [
      42.1234,
      -71.2345
    ]
  ],
  "www.johndoe.com"
]`);
```
There are 2 ways to read this message:
#### • `toPlainObject`
This is the more convenient option.
```typescript
import { toPlainObject } from "schemind";
const messageAsObject = toPlainObject(incomingMessage, personSchema);
// ^ this has the following type:
// {
//   id: number,
//   fullName: string,
//   email: string,
//   birthDate: Date,
//   address: {
//     street: string,
//     city: string,
//     zipcode: string,
//     geo: {
//       lat: number,
//       lng: number
//     }
//   },
//   website: string
// }
```
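One thing worth checking in your own setup: the schema types `birthDate` as `Date`, but JSON has no date type, so `JSON.parse` yields a string at that position. This README doesn't say whether schemind converts values at runtime. Assuming it does not, a sketch like the following revives the date in the raw array before the message is read. The names `rawMessage` and `BIRTH_DATE_INDEX` are illustrative only.

```typescript
// A shortened message: only the first four fields of the example.
const rawMessage: unknown[] = JSON.parse(
  '[1, "John Doe", "johndoe@example.com", "1973-01-22"]'
);

// Matches `birthDate: i(3)<Date>()` in the schema.
const BIRTH_DATE_INDEX = 3;

// JSON.parse leaves the date as a plain string.
console.log(typeof rawMessage[BIRTH_DATE_INDEX]);

// Replace the string with a Date in place, so later reads see a real Date.
const rawBirthDate = rawMessage[BIRTH_DATE_INDEX];
if (typeof rawBirthDate === "string") {
  rawMessage[BIRTH_DATE_INDEX] = new Date(rawBirthDate);
}
```

Deserializers that have a native timestamp type may already hand you a `Date`, in which case no such step is needed.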
#### • `get`
This is the more performant option – it doesn't allocate on the heap.
```typescript
import { get } from "schemind";
const fullName = get(incomingMessage, personSchema.fullName);
//    ^ this is of type string
const latitude = get(incomingMessage, personSchema.address.geo.lat);
//    ^ this is of type number
```
Alternatively, you can call the `get` method on the schema field itself. It works in exactly the same way.
```typescript
const fullName = personSchema.fullName.get(incomingMessage);
const latitude = personSchema.address.geo.lat.get(incomingMessage);
```
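Conceptually, reading a field comes down to following a fixed list of array indexes into the message. The sketch below shows that idea with a hypothetical `readAt` helper. It is not schemind's implementation and it has none of the library's type inference, but it shows why a read like this needs no intermediate objects.

```typescript
// Hypothetical helper for illustration: follow a list of array indexes into a nested message.
function readAt(message: unknown, path: readonly number[]): unknown {
  let current: unknown = message;
  // A plain indexed loop: no iterator, closure, or temporary object is created.
  for (let i = 0; i < path.length; i++) {
    if (!Array.isArray(current)) {
      return undefined; // the message is shallower than the path expects
    }
    current = current[path[i]];
  }
  return current;
}

const incoming: unknown = JSON.parse(
  '[1,"John Doe","johndoe@example.com","1973-01-22",' +
    '["123 Main Street","Anytown","12345-6789",[42.1234,-71.2345]],"www.johndoe.com"]'
);

// Index paths matching the schema from "Defining a schema", created once and reused.
const FULL_NAME_PATH: readonly number[] = [1];      // fullName
const LATITUDE_PATH: readonly number[] = [4, 3, 0]; // address -> geo -> lat

const nameValue = readAt(incoming, FULL_NAME_PATH);
const latValue = readAt(incoming, LATITUDE_PATH);
console.log(nameValue, latValue);
```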
### Writing
There are 2 ways to write a message.
#### • `toIndexedKeysMessage`
```typescript
import { toIndexedKeysMessage } from "schemind";
const objectToSerialize = {
  id: 1,
  fullName: "John Doe",
  email: "johndoe@example.com",
  birthDate: new Date(),
  address: {
    street: "123 Main Street",
    city: "Anytown",
    zipcode: "12345-6789",
    geo: {
      lat: 42.1234,
      lng: -71.2345
    }
  },
  website: "www.johndoe.com"
};
const message = toIndexedKeysMessage(objectToSerialize, personSchema);
//    ^ this is an array with the same layout as "incomingMessage" in the previous section
// JSON.stringify(message) or whatever
```
#### • `set`
```typescript
import { set } from "schemind";
const newMessage: unknown[] = [];
set(newMessage, personSchema.fullName, "John Doe");
set(newMessage, personSchema.address.geo.lat, 42.1234);
// ^ this is type-checked
// etc
```
Alternatively, you can call the `set` method on the schema field itself. It works in exactly the same way.
```typescript
personSchema.fullName.set(newMessage, "John Doe");
personSchema.address.geo.lat.set(newMessage, 42.1234);
```
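Writing one field at a time into an empty array means nested arrays have to come from somewhere, and any positions you skip stay empty. The sketch below uses a hypothetical `writeAt` helper to show both effects. It is not schemind's implementation, and whether schemind creates nested arrays the same way is an assumption here.

```typescript
// Hypothetical helper for illustration: write a value at a nested index path,
// creating intermediate arrays where they are missing.
function writeAt(message: unknown[], path: readonly number[], value: unknown): void {
  let current: unknown[] = message;
  path.forEach((index, position) => {
    if (position === path.length - 1) {
      current[index] = value; // last step: store the value itself
      return;
    }
    const existing = current[index];
    if (Array.isArray(existing)) {
      current = existing;
    } else {
      const created: unknown[] = [];
      current[index] = created;
      current = created;
    }
  });
}

const outgoing: unknown[] = [];
writeAt(outgoing, [1], "John Doe");    // fullName
writeAt(outgoing, [4, 3, 0], 42.1234); // address -> geo -> lat

// Positions that were never written are holes, which JSON.stringify emits as null.
const outgoingJson = JSON.stringify(outgoing);
console.log(outgoingJson);
// → [null,"John Doe",null,null,[null,null,null,[42.1234]]]
```

If the receiving side expects every field to be present, make sure all of them are set before serializing.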
## FAQ
### Shouldn't this be an extension of a serializer?
Probably.
### Wouldn't it be better to use protobuf at this point?
Possibly. But if you're already using JSON / MessagePack / CBOR etc. in your app, and you need more compact messages for some features — *schemind* could be useful.
Additionally, in some languages (backend or frontend) there's a MessagePack or JSON implementation that's faster, or allocates less memory, than protobuf.
### Why would I use `get` if it's inconvenient?
The `get` function prioritizes performance over convenience. The main goal here is to avoid any heap allocations (beyond what your deserializer allocates). I use *schemind* in performance-critical scenarios, where avoiding GC pauses is crucial.
If you don't mind some extra allocations, use the `toPlainObject` function instead.
## Related work
* [MessagePack-CSharp (.NET)](https://github.com/MessagePack-CSharp/MessagePack-CSharp#use-indexed-keys-instead-of-string-keys-contractless)
* [Nerdbank.MessagePack (.NET)](https://aarnott.github.io/Nerdbank.MessagePack/docs/customizing-serialization.html?q=indexed#serialize-objects-with-indexes-for-keys)
* [Idein/msgpack-schema (Rust)](https://github.com/Idein/msgpack-schema)
* [serde (Rust)](https://github.com/serde-rs/serde/issues/959)