UNPKG

datocms-contentful-to-structured-text

Version:

Convert Contentful Rich Text to a valid DatoCMS Structured Text `dast` document

338 lines (265 loc) 9.56 kB
# `datocms-contentful-to-structured-text` This package contains utilities to convert Contentful Rich Text to a DatoCMS Structured Text `dast` (DatoCMS Abstract Syntax Tree) document. Please refer to [the `dast` format docs](https://www.datocms.com/docs/structured-text/dast) to learn more about the syntax tree format and the available nodes. ## Usage The main utility in this package is `richTextToStructuredText` which takes a Rich Text JSON and transforms it into a valid `dast` document. `richTextToStructuredText` returns a `Promise` that resolves with a Structured Text document. ```js import { richTextToStructuredText } from 'datocms-contentful-to-structured-text'; const richText = { nodeType: 'document', data: {}, content: [ { nodeType: 'heading-1', content: [ { nodeType: 'text', value: 'Lorem ipsum dolor sit amet', marks: [], data: {}, }, ], data: {}, }, }; richTextToStructuredText(richText).then((structuredText) => { console.log(structuredText); }); ``` ## Validate `dast` documents `dast` is a strict format for DatoCMS' Structured Text fields and follows a different pattern from Contentful Rich Text structure. The `datocms-structured-text-utils` package provides a `validate` utility to validate a Structured Text content to make sure that it is compatible with DatoCMS' Structured Text field. ```js import { validate } from 'datocms-structured-text-utils'; // ... richTextToStructuredText(richText).then((structuredText) => { const { valid, message } = validate(structuredText); if (!valid) { throw new Error(message); } }); ``` We recommend to validate every `dast` to avoid errors later when creating records. ## Advanced Usage ### Options All the `*ToStructuredText` utils accept an optional `options` object as second argument: ```js type Options = Partial<{ // Override existing Contentful node handlers or add new ones. handlers: Record<string, CreateNodeFunction>, // Array of allowed Block nodes. allowedBlocks: Array< BlockquoteType | CodeType | HeadingType | LinkType | ListType, >, // Array of allowed marks. allowedMarks: Mark[], }>; ``` ### Transforming Nodes The utils in this library traverse a `Contentful Rich Text` tree and transform supported nodes to `dast` nodes. The transformation is done by working on a `Contentful Rich Text` node with a handler (async) function. Handlers are associated to `Contentful Rich Text` nodes by `nodeType` and look as follow: ```js import { visitChildren } from 'datocms-contentful-to-structured-text'; // Handler for the paragraph node type. async function p(createDastNode, contentfulNode, context) { return createDastNode('paragraph', { children: await visitChildren(createDastNode, contentfulNode, context), }); } ``` Handlers can return either a promise that resolves to a `dast` node, an array of `dast` Nodes, or `undefined` to skip the current node. To ensure that a valid `dast` is generated, the default handlers also check that the current `contentfulNode` is a valid `dast` node for its parent and, if not, they ignore the current node and continue visiting its children. Information about the parent `dast` node name is available in `context.parentNodeType`. Please take a look at the [default handlers implementation](./handlers.ts) for examples. The default handlers are available on `context.defaultHandlers`. ### Context Every handler receives a `context` object containing the following information: ```js export interface Context { // The current parent `dast` node type. parentNodeType: NodeType; // The parent `Contentful Rich Text` node. parentNode: ContentfulNode; // A reference to the current handlers - merged default + user handlers. handlers: Record<string, Handler<unknown>>; // A reference to the default handlers record (map). defaultHandlers: Record<string, Handler<unknown>>; // Marks for span nodes. marks?: Mark[]; // Array of allowed Block types. allowedBlocks: Array< BlockquoteType | CodeType | HeadingType | LinkType | ListType, >; // Array of allowed marks. allowedMarks: Mark[]; } ``` ### Custom Handlers It is possible to register custom handlers and override the default behaviour via options, using the `makeHandler` function. For example, to create a custom handler for the Contentful `text` element, specify a guard clause to specify the correct type ```ts import { makeHandler } from 'datocms-contentful-to-structured-text'; const customTextHandler = makeHandler( (node): node is Text => n.nodeType === "text", async (node) => { // This custom handler will generate two spans for every text node, instead of one. return [ { type: 'span', value: node.value }, { type: 'span', value: node.value }, ]; }), richTextToStructuredText(richText, { handlers: [ customTextHandler, ], }).then((structuredText) => { console.log(structuredText); }); ``` ```js import { paragraphHandler } from './customHandlers'; richTextToStructuredText(richText, { handlers: { paragraph: paragraphHandler, }, }).then((structuredText) => { console.log(structuredText); }); ``` It is **highly encouraged** to validate the `dast` when using custom handlers because handlers are responsible for dictating valid parent-children relationships and therefore generating a tree that is compliant with DatoCMS Structured Text. ## Preprocessing Rich Text Because of the strictness of the `dast` spec, it is possible that some elements might be lost during transformation. To improve the final result, you might want to modify the Rich Text tree before it is transformed to `dast`. ### Examples <details> <summary>Split a node that contains an image.</summary> In `dast`, images can only be presented as `Block` nodes, but blocks are not allowed inside of `ListItem` nodes (unordered-list/ordered-list). In this example we will split the original `unordered-list` in one list, the lifted up image block and another list. ```js import { liftAssets } from 'datocms-contentful-to-structured-text'; const richTextWithAssets = { nodeType: 'document', data: {}, content: [ { nodeType: 'unordered-list', content: [ { nodeType: 'list-item', content: [ { nodeType: 'paragraph', content: [ { nodeType: 'text', value: 'text', marks: [], data: {}, }, ], data: {}, }, { content: [], data: { target: { sys: { id: 'zzz', linkType: 'Asset', type: 'Link', }, }, }, nodeType: 'embedded-asset-block', }, { nodeType: 'paragraph', content: [ { nodeType: 'text', value: 'text', marks: [], data: {}, }, ], data: {}, }, ], data: {}, }, ], data: {}, }, ], }; liftAssets(richTextWithAssets); const handlers = { 'embedded-asset-block': async (createNode, node, context) => { const item = '123'; return createNode('block', { item, }); }, }; const dast = await richTextToStructuredText(richTextWithAssets, { handlers }); ``` The liftAssets function transforms the richText tree and moves the embedded-asset-block to root,splitting the list in two parts. ```js function liftAssets(richText) { const visit = (node, cb, index = 0, parents = []) => { if (node.content && node.content.length > 0) { node.content.forEach((child, index) => { visit(child, cb, index, [...parents, node]); }); } cb(node, index, parents); }; const liftedImages = new WeakSet(); visit(richText, (node, index, parents) => { if ( !node || node.nodeType !== 'embedded-asset-block' || liftedImages.has(node) || parents.length === 1 // is a top level asset ) { return; } const imgParent = parents[parents.length - 1]; imgParent.content.splice(index, 1); let i = parents.length; let splitChildrenIndex = index; const contentAfterSplitPoint = []; while (--i > 0) { const parent = parents[i]; const parentsParent = parents[i - 1]; contentAfterSplitPoint = parent.content.splice(splitChildrenIndex); splitChildrenIndex = parentsParent.content.indexOf(parent); let nodeInserted = false; if (i === 1) { splitChildrenIndex += 1; parentsParent.content.splice(splitChildrenIndex, 0, node); liftedImages.add(node); nodeInserted = true; } splitChildrenIndex += 1; if (contentAfterSplitPoint.length > 0) { parentsParent.content.splice(splitChildrenIndex, 0, { ...parent, content: contentAfterSplitPoint, }); } // Remove the parent if empty if (parent.content.length === 0) { splitChildrenIndex -= 1; parentsParent.content.splice( nodeInserted ? splitChildrenIndex - 1 : splitChildrenIndex, 1, ); } } }); } ``` </details> ## License MIT