autolinker

Utility to automatically link the URLs, email addresses, phone numbers, hashtags, and mentions (Twitter, Instagram) in a given block of text/HTML

/**
 * Parses an HTML string, calling the callbacks to notify of tags and text.
 *
 * ## History
 *
 * This file previously used a regular expression to find HTML tags in the
 * input text. Unfortunately, we ran into a number of catastrophic backtracking
 * issues with certain input text, causing Autolinker to either hang or take a
 * very long time to parse the string.
 *
 * The current code is intended to be an O(n) algorithm that walks through
 * the string in one pass, and tries to be as cheap as possible. We don't need
 * to implement the full HTML spec, but rather simply determine where the string
 * looks like an HTML tag, and where it looks like text (so that we can autolink
 * that).
 *
 * This state machine parser is intended to be a simple but performant parser
 * of HTML for the subset of requirements we have. We simply need to:
 *
 * 1. Determine where HTML tags are
 * 2. Determine the tag name (Autolinker specifically only cares about <a>,
 *    <script>, and <style> tags, so as not to link any text within them)
 *
 * We don't need to:
 *
 * 1. Create a parse tree
 * 2. Auto-close tags with invalid markup
 * 3. etc.
 *
 * The other intention behind this is that we didn't want to add external
 * dependencies to the Autolinker utility which would increase its size. For
 * instance, adding htmlparser2 adds 125kb to the minified output file,
 * increasing its final size from 47kb to 172kb (at the time of writing). It
 * also doesn't work exactly correctly, treating the string "<3 blah blah blah"
 * as an HTML tag.
 *
 * Reference for HTML spec:
 *
 *     https://www.w3.org/TR/html51/syntax.html#sec-tokenization
 *
 * @param {String} html The HTML to parse
 * @param {Object} callbacks
 * @param {Function} callbacks.onOpenTag Callback function to call when an open
 *   tag is parsed. Called with the tagName as its first argument, and offset
 *   (number) into the string as its second.
 * @param {Function} callbacks.onCloseTag Callback function to call when a close
 *   tag is parsed. Called with the tagName as its first argument, and offset
 *   (number) into the string as its second. If a self-closing tag is found,
 *   `onCloseTag` is called immediately after `onOpenTag`.
 * @param {Function} callbacks.onText Callback function to call when text (i.e.
 *   not an HTML tag) is parsed. Called with the text (string) as its first
 *   argument, and offset (number) into the string as its second.
 * @param {Function} callbacks.onComment Callback function to call when an HTML
 *   comment is parsed. Called with the offset (number) into the string.
 * @param {Function} callbacks.onDoctype Callback function to call when a
 *   doctype declaration is parsed. Called with the offset (number) into the
 *   string.
 */
export declare function parseHtml(html: string, { onOpenTag, onCloseTag, onText, onComment, onDoctype, }: {
    onOpenTag: (tagName: string, offset: number) => void;
    onCloseTag: (tagName: string, offset: number) => void;
    onText: (text: string, offset: number) => void;
    onComment: (offset: number) => void;
    onDoctype: (offset: number) => void;
}): void;
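
The callback contract described above can be illustrated with a small, self-contained sketch. Note that `parseHtmlSketch`, `HtmlCallbacks`, and `linkableText` are hypothetical names introduced here for illustration only. The scanner below is a deliberately simplified stand-in with the same callback shape as the declaration, not Autolinker's actual state machine: it ignores `>` inside quoted attribute values, does not treat the contents of `<script>` or `<style>` specially while scanning, and makes no attempt to match the real parser's performance characteristics.

```typescript
// Hypothetical callback bundle mirroring the shape in the declaration above.
type HtmlCallbacks = {
  onOpenTag: (tagName: string, offset: number) => void;
  onCloseTag: (tagName: string, offset: number) => void;
  onText: (text: string, offset: number) => void;
  onComment: (offset: number) => void;
  onDoctype: (offset: number) => void;
};

// Simplified stand-in scanner (an assumption, NOT the library's implementation).
function parseHtmlSketch(html: string, cb: HtmlCallbacks): void {
  let i = 0;
  let textStart = 0;
  // Emit any pending text that ends just before `end`.
  const flushText = (end: number): void => {
    if (end > textStart) cb.onText(html.slice(textStart, end), textStart);
  };

  while (i < html.length) {
    if (html.charAt(i) !== "<") { i++; continue; }

    // Comment: <!-- ... -->
    if (html.startsWith("<!--", i)) {
      const end = html.indexOf("-->", i + 4);
      if (end !== -1) {
        flushText(i); cb.onComment(i);
        i = end + 3; textStart = i; continue;
      }
    }
    // Doctype: <!DOCTYPE ...>
    if (html.slice(i, i + 9).toLowerCase() === "<!doctype") {
      const end = html.indexOf(">", i);
      if (end !== -1) {
        flushText(i); cb.onDoctype(i);
        i = end + 1; textStart = i; continue;
      }
    }
    // Open or close tag: "<" or "</" followed by a name starting with a letter.
    let j = i + 1;
    const closing = html.charAt(j) === "/";
    if (closing) j++;
    const nameStart = j;
    if (/[A-Za-z]/.test(html.charAt(j))) {
      while (j < html.length && /[A-Za-z0-9]/.test(html.charAt(j))) j++;
    }
    const name = html.slice(nameStart, j).toLowerCase();
    const gt = html.indexOf(">", j);
    if (name === "" || gt === -1) { i++; continue; } // e.g. "<3" stays text

    flushText(i);
    if (closing) {
      cb.onCloseTag(name, i);
    } else {
      cb.onOpenTag(name, i);
      // Self-closing tag: onCloseTag fires immediately after onOpenTag.
      if (html.charAt(gt - 1) === "/") cb.onCloseTag(name, i);
    }
    i = gt + 1; textStart = i;
  }
  flushText(html.length);
}

// Collects only the text that sits outside <a>, <script>, and <style>.
function linkableText(html: string): string[] {
  const skip = new Set(["a", "script", "style"]);
  const out: string[] = [];
  let depth = 0;
  parseHtmlSketch(html, {
    onOpenTag: (name) => { if (skip.has(name)) depth++; },
    onCloseTag: (name) => { if (skip.has(name)) depth = Math.max(0, depth - 1); },
    onText: (text) => { if (depth === 0) out.push(text); },
    onComment: () => {},
    onDoctype: () => {},
  });
  return out;
}

console.log(linkableText('I <3 x.com <a href="#">y.com</a> and <b>z.com</b><br/>'));
// prints [ 'I <3 x.com ', ' and ', 'z.com' ]
```

A caller that only needs to know whether text is eligible for linking can keep a single depth counter, as `linkableText` does, rather than building a parse tree. This is in keeping with the stated goal of only locating tags and their names.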