domhandler
Version:
Handler for htmlparser2 that turns pages into a dom
87 lines (70 loc) • 2.28 kB
Markdown
The DOM handler creates a tree containing all nodes of a page.
The tree can be manipulated using the [domutils](https://github.com/fb55/domutils)
or [cheerio](https://github.com/cheeriojs/cheerio) libraries and
rendered using [dom-serializer](https://github.com/cheeriojs/dom-serializer) .
```javascript
const handler = new DomHandler([ <func> callback(err, dom), ] [ <obj> options ]);
// const parser = new Parser(handler[, options]);
```
Available options are described below.
```javascript
const { Parser } = require("htmlparser2");
const { DomHandler } = require("domhandler");
const rawHtml =
"Xyz <script language= javascript>var foo = '<<bar>>';</script><!--<!-- Waah! -- -->";
const handler = new DomHandler((error, dom) => {
if (error) {
// Handle error
} else {
// Parsing completed, do something
console.log(dom);
}
});
const parser = new Parser(handler);
parser.write(rawHtml);
parser.end();
```
Output:
```javascript
[
{
data: "Xyz ",
type: "text",
},
{
type: "script",
name: "script",
attribs: {
language: "javascript",
},
children: [
{
data: "var foo = '<bar>';<",
type: "text",
},
],
},
{
data: "<!-- Waah! -- ",
type: "comment",
},
];
```
Add a `startIndex` property to nodes.
When the parser is used in a non-streaming fashion, `startIndex` is an integer
indicating the position of the start of the node in the document.
The default value is `false`.
Add an `endIndex` property to nodes.
When the parser is used in a non-streaming fashion, `endIndex` is an integer
indicating the position of the end of the node in the document.
The default value is `false`.
---
License: BSD-2-Clause
To report a security vulnerability, please use the [Tidelift security contact](https://tidelift.com/security).
Tidelift will coordinate the fix and disclosure.