<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title></title>
<style type="text/css">
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; }
code > span.dt { color: #902000; }
code > span.dv { color: #40a070; }
code > span.bn { color: #40a070; }
code > span.fl { color: #40a070; }
code > span.ch { color: #4070a0; }
code > span.st { color: #4070a0; }
code > span.co { color: #60a0b0; font-style: italic; }
code > span.ot { color: #007020; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #06287e; }
code > span.er { color: #ff0000; font-weight: bold; }
</style>
<!-- /* a stylesheet to include in our *.md-based html. */ -->
<!-- /* please leave the begin and end style tags, they let us include the text of this file "inline" in html documents */-->
<style>
#TOC { font-family: 'droid sans',helvetica,sans serif; font-size: 0.8em; position: fixed; right: 0em; top: 0em; background: #e5e5ee; -webkit-box-shadow: 0 0 1em #777777; -moz-box-shadow: 0 0 1em #777777; -webkit-border-bottom-left-radius: 5px; -moz-border-radius-bottomleft: 5px; text-align: left; max-height: 80%; z-index: 200; width: 7em; white-space:nowrap; overflow:hidden; padding-top: 3em; opacity: 0.9; }
#TOC:before { content:"Contents"; font-weight: bold; text-align:right; align:right; display:block; position:fixed; right: 1.5em; top: 1em; background: #e5e5ee; opacity:0.9; }
#TOC:hover { width: auto; padding-right:2em; max-width:80%; overflow:auto ; opacity:1.0; }
#TOC ul { margin: 0 0 0 1em; padding: 0; }
#TOC li { padding: 0; margin: 1px; list-style: none; overflow:hidden; text-overflow: ellipsis; }
html { font-size: 100%; overflow-y: scroll; -webkit-text-size-adjust: 100%; -ms-text-size-adjust: 100%; }
body{ color:#444; font-family:Georgia, Palatino, 'Palatino Linotype', Times, 'Times New Roman', serif; font-size:12px; line-height:1.5em; padding:1em; margin:auto; max-width:48em; background:#fefefe; }
a { color: #0645ad; text-decoration:none;}
a:visited { color: #0b0080; }
a:hover { color: #06e; }
a:active { color:#faa700; }
a:focus { outline: thin dotted; }
a:hover, a:active { outline: 0; }
::-moz-selection {background:rgba(255,255,0,0.3);color:#000}
::selection {background:rgba(255,255,0,0.3);color:#000}
a::-moz-selection {background:rgba(255,255,0,0.3);color:#0645ad}
a::selection {background:rgba(255,255,0,0.3);color:#0645ad}
p { margin:1em 0; }
p.caption { font-style: italic; text-align: right; }
img { max-width:100%; }
h1,h2,h3,h4,h5,h6 { font-weight:normal; color:#111; line-height:1em; }
h4,h5,h6{ font-weight: bold; }
h1 { font-size:2.5em; }
h2 { font-size:2em; }
h3 { font-size:1.5em; }
h4 { font-size:1.2em; }
h5 { font-size:1em; }
h6 { font-size:0.9em; }
blockquote{ color:#666666; margin:0; padding-left: 3em; border-left: 0.5em #eee solid; }
hr { display: block; height: 2px; border: 0; border-top: 1px solid #aaa;border-bottom: 1px solid #eee; margin: 1em 0; padding: 0; }
pre, code, kbd, samp { font-family: 'droid sans mono slashed', 'droid sans mono', monospace, monospace; }
pre { padding:2px; background:#333; color:#9e9; border:1px solid #444; overflow:hidden; text-overflow: ellipsis;}
pre:hover { overflow:visible; width: auto; }
pre:hover code { background:#333; }
code { padding:2px; background: #f5f5ff; border:1px solid #e5e5ee; font-size:0.9em; }
code.url { padding:2px; border:none; background:none; font-family:Georgia, Palatino, 'Palatino Linotype', Times, 'Times New Roman', serif; }
pre code { border: none; background:#333; }
b, strong { font-weight: bold; }
dfn { font-style: italic; }
ins { background: #ff9; color: #000; text-decoration: none; }
mark { background: #ff0; color: #000; font-style: italic; font-weight: bold; }
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }
ul, ol { margin: 1em 0; padding: 0 0 0 2em; }
li p:last-child { margin:0 }
dd { margin: 0 0 0 2em; }
img { border: 0; -ms-interpolation-mode: bicubic; vertical-align: middle; }
table { border-collapse: collapse; border-spacing: 0; }
td { vertical-align: top; }
/* TODO: this could use a better color scheme */
code > span.kw { color: #dd7522; font-weight: bold; }
code > span.dt { color: #dd7522; }
code > span.dv { color: #669933; }
code > span.bn { color: #eddd3d; }
code > span.fl { color: #eddd3d; }
code > span.ch { color: #eddd3d; }
code > span.st { color: #669933; }
code > span.co { color: grey; font-style: italic; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #dd7522; }
code > span.ot { color: #007020; }
code > span.er { color: #ff0000; font-weight: bold; }
@media only screen and (min-width: 480px) { body{font-size:14px;} }
@media only screen and (min-width: 768px) { body{font-size:16px;} }
@media print {
#TOC { display:none; }
* { background: transparent ; color: black ; filter:none ; -ms-filter: none ; }
body{font-size:12pt; max-width:100%;}
a, a:visited { text-decoration: none; }
hr { height: 1px; border:0; border-bottom:1px solid black; }
a[href]:after { content: " (" attr(href) ")"; }
abbr[title]:after { content: " (" attr(title) ")"; }
.ir a:after, a[href^="javascript:"]:after, a[href^="#"]:after { content: ""; }
pre, blockquote { border: 1px solid #999; padding-right: 1em; page-break-inside: avoid; }
pre { font-size: 0.8em; }
tr, img { page-break-inside: avoid; }
img { max-width: 100% ; }
@page :left { margin: 15mm 20mm 15mm 10mm; }
@page :right { margin: 15mm 10mm 15mm 20mm; }
p, h2, h3 { orphans: 3; widows: 3; }
h2, h3 { page-break-after: avoid; }
}
</style>
</head>
<body>
<div id="TOC">
<ul>
<li><a href="#the-stew-api">The Stew API</a><ul>
<li><a href="#installing">Installing</a></li>
<li><a href="#importing">Importing</a></li>
<li><a href="#api">API</a><ul>
<li><a href="#stew">Stew</a><ul>
<li><a href="#stew.selectdomselector">stew.select(dom,selector)</a></li>
<li><a href="#stew.selecthtmlselectorcallback">stew.select(html,selector,callback)</a></li>
<li><a href="#stew.select_firstdomselector">stew.select_first(dom,selector)</a></li>
<li><a href="#stew.select_firsthtmlselectorcallback">stew.select_first(html,selector,callback)</a></li>
</ul></li>
<li><a href="#domutil">DOMUtil</a><ul>
<li><a href="#domutil.parse_htmlhtmlcallback">domutil.parse_html(html,callback)</a></li>
<li><a href="#domutil.to_textnode">domutil.to_text(node)</a></li>
<li><a href="#domutil.to_textnodeaccept">domutil.to_text(node,accept)</a></li>
<li><a href="#domutil.to_htmlnode">domutil.to_html(node)</a></li>
<li><a href="#domutil.inner_htmlnode">domutil.inner_html(node)</a></li>
</ul></li>
</ul></li>
</ul></li>
</ul>
</div>
<h1 id="the-stew-api"><a href="#TOC">The Stew API</a></h1>
<p><em>(<a href="../README.html">Follow this link to go back to the README file.</a>)</em></p>
<p><a href="https://github.com/rodw/stew">Stew</a> is a JavaScript library that extends <a href="http://www.w3.org/TR/CSS2/selector.html">CSS selectors</a> with regular expressions.</p>
<p>It is primarily intended to be used in a <a href="http://nodejs.org/">Node.js</a> environment.<sup><a href="#fn1" class="footnoteRef" id="fnref1">1</a></sup></p>
<h2 id="installing"><a href="#TOC">Installing</a></h2>
<p>Stew is deployed as an <a href="https://npmjs.org/">npm module</a> under the name <a href="https://npmjs.org/package/stew-select"><code>stew-select</code></a>. Hence you can install a pre-packaged version with the command:</p>
<pre class="console"><code>npm install stew-select</code></pre>
<p>and you can add it to your project as a dependency by adding a line like:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="st">"stew-select"</span>: <span class="st">"latest"</span></code></pre>
<p>to the <code>dependencies</code> or <code>devDependencies</code> part of your <code>package.json</code> file.</p>
<h2 id="importing"><a href="#TOC">Importing</a></h2>
<p>Stew can be loaded into a Node.js program as follows:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> Stew = require(<span class="ch">'stew-select'</span>).<span class="fu">Stew</span>;</code></pre>
<p>The Stew type is an instantiable class, hence (if you're content with the default configuration) you might prefer this alternative:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> stew = <span class="kw">new</span> (require(<span class="ch">'stew-select'</span>)).<span class="fu">Stew</span>();</code></pre>
<p>Stew also exposes a class named DOMUtil, which can be loaded like this:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> DOMUtil = require(<span class="ch">'stew-select'</span>).<span class="fu">DOMUtil</span>;</code></pre>
<p>or like this:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> domutil = <span class="kw">new</span> (require(<span class="ch">'stew-select'</span>)).<span class="fu">DOMUtil</span>();</code></pre>
<h2 id="api"><a href="#TOC">API</a></h2>
<h3 id="stew"><a href="#TOC">Stew</a></h3>
<h4 id="stew.selectdomselector"><a href="#TOC">stew.select(dom,selector)</a></h4>
<p>This variation of <code>select</code> accepts a DOM object (generated by <a href="https://github.com/tautologistics/node-htmlparser">node-htmlparser</a>) and a string containing a CSS selector, and returns an array of the DOM nodes that match the selector.</p>
<p>For example:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> author_links = <span class="kw">stew</span>.<span class="fu">select</span>(dom, <span class="ch">'a[href][rel="author"]'</span>);</code></pre>
<h4 id="stew.selecthtmlselectorcallback"><a href="#TOC">stew.select(html,selector,callback)</a></h4>
<p>This variation of <code>select</code> accepts a string containing HTML, a string containing a CSS selector and a callback method (with the signature <code>callback(err,nodeset)</code>) and passes an array of matching DOM nodes to the callback.</p>
<p>The HTML is parsed using <a href="https://github.com/tautologistics/node-htmlparser">node-htmlparser</a>, if available.</p>
<p>If an error occurs during parsing, it will be passed as the first argument to the callback.</p>
<p>For example:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">stew</span>.<span class="fu">select</span>(html, <span class="ch">'a[href][rel="author"]'</span>, <span class="kw">function</span>(err,nodeset) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(nodeset);
}
});</code></pre>
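<p>If you prefer promises to error-first callbacks, the callback form can be wrapped. The following is a minimal sketch, not part of Stew: <code>selectAsync</code> is a hypothetical helper name, and <code>fakeStew</code> is a stand-in object that only mimics the <code>select(html,selector,callback)</code> signature so that the snippet runs without the library installed.</p>

```javascript
// Wrap an error-first `select(html, selector, callback)` call in a Promise.
// `selectAsync` is a hypothetical helper, not part of Stew's API.
// `stewLike` is any object exposing a method with that signature.
function selectAsync(stewLike, html, selector) {
  return new Promise(function (resolve, reject) {
    stewLike.select(html, selector, function (err, nodeset) {
      if (err) {
        reject(err);
      } else {
        resolve(nodeset);
      }
    });
  });
}

// Stand-in used only to exercise the helper (an assumption, not real Stew).
// It reports an error for non-string input and one fake node otherwise.
var fakeStew = {
  select: function (html, selector, callback) {
    if (typeof html !== 'string') {
      callback(new Error('expected a string of HTML'));
    } else {
      callback(null, [{ type: 'tag', name: 'a' }]);
    }
  }
};

selectAsync(fakeStew, '<a rel="author" href="/me">me</a>', 'a[href][rel="author"]')
  .then(function (nodeset) { console.log(nodeset.length); })
  .catch(function (err) { console.error(err); });
```

<p>With a real <code>stew</code> instance in place of <code>fakeStew</code>, the same wrapper should apply unchanged, since it relies only on the callback signature described above.</p>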
<h4 id="stew.select_firstdomselector"><a href="#TOC">stew.select_first(dom,selector)</a></h4>
<p>This variation of <code>select_first</code> accepts a DOM object (generated by <a href="https://github.com/tautologistics/node-htmlparser">node-htmlparser</a>) and a string containing a CSS selector, and returns the <em>first</em> DOM node that matches the selector.</p>
<p>For example:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> title_tag = <span class="kw">stew</span>.<span class="fu">select</span>_<span class="fu">first</span>(dom, <span class="ch">'head title'</span>);</code></pre>
<h4 id="stew.select_firsthtmlselectorcallback"><a href="#TOC">stew.select_first(html,selector,callback)</a></h4>
<p>This variation of <code>select_first</code> accepts a string containing HTML, a string containing a CSS selector and a callback method (with the signature <code>callback(err,node)</code>) and passes the <em>first</em> matching DOM node to the callback.</p>
<p>The HTML is parsed using <a href="https://github.com/tautologistics/node-htmlparser">node-htmlparser</a>, if available.</p>
<p>If an error occurs during parsing, it will be passed as the first argument to the callback.</p>
<p>For example:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">stew</span>.<span class="fu">select</span>_<span class="fu">first</span>(html, <span class="ch">'html title'</span>, <span class="kw">function</span>(err,title_tag) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">domutil</span>.<span class="fu">to</span>_<span class="fu">text</span>(title_tag));
}
});</code></pre>
<h3 id="domutil"><a href="#TOC">DOMUtil</a></h3>
<h4 id="domutil.parse_htmlhtmlcallback"><a href="#TOC">domutil.parse_html(html,callback)</a></h4>
<p><code>parse_html</code> accepts a string of HTML and a callback method (with the signature <code>callback(err,node)</code>). The HTML is parsed and the corresponding DOM node will be passed to the callback function.</p>
<p>If <code>html</code> contains more than one "root" node, an array of DOM nodes will be passed to the callback function.</p>
<p>The HTML is parsed using <a href="https://github.com/tautologistics/node-htmlparser">node-htmlparser</a>, if available.</p>
<p>If an error occurs during parsing, it will be passed as the first argument to the callback.</p>
<p>For example, the JavaScript snippet:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;div&gt;First doc&lt;/div&gt; &lt;span&gt;&lt;i&gt;Second&lt;/i&gt; doc&lt;/span&gt;'</span>;
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">dom</span>.<span class="fu">length</span>);
}
});</code></pre>
<p>will output <code>2</code>, and</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;body&gt;Only doc&lt;/body&gt;'</span>;
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">dom</span>.<span class="fu">name</span>);
}
});</code></pre>
<p>will output <code>body</code>.</p>
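<p>Because the callback may receive either a single node or an array of nodes, calling code sometimes wants to normalize the result first. A minimal sketch follows; <code>toNodeList</code> is a hypothetical helper, not part of DOMUtil, and the objects below are hand-built stand-ins for parsed nodes.</p>

```javascript
// Normalize a parse result to an array of nodes.
// `toNodeList` is a hypothetical helper, not part of Stew or DOMUtil.
function toNodeList(dom) {
  if (dom === null || dom === undefined) {
    return [];
  }
  return Array.isArray(dom) ? dom : [dom];
}

// Hand-built stand-ins for what a parser might pass to the callback.
var single = { type: 'tag', name: 'body', children: [] };
var several = [
  { type: 'tag', name: 'div', children: [] },
  { type: 'tag', name: 'span', children: [] }
];

console.log(toNodeList(single).length);  // 1
console.log(toNodeList(several).length); // 2
```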
<h4 id="domutil.to_textnode"><a href="#TOC">domutil.to_text(node)</a></h4>
<p><code>to_text</code> accepts a DOM node and returns a string containing the text content of the node and its descendants.</p>
<p>For example, the JavaScript snippet:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;span&gt;This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.&lt;/span&gt;'</span>;
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">domutil</span>.<span class="fu">to</span>_<span class="fu">text</span>(dom));
}
});</code></pre>
<p>will print:</p>
<pre><code>This example has bold and italic text.</code></pre>
<h4 id="domutil.to_textnodeaccept"><a href="#TOC">domutil.to_text(node,accept)</a></h4>
<p>This variant of <code>to_text</code> accepts a DOM node and a boolean-valued filter (with the signature <code>accept(node)</code>) and returns a string containing the text content of those nodes, among the given node and its descendants, for which <code>accept(node)</code> returns <code>true</code>.</p>
<p>For example, the JavaScript snippet:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;span&gt;This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.&lt;/span&gt;'</span>;
<span class="kw">var</span> not_italic = <span class="kw">function</span>(node) {
<span class="kw">return</span> <span class="kw">node</span>.<span class="fu">type</span> != <span class="ch">'tag'</span> || <span class="kw">node</span>.<span class="fu">name</span> != <span class="ch">'i'</span>;
};
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">domutil</span>.<span class="fu">to</span>_<span class="fu">text</span>(dom,not_italic));
}
});</code></pre>
<p>will print:</p>
<pre><code>This example has bold and text.</code></pre>
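<p>To make the filtering behaviour concrete, here is a rough sketch of how a filtered text walk <em>could</em> work. It is an illustration only, not DOMUtil's actual code: <code>toText</code> is a hypothetical function, and the hand-built tree assumes htmlparser-style nodes in which text nodes carry a <code>data</code> string and tags carry a <code>children</code> array.</p>

```javascript
// Hypothetical sketch of a filtered text walk; not DOMUtil's implementation.
// Assumes text nodes look like {type:'text', data:'...'} and tag nodes
// look like {type:'tag', name:'...', children:[...]}.
function toText(node, accept) {
  if (Array.isArray(node)) {
    return node.map(function (child) {
      return toText(child, accept);
    }).join('');
  }
  if (accept && !accept(node)) {
    return ''; // a rejected node contributes nothing, including its subtree
  }
  if (node.type === 'text') {
    return node.data;
  }
  return toText(node.children || [], accept);
}

// Hand-built equivalent of:
// <span>This example has <b>bold</b> and <i>italic</i> text.</span>
var dom = {
  type: 'tag', name: 'span', children: [
    { type: 'text', data: 'This example has ' },
    { type: 'tag', name: 'b', children: [{ type: 'text', data: 'bold' }] },
    { type: 'text', data: ' and ' },
    { type: 'tag', name: 'i', children: [{ type: 'text', data: 'italic' }] },
    { type: 'text', data: ' text.' }
  ]
};

var notItalic = function (node) {
  return node.type !== 'tag' || node.name !== 'i';
};

console.log(toText(dom));
console.log(toText(dom, notItalic));
```

<p>In this sketch, rejecting the <code>i</code> element drops its whole subtree, while the whitespace on either side of it is kept as-is.</p>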
<h4 id="domutil.to_htmlnode"><a href="#TOC">domutil.to_html(node)</a></h4>
<p><code>to_html</code> accepts a DOM node and returns a string containing an HTML representation of the node and its children.</p>
<p>For example, the JavaScript snippet:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;span&gt;This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.&lt;/span&gt;'</span>;
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">domutil</span>.<span class="fu">to</span>_<span class="fu">html</span>(dom));
}
});</code></pre>
<p>will print:</p>
<pre><code>&lt;span&gt;This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.&lt;/span&gt;</code></pre>
<h4 id="domutil.inner_htmlnode"><a href="#TOC">domutil.inner_html(node)</a></h4>
<p><code>inner_html</code> accepts a DOM node and returns a string containing an HTML representation of the node's children, without the node's own enclosing tag.</p>
<p>For example, the JavaScript snippet:</p>
<pre class="sourceCode javascript"><code class="sourceCode javascript"><span class="kw">var</span> html = <span class="ch">'&lt;span&gt;This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.&lt;/span&gt;'</span>;
<span class="kw">domutil</span>.<span class="fu">parse</span>_<span class="fu">html</span>( html, <span class="kw">function</span>(err,dom) {
<span class="kw">if</span>(err) {
<span class="kw">console</span>.<span class="fu">error</span>(err);
} <span class="kw">else</span> {
<span class="kw">console</span>.<span class="fu">log</span>(<span class="kw">domutil</span>.<span class="fu">inner</span>_<span class="fu">html</span>(dom));
}
});</code></pre>
<p>will print:</p>
<pre><code>This example has &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italic&lt;/i&gt; text.</code></pre>
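<p>The difference between <code>to_html</code> and <code>inner_html</code> can be illustrated with a small stand-alone serializer. This is a simplified sketch under stated assumptions, not DOMUtil's code: <code>toHtml</code> and <code>innerHtml</code> are hypothetical names, nodes are assumed to be htmlparser-style objects (with attributes in an <code>attribs</code> map), and attribute escaping and void elements are deliberately ignored.</p>

```javascript
// Hypothetical serializer for illustration; not DOMUtil's implementation.
// Assumes htmlparser-style nodes: {type:'text', data} and
// {type:'tag', name, attribs, children}.
function toHtml(node) {
  if (Array.isArray(node)) {
    return node.map(function (child) { return toHtml(child); }).join('');
  }
  if (node.type === 'text') {
    return node.data;
  }
  if (node.type !== 'tag') {
    return ''; // comments, directives, etc. are skipped in this sketch
  }
  var attribs = node.attribs || {};
  var attrs = Object.keys(attribs).map(function (key) {
    return ' ' + key + '="' + attribs[key] + '"'; // no escaping (simplification)
  }).join('');
  // The node's own tag wraps the serialized children.
  return '<' + node.name + attrs + '>' + innerHtml(node) + '</' + node.name + '>';
}

// Serialize only the children, omitting the node's own tag.
function innerHtml(node) {
  return toHtml(node.children || []);
}

// Hand-built equivalent of:
// <span>This example has <b>bold</b> and <i>italic</i> text.</span>
var dom = {
  type: 'tag', name: 'span', children: [
    { type: 'text', data: 'This example has ' },
    { type: 'tag', name: 'b', children: [{ type: 'text', data: 'bold' }] },
    { type: 'text', data: ' and ' },
    { type: 'tag', name: 'i', children: [{ type: 'text', data: 'italic' }] },
    { type: 'text', data: ' text.' }
  ]
};

console.log(toHtml(dom));
console.log(innerHtml(dom));
```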
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Although it probably wouldn't be difficult to make Stew work in a browser context, we haven't had any need for that, and so we haven't (yet) attempted to do it. Drop us a <a href="https://github.com/rodw/stew/issues">note</a> if this is something you'd like to see Stew support.<a href="#fnref1">↩</a></p></li>
</ol>
</div>
</body>
</html>