sqrap
sqrap is a configurable web scraper that maps information from a website into structured data using a JSON schema.
<html>
<head>
<meta property="og:image" content="https://cdn.example.com/someimage" />
<link rel="icon" href="https://cdn.example.com/favicon" class="favicon" />
</head>
<body>
<div class="main-container">
<h1>Title</h1>
<div class="content">
<p>This is a paragraph with a <a href="https://example.com/somelink">link</a>.</p>
</div>
</div>
<div class="extra">
<div class="main-container">
<div class="extra-content"><p>This is extra.</p></div>
<div class="author">
<span class="author-name"> <a href="https://example.com/author/john">John</a> </span>
</div>
</div>
</div>
<div class="wrapper">
<div class="group"><div class="member">One</div></div>
<div class="group"><div class="member">Two</div></div>
<div class="group"><div class="member">Three</div></div>
</div>
<script src="something.js"></script>
</body>
</html>
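
The fixture above is the kind of page a schema-driven scraper would be pointed at. As a rough illustration of the idea only, the sketch below maps a few fields from a trimmed copy of the fixture using Python's standard `html.parser`. The schema layout (`tag`, `where`, `read`) and the names `SCHEMA`, `SchemaExtractor` and `extract` are hypothetical and are not sqrap's actual schema format or API, neither of which is shown in this file.

```python
from html.parser import HTMLParser

# Hypothetical schema (NOT sqrap's real format): each field names a tag,
# the attribute values an element must carry, and the attribute to read.
SCHEMA = {
    "image": {"tag": "meta", "where": {"property": "og:image"}, "read": "content"},
    "favicon": {"tag": "link", "where": {"rel": "icon"}, "read": "href"},
    "script": {"tag": "script", "where": {}, "read": "src"},
}

# Trimmed copy of the fixture, kept inline so the sketch is self-contained.
FIXTURE = """<html>
<head>
<meta property="og:image" content="https://cdn.example.com/someimage" />
<link rel="icon" href="https://cdn.example.com/favicon" class="favicon" />
</head>
<body>
<script src="something.js"></script>
</body>
</html>"""


class SchemaExtractor(HTMLParser):
    """Collects the first element matching each schema rule."""

    def __init__(self, schema):
        super().__init__()
        self.schema = schema
        self.result = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for field, rule in self.schema.items():
            # Keep only the first match per field and skip other tags.
            if field in self.result or tag != rule["tag"]:
                continue
            if all(attrs.get(k) == v for k, v in rule["where"].items()):
                self.result[field] = attrs.get(rule["read"])


def extract(html, schema):
    """Parse `html` and return a dict of field name to extracted value."""
    parser = SchemaExtractor(schema)
    parser.feed(html)
    parser.close()
    return parser.result


result = extract(FIXTURE, SCHEMA)
print(result)
```

Matching on tag name and attribute values keeps the sketch short; a real scraper would more likely use CSS selectors so that nested targets such as `.author-name a` or the repeated `.group .member` elements can be addressed.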