crawling

Version:

A simple crawler made in JavaScript for Node.

77 lines (48 loc) • 1.68 kB

Markdown

# crawling A simple crawler made in JavaScript for Node. ## Installation `crawling` is both available on GitHub Packages and npm. ### How to install from GitHub Packages To install, you first have to follow [this guide](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-npm-registry#installing-a-package) on GitHub Docs. Then, you can run: ```bash $ npm install crawling ``` This should install the package in your project. ### How to install from npm You only need to run one command: ```bash $ npm install crawling ``` This should install the package in your project. ## Usage ### Creating an array with all of the links This example will create an array with all of the links gathered from the page. ```javascript import { crawlSite } from "crawling"; const links = []; for await (const url of crawlSite("https://github.com/", 500)) { links.push(url); } ``` ### Log each one of the links This example will log each one of the links received, without a delay like the previous example had. ```javascript import { crawlSite } from "crawling"; for await (const url of crawlSite("https://github.com/", 500)) { console.log(url); } ``` ### Documentation The function `crawlSite` takes two parameters: - `site`: Required. The site to crawl. - `timeout`: Optional. The timeout between each link in miliseconds, default is 500. There are examples of usage, [above](#usage) and below: ```javascript import { crawlSite } from "crawling"; // this should choose a random url const links = []; for (const url of await crawlSite("https://github.com/", 500)) { links.push(url); } console.log(shuffle(links)[0]); ```