e2pdf
Version:
A lightweight, highly efficient, and customizable Node.js library for crawling websites and converting pages into compact, AI-optimized PDFs. Ideal for data archiving, offline analysis, and feeding content to AI tools. Delivers fast performance and allows
130 lines (83 loc) • 4.56 kB
Markdown
# Export Website to PDF <img src="https://raw.githubusercontent.com/mayank1513/mayank1513/main/popper.png" style="height: 40px"/>
[](https://github.com/mayank1513/e2pdf/actions/workflows/test.yml) [](https://codeclimate.com/github/mayank1513/e2pdf/maintainability) [](https://codecov.io/gh/mayank1513/e2pdf) [](https://www.npmjs.com/package/e2pdf) [](https://www.npmjs.com/package/e2pdf)  [](https://gitpod.io/from-referrer/)
A tiny, fast, and customizable Node.js library to crawl websites and save all pages as compact, AI-ready PDFs. Use it from the command line or as a module in your Node.js scripts. Perfect for data archiving, offline analysis, and feeding content to AI tools.
## Features
- **Blazing Fast**: Optimized for speed and performance.
- **Lightweight**: Minimal resource usage for crawling and PDF generation.
- **Customizable**: Full control over PDF formatting and crawling behavior.
- **AI-Optimized PDFs**: Compact and structured for AI consumption.
- **Dual Usage**: Use via CLI or integrate into Node.js scripts.
> <img src="https://raw.githubusercontent.com/mayank1513/mayank1513/main/popper.png" style="height: 20px"/> Star [this repository](https://github.com/mayank1513/e2pdf) and share it with your friends.
## Installation
Install using pnpm, npm, or yarn
```bash
pnpm add e2pdf
```
**_or_**
```bash
npm install e2pdf
```
**_or_**
```bash
yarn add e2pdf
```
## Usage
### Command-Line Usage
To use e2pdf from the command line:
```bash
e2pdf <website-url>
```
For example:
```bash
e2pdf https://example.com
```
This will crawl the website and save all pages as PDFs in the current directory.
### Node.js Script Usage
Here’s an example of using e2pdf in a Node.js script:
```javascript
import e2pdf from "e2pdf";
(async () => {
await e2pdf("https://example.com", {
out: "./pdfs",
pdf: {
format: "A4",
printBackground: true,
margin: { top: "20px", bottom: "20px" },
},
crawlerOptions: { maxRequestsPerCrawl: 100 },
});
console.log("Crawling completed! PDFs saved to ./pdfs");
})();
```
## API
The `e2pdf` function accepts two arguments:
1. **startUrl** (string): The URL to start crawling from.
2. **options** (E2PdfOptions): Configuration object for crawling and PDF generation.
### E2PdfOptions
#### `out`
- **Type**: `string`
- **Default**: `process.cwd()`
- Directory to save the generated PDFs.
#### `pdf`
PDF generation options (compatible with [Playwright’s PDF options](https://playwright.dev/docs/api/class-page#page-pdf)):
- `displayHeaderFooter`: Display header and footer. Defaults to `false`.
- `footerTemplate`: HTML template for the footer.
- `format`: Paper format (e.g., `A4`, `Letter`). Defaults to `Letter`.
- `headerTemplate`: HTML template for the header.
- `landscape`: Paper orientation. Defaults to `false`.
- `margin`: Margins for the PDF (`top`, `right`, `bottom`, `left`).
- `printBackground`: Print background graphics. Defaults to `false`.
- ...and many more options for fine-tuning PDFs.
#### `crawlerOptions`
Options for the [Crawlee PlaywrightCrawler](https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler).
#### `crawlerConfig`
Configuration for Crawlee’s [Configuration](https://crawlee.dev/api/playwright-crawler/class/Configuration) object.
## Contributing
We welcome contributions! Please fork the repository and submit a pull request.
## License
This library is licensed under the MPL-2.0 open-source license.
## Feedback and Support
If you encounter any issues or have suggestions, please open an issue or contact us. We’d love to hear from you!
> <img src="https://raw.githubusercontent.com/mayank1513/mayank1513/main/popper.png" style="height: 20px"/> Please enroll in [our courses](https://mayank-chaudhari.vercel.app/courses) or [sponsor](https://github.com/sponsors/mayank1513) our work.
<hr />
<p align="center" style="text-align:center">with 💖 by <a href="https://mayank-chaudhari.vercel.app" target="_blank">Mayank Kumar Chaudhari</a></p>