afpp






Another f\*cking PDF parser. Because parsing PDFs in Node.js should be easy. Live long and parse PDFs.

There are plenty of PDF-related packages for Node.js. They work… until they don't.
Afpp was built to solve the headaches I ran into while trying to parse PDFs in Node.js:
- Do I need a package with 30+ MB just to read a PDF?
- Why is the event loop blocked?
- Is that a memory leak I smell?
- Should reading a PDF really be this performance-heavy?
- Why is everything so buggy?
- Why does it complain about the lack of a canvas in Node.js?
- Why does canvas require native C++/Python dependencies to build?
- Why does it complain about the missing window object?
- Why do I need ImageMagick for this?!
- What the hell is Ghostscript, and why does it keep failing?
- Where's the TypeScript support?
- Why are the dependencies older than my dev career?
- Why does everything work… until I try an encrypted PDF?
- Why does every OS need its own special setup ritual?

Requirements:

- Node.js >= v22.14.0

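
If you want a script to warn early when it runs on an older runtime, a small startup check is one option. The sketch below is only an illustration: `MINIMUM_NODE`, `parseVersion`, and `satisfiesMinimum` are hypothetical names, not part of `afpp`, and the comparison only looks at the numeric major, minor, and patch parts.

```ts
// Hypothetical startup guard; not part of afpp.
const MINIMUM_NODE = '22.14.0'; // taken from the requirement listed above

// Turn "v22.14.0" or "22.14.0" into [22, 14, 0].
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, '')
    .split('.')
    .map((part) => Number.parseInt(part, 10));
}

// Compare major, minor, and patch from left to right.
function satisfiesMinimum(current: string, minimum: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

if (!satisfiesMinimum(process.versions.node, MINIMUM_NODE)) {
  console.warn(
    `Node.js >= v${MINIMUM_NODE} is required, found v${process.versions.node}`,
  );
}
```
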
You can install `afpp` via npm, Yarn, or pnpm.
```bash
npm install afpp
```
```bash
yarn add afpp
```
```bash
pnpm add afpp
```
The `afpp` library makes it simple to extract text or images from PDF files in Node.js. Whether your PDF is stored locally, hosted online, or encrypted, `afpp` provides an easy-to-use API to handle it all. All functions share common parameters and accept a string path, a buffer, or a URL object.
```ts
import { readFile } from 'fs/promises';
import path from 'path';
import { pdf2string } from 'afpp';

(async function main() {
  // Read the PDF into a buffer, then extract its text
  const pathToFile = path.join('..', 'test', 'example.pdf');
  const input = await readFile(pathToFile);
  const data = await pdf2string(input);
  console.log('Extracted text:', data); // ['page 1 content', 'page 2 content', ...]
})();
```
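
Since the example above yields one string per page, a little post-processing is often all that is needed afterwards. The helpers below are a minimal sketch and not part of `afpp`: `joinPages` and `findPages` are hypothetical names, and the sample `pages` array stands in for the value returned by `pdf2string`.

```ts
// Hypothetical helpers for an array of per-page strings; not part of afpp.

// Merge all pages into one string, trimming stray whitespace on each page.
function joinPages(pages: string[], separator = '\n\n'): string {
  return pages.map((page) => page.trim()).join(separator);
}

// Return the 1-based page numbers whose text contains the term (case-insensitive).
function findPages(pages: string[], term: string): number[] {
  const needle = term.toLowerCase();
  const hits: number[] = [];
  pages.forEach((text, index) => {
    if (text.toLowerCase().includes(needle)) hits.push(index + 1);
  });
  return hits;
}

// Stand-in data shaped like the documented output of pdf2string.
const pages = ['Invoice 42\nTotal: 10 EUR', 'Terms and conditions'];
console.log(joinPages(pages));
console.log(findPages(pages, 'terms')); // page numbers that mention "terms"
```
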
```ts
import { pdf2image } from 'afpp';

(async function main() {
  // Pass a URL object; the result is an array of image buffers
  const url = new URL('https://pdfobject.com/pdf/sample.pdf');
  const arrayOfImages = await pdf2image(url);
  console.log(arrayOfImages); // [imageBuffer, imageBuffer, ...]
})();
```
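
To keep the images from the example above, you will probably want to write each buffer to disk. The following is a rough sketch under assumptions: `saveImages` is a hypothetical helper, not part of `afpp`, and the `png` extension is a guess that you should replace with whatever format your buffers actually hold.

```ts
import { mkdir, writeFile } from 'node:fs/promises';
import { join } from 'node:path';

// Hypothetical helper; not part of afpp.
// Writes one file per image and resolves with the paths that were written.
async function saveImages(
  images: Uint8Array[],
  outDir: string,
  extension = 'png', // assumption: adjust to the real image format
): Promise<string[]> {
  await mkdir(outDir, { recursive: true });
  // Zero-pad the page number so files sort in page order.
  const width = String(images.length).length;
  return Promise.all(
    images.map(async (image, index) => {
      const pageNumber = String(index + 1).padStart(width, '0');
      const filePath = join(outDir, `page-${pageNumber}.${extension}`);
      await writeFile(filePath, image);
      return filePath;
    }),
  );
}
```

With the `arrayOfImages` value from the example above, `await saveImages(arrayOfImages, 'out')` would create `out/page-1.png`, `out/page-2.png`, and so on.
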
```ts
import { parsePdf } from 'afpp';

(async function main() {
  // Download PDF from URL
  const response = await fetch('https://pdfobject.com/pdf/sample.pdf');
  const buffer = Buffer.from(await response.arrayBuffer());

  // Parse the PDF buffer
  const result = await parsePdf(buffer, {}, (content) => content);
  console.log('Parsed PDF:', result);
})();
```
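
The download step in the example above assumes the request succeeds and really returns a PDF. A defensive variant might check the HTTP status and the `%PDF-` header that PDF files start with before parsing. This is a sketch only: `looksLikePdf` and `downloadPdf` are hypothetical helpers, not part of `afpp`, and the header check is stricter than what some PDF readers tolerate.

```ts
// Hypothetical helpers; not part of afpp.

// True when the data begins with the "%PDF-" file header.
function looksLikePdf(data: Uint8Array): boolean {
  const magic = '%PDF-';
  if (data.length < magic.length) return false;
  for (let i = 0; i < magic.length; i++) {
    if (data[i] !== magic.charCodeAt(i)) return false;
  }
  return true;
}

// Fetch a URL and fail early on HTTP errors or non-PDF responses.
async function downloadPdf(url: string | URL): Promise<Buffer> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Download failed: ${response.status} ${response.statusText}`);
  }
  const buffer = Buffer.from(await response.arrayBuffer());
  if (!looksLikePdf(buffer)) {
    throw new Error('Response does not start with the %PDF- header');
  }
  return buffer;
}
```

The buffer returned by `downloadPdf` can then be handed to `parsePdf` exactly as in the example above.
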