UNPKG

chromiumly

Version:

A lightweight Typescript library that interacts with Gotenberg's different modules to convert a variety of document formats to PDF files.

521 lines (403 loc) 18.7 kB
<p align="center"> <img src="assets/logo.png" alt="Chromiumly Logo" width="128" height="auto"> </p> ![build](https://github.com/cherfia/chromiumly/actions/workflows/build.yml/badge.svg) [![coverage](https://img.shields.io/codecov/c/gh/cherfia/chromiumly?style=flat-square)](https://codecov.io/gh/cherfia/chromiumly) [![vulnerabilities](https://snyk.io/test/github/cherfia/chromiumly/badge.svg?targetFile=package.json&color=brightgreen&style=flat-square)](https://snyk.io/test/github/cherfia/chromiumly?targetFile=package.json) [![maintainability](https://img.shields.io/codeclimate/maintainability/cherfia/chromiumly?color=yellow&style=flat-square)](https://codeclimate.com/github/cherfia/chromiumly/maintainability) [![npm](https://img.shields.io/npm/v/chromiumly?color=brightgreen&style=flat-square)](https://npmjs.org/package/chromiumly) [![downloads](https://img.shields.io/npm/dt/chromiumly.svg?color=brightgreen&style=flat-square)](https://npm-stat.com/charts.html?package=chromiumly) ![licence](https://img.shields.io/github/license/cherfia/chromiumly?style=flat-square) A lightweight Typescript library that interacts with [Gotenberg](https://gotenberg.dev/)'s different routes to convert a variety of document formats to PDF files. # Table of Contents 1. [Getting Started](#getting-started) - [Installation](#installation) - [Prerequisites](#prerequisites) - [Configuration](#configuration) 2. [Authentication](#authentication) - [Basic Authentication](#basic-authentication) - [Advanced Authentication](#advanced-authentication) 3. [Core Features](#core-features) - [Chromium](#chromium) - [URL](#url) - [HTML](#html) - [Markdown](#markdown) - [Screenshot](#screenshot) - [LibreOffice](#libreoffice) - [PDF Engines](#pdf-engines) - [Format Conversion](#format-conversion) - [Merging](#merging) - [Metadata Management](#metadata-management) - [File Generation](#file-generation) - [PDF Splitting](#pdf-splitting) 4. [Usage Example](#snippet) ## Getting Started ### Installation Using npm: ```bash npm install chromiumly ``` Using yarn: ```bash yarn add chromiumly ``` ### Prerequisites Before attempting to use Chromiumly, be sure you install [Docker](https://www.docker.com/) if you have not already done so. After that, you can start a default Docker container of [Gotenberg](https://gotenberg.dev/) as follows: ```bash docker run --rm -p 3000:3000 gotenberg/gotenberg:8 ``` ### Configuration Chromiumly supports configurations via both [dotenv](https://www.npmjs.com/package/dotenv) and [config](https://www.npmjs.com/package/config) configuration libraries or directly via code to add Gotenberg endpoint to your project. #### dotenv ```bash GOTENBERG_ENDPOINT=http://localhost:3000 ``` #### config ```json { "gotenberg": { "endpoint": "http://localhost:3000" } } ``` #### code ```typescript import { Chromiumly } from "chromiumly"; Chromiumly.configure({ endpoint: "http://localhost:3000" }); ``` ## Authentication ### Basic Authentication Gotenberg introduces basic authentication support starting from version [8.4.0](https://github.com/gotenberg/gotenberg/releases/tag/v8.4.0). Suppose you are running a Docker container using the command below: ```bash docker run --rm -p 3000:3000 \ -e GOTENBERG_API_BASIC_AUTH_USERNAME=user \ -e GOTENBERG_API_BASIC_AUTH_PASSWORD=pass \ gotenberg/gotenberg:8.4.0 gotenberg --api-enable-basic-auth ``` To integrate this setup with Chromiumly, you need to update your configuration as outlined below: ```bash GOTENBERG_ENDPOINT=http://localhost:3000 GOTENBERG_API_BASIC_AUTH_USERNAME=user GOTENBERG_API_BASIC_AUTH_PASSWORD=pass ``` Or ```json { "gotenberg": { "endpoint": "http://localhost:3000", "api": { "basicAuth": { "username": "user", "password": "pass" } } } } ``` Or ```typescript Chromiumly.configure({ endpoint: "http://localhost:3000", username: "user", password: "pass", }); ``` ### Advanced Authentication To implement advanced authentication or add custom HTTP headers to your requests, you can use the `customHttpHeaders` option within the `configure` method. This allows you to pass additional headers, such as authentication tokens or custom metadata, with each API call. For example, you can include a Bearer token for authentication along with a custom header as follows: ```typescript const token = await generateToken(); Chromiumly.configure({ endpoint: "http://localhost:3000", customHttpHeaders: { Authorization: `Bearer ${token}`, "X-Custom-Header": "value", }, }); ``` ## Core Features Chromiumly introduces different classes that serve as wrappers to Gotenberg's [routes](https://gotenberg.dev/docs/routes). These classes encompass methods featuring an input file parameter, such as `html`, `header`, `footer`, and `markdown`, capable of accepting inputs in the form of a `string` (i.e. file path), `Buffer`, or `ReadStream`. ### Chromium There are three different classes that come with a single method (i.e.`convert`) which calls one of Chromium's [Conversion routes](https://gotenberg.dev/docs/routes#convert-with-chromium) to convert `html` and `markdown` files, or a `url` to a `buffer` which contains the converted PDF file content. Similarly, a new set of classes have been added to harness the recently introduced Gotenberg [Screenshot routes](https://gotenberg.dev/docs/routes#screenshots-route). These classes include a single method called `capture`, which allows capturing full-page screenshots of `html`, `markdown`, and `url`. #### URL ```typescript import { UrlConverter } from "chromiumly"; const urlConverter = new UrlConverter(); const buffer = await urlConverter.convert({ url: "https://www.example.com/", }); ``` ```typescript import { UrlScreenshot } from "chromiumly"; const screenshot = new UrlScreenshot(); const buffer = await screenshot.capture({ url: "https://www.example.com/", }); ``` #### HTML The only requirement is that the file name should be `index.html`. ```typescript import { HtmlConverter } from "chromiumly"; const htmlConverter = new HtmlConverter(); const buffer = await htmlConverter.convert({ html: "path/to/index.html", }); ``` ```typescript import { HtmlScreenshot } from "chromiumly"; const screenshot = new HtmlScreenshot(); const buffer = await screenshot.capture({ html: "path/to/index.html", }); ``` #### Markdown This route accepts an `index.html` file plus a markdown file. ```typescript import { MarkdownConverter } from "chromiumly"; const markdownConverter = new MarkdownConverter(); const buffer = await markdownConverter.convert({ html: "path/to/index.html", markdown: "path/to/file.md", }); ``` ```typescript import { MarkdownScreenshot } from "chromiumly"; const screenshot = new MarkdownScreenshot(); const buffer = await screenshot.capture({ html: "path/to/index.html", markdown: "path/to/file.md", }); ``` Each `convert()` method takes an optional `properties` parameter of the following type which dictates how the PDF generated file will look like. ```typescript type PageProperties = { singlePage?: boolean; // Print the entire content in one single page (default false) size?: { width: number; // Paper width, in inches (default 8.5) height: number; //Paper height, in inches (default 11) }; margins?: { top: number; // Top margin, in inches (default 0.39) bottom: number; // Bottom margin, in inches (default 0.39) left: number; // Left margin, in inches (default 0.39) right: number; // Right margin, in inches (default 0.39) }; preferCssPageSize?: boolean; // Define whether to prefer page size as defined by CSS (default false) printBackground?: boolean; // Print the background graphics (default false) omitBackground?: boolean; // Hide the default white background and allow generating PDFs with transparency (default false) landscape?: boolean; // Set the paper orientation to landscape (default false) scale?: number; // The scale of the page rendering (default 1.0) nativePageRanges?: { from: number; to: number }; // Page ranges to print }; ``` In addition to the `PageProperties` customization options, the `convert()` method also accepts a set of parameters to further enhance the versatility of the conversion process. Here's an overview of the full list of parameters: ```typescript type ConversionOptions = { properties?: PageProperties; // Customize the appearance of the generated PDF pdfFormat?: PdfFormat; // Define the PDF format for the conversion pdfUA?: boolean; // Enable PDF for Universal Access for optimal accessibility (default false) userAgent?: string; // Customize the user agent string sent during conversion header?: PathLikeOrReadStream; // Specify a custom header for the PDF footer?: PathLikeOrReadStream; // Specify a custom footer for the PDF emulatedMediaType?: EmulatedMediaType; // Specify the emulated media type for conversion waitDelay?: string; // Duration (e.g., '5s') to wait when loading an HTML document before conversion waitForExpression?: string; // JavaScript expression to wait before converting an HTML document into PDF extraHttpHeaders?: Record<string, string>; // Include additional HTTP headers in the request failOnHttpStatusCodes?: number[]; // List of HTTP status codes triggering a 409 Conflict response (default [499, 599]) failOnConsoleExceptions?: boolean; // Return a 409 Conflict response if there are exceptions in the Chromium console (default false) skipNetworkIdleEvent?: boolean; // Do not wait for Chromium network to be idle (default true) metadata?: Metadata; // Metadata to be written. cookies?: Cookie[]; // Cookies to be written. downloadFrom?: DownloadFrom; //Download a file from a URL. It must return a Content-Disposition header with a filename parameter. split?: SplitOptions; // Split the PDF file into multiple files. }; ``` #### Screenshot Similarly, the `capture()` method takes an optional `properties` parameter of the specified type, influencing the appearance of the captured screenshot file. ```typescript type ImageProperties = { format: "png" | "jpeg" | "webp"; //The image compression format, either "png", "jpeg" or "webp". quality?: number; // The compression quality from range 0 to 100 (jpeg only). omitBackground?: boolean; // Hide the default white background and allow generating screenshots with transparency. width?: number; // The device screen width in pixels (default 800). height?: number; // The device screen height in pixels (default 600). clip?: boolean; // Define whether to clip the screenshot according to the device dimensions (default false). }; ``` Furthermore, alongside the customization options offered by `ImageProperties`, the `capture()` method accommodates a variety of parameters to expand the versatility of the screenshot process. Below is a comprehensive overview of all parameters available: ```typescript type ScreenshotOptions = { properties?: ImageProperties; header?: PathLikeOrReadStream; footer?: PathLikeOrReadStream; emulatedMediaType?: EmulatedMediaType; waitDelay?: string; // Duration (e.g, '5s') to wait when loading an HTML document before convertion. waitForExpression?: string; // JavaScript's expression to wait before converting an HTML document into PDF until it returns true. extraHttpHeaders?: Record<string, string>; failOnHttpStatusCodes?: number[]; // Return a 409 Conflict response if the HTTP status code is in the list (default [499,599]) failOnConsoleExceptions?: boolean; // Return a 409 Conflict response if there are exceptions in the Chromium console (default false) skipNetworkIdleEvent?: boolean; // Do not wait for Chromium network to be idle (default true) optimizeForSpeed?: boolean; // Define whether to optimize image encoding for speed, not for resulting size. cookies?: Cookie[]; // Cookies to be written. downloadFrom?: DownloadFrom; // Download the file from a specific URL. It must return a Content-Disposition header with a filename parameter. }; ``` ### LibreOffice The `LibreOffice` class comes with a single method `convert`. This method interacts with [LibreOffice](https://gotenberg.dev/docs/routes#convert-with-libreoffice) route to convert different documents to PDF files. You can find the file extensions accepted [here](https://gotenberg.dev/docs/routes#convert-with-libreoffice). ```typescript import { LibreOffice } from "chromiumly"; const buffer = await LibreOffice.convert({ files: [ "path/to/file.docx", "path/to/file.png", { data: xlsxFileBuffer, ext: "xlsx" }, ], }); ``` Similarly to Chromium's route `convert` method, this method takes the following optional parameters : - `properties`: changes how the PDF generated file will look like. It also includes a `password` parameter to open the source file. - `pdfa`: PDF format of the conversion resulting file (i.e. `PDF/A-1a`, `PDF/A-2b`, `PDF/A-3b`). - `pdfUA`: enables PDF for Universal Access for optimal accessibility. - `merge`: merges all the resulting files from the conversion into an individual PDF file. - `metadata`: writes metadata to the generated PDF file. - `lolosslessImageCompression`: allows turning lossless compression on or off to tweak image conversion performance. - `reduceImageResolution`: allows turning on or off image resolution reduction to tweak image conversion performance. - `quality`: specifies the quality of the JPG export. The value ranges from 1 to 100, with higher values producing higher-quality images and larger file sizes. - `maxImageResolution`: specifies if all images will be reduced to the specified DPI value. Possible values are: `75`, `150`, `300`, `600`, and `1200`. - `flatten`: a boolean that, when set to true, flattens the split PDF files, making form fields and annotations uneditable. ### PDF Engines The `PDFEngines` class interacts with Gotenberg's [PDF Engines](https://gotenberg.dev/docs/routes#convert-into-pdfa--pdfua-route) routes to manipulate PDF files. #### Format Conversion This method interacts with [PDF Engines](https://gotenberg.dev/docs/routes#convert-into-pdfa--pdfua-route) convertion route to transform PDF files into the requested PDF/A format and/or PDF/UA. ```typescript import { PDFEngines } from "chromiumly"; const buffer = await PDFEngines.convert({ files: ["path/to/file_1.pdf", "path/to/file_2.pdf"], pdfa: PdfFormat.A_2b, pdfUA: true, }); ``` #### Merging This method interacts with [PDF Engines](https://gotenberg.dev/docs/routes#merge-pdfs-route) merge route which gathers different engines that can manipulate and merge PDF files such as: [PDFtk](https://gitlab.com/pdftk-java/pdftk), [PDFcpu](https://github.com/pdfcpu/pdfcpu), [QPDF](https://github.com/qpdf/qpdf), and [UNO](https://github.com/unoconv/unoconv). ```typescript import { PDFEngines } from "chromiumly"; const buffer = await PDFEngines.merge({ files: ["path/to/file_1.pdf", "path/to/file_2.pdf"], pdfa: PdfFormat.A_2b, pdfUA: true, }); ``` #### Metadata Management ##### readMetadata This method reads metadata from the provided PDF files. ```typescript import { PDFEngines } from "chromiumly"; const metadataBuffer = await PDFEngines.readMetadata([ "path/to/file_1.pdf", "path/to/file_2.pdf", ]); ``` ##### writeMetadata This method writes metadata to the provided PDF files. ```typescript import { PDFEngines } from "chromiumly"; const buffer = await PDFEngines.writeMetadata({ files: [ "path/to/file_1.pdf", "path/to/file_2.pdf", ], metadata: { Author: 'Taha Cherfia', Tite: 'Chromiumly' Keywords: ['pdf', 'html', 'gotenberg'], } }); ``` Please consider referring to [ExifTool](https://exiftool.org/TagNames/XMP.html#pdf) for a comprehensive list of accessible metadata options. #### File Generation It is just a generic complementary method that takes the `buffer` returned by the `convert` method, and a chosen `filename` to generate the PDF file. Please note that all the PDF files can be found `__generated__` folder in the root folder of your project. ### PDF Splitting Each [Chromium](#chromium) and [LibreOffice](#libreoffice) route has a `split` parameter that allows splitting the PDF file into multiple files. The `split` parameter is an object with the following properties: - `mode`: the mode of the split. It can be `pages` or `intervals`. - `span`: the span of the split. It is a string that represents the range of pages to split. - `unify`: a boolean that allows unifying the split files. Only works when `mode` is `pages`. - `flatten`: a boolean that, when set to true, flattens the split PDF files, making form fields and annotations uneditable. ```typescript import { UrlConverter } from "chromiumly"; const buffer = await UrlConverter.convert({ url: "https://www.example.com/", split: { mode: "pages", span: "1-2", unify: true, }, }); ``` On the other hand, PDFEngines' has a `split` method that interacts with [PDF Engines](https://gotenberg.dev/docs/routes#split-pdfs-route) split route which splits PDF files into multiple files. ```typescript import { PDFEngines } from "chromiumly"; const buffer = await PDFEngines.split({ files: ["path/to/file_1.pdf", "path/to/file_2.pdf"], options: { mode: "pages", span: "1-2", unify: true, }, }); ``` > ⚠️ **Note**: Gotenberg does not currently validate the `span` value when `mode` is set to `pages`, as the validation depends on the chosen engine for the split feature. See [PDF Engines module configuration](https://gotenberg.dev/docs/configuration#pdf-engines) for more details. ### PDF Flattening PDF flattening converts interactive elements like forms and annotations into a static PDF. This ensures the document looks the same everywhere and prevents further edits. ```typescript import { PDFEngines } from "chromiumly"; const buffer = await PDFEngines.flatten({ files: ["path/to/file_1.pdf", "path/to/file_2.pdf"], }); ``` ## Snippet The following is a short snippet of how to use the library. ```typescript import { PDFEngines, UrlConverter } from "chromiumly"; async function run() { const urlConverter = new UrlConverter(); const buffer = await urlConverter.convert({ url: "https://gotenberg.dev/", properties: { singlePage: true, size: { width: 8.5, height: 11, }, }, emulatedMediaType: "screen", failOnHttpStatusCodes: [404], failOnConsoleExceptions: true, skipNetworkIdleEvent: false, optimizeForSpeed: true, split: { mode: "pages", span: "1-2", unify: true, }, }); await PDFEngines.generate("gotenberg.pdf", buffer); } run(); ```