html-pdf vs pdfkit vs puppeteer
PDF Generation Libraries

PDF generation libraries are essential tools in web development that allow developers to create PDF documents programmatically. These libraries provide various functionalities, such as converting HTML to PDF, creating PDFs from scratch, and rendering web pages as PDFs. They cater to different use cases, from generating invoices and reports to creating complex documents with custom layouts and styles. Choosing the right library depends on the specific requirements of the project, including the complexity of the PDF content, the need for rendering capabilities, and performance considerations.

Stat Detail

Package    Downloads  Stars   Size     Issues  Publish       License
html-pdf   0          3,646   -        468     5 years ago   MIT
pdfkit     0          10,596  6.09 MB  402     6 months ago  MIT
puppeteer  0          93,695  63 kB    285     12 days ago   Apache-2.0

Feature Comparison: html-pdf vs pdfkit vs puppeteer

PDF Creation Method

  • html-pdf:

    html-pdf focuses on converting existing HTML content into PDF format. It takes an HTML string or file and generates a PDF, making it easy to create documents from web pages or templates without much coding effort.

  • pdfkit:

    pdfkit allows for programmatic PDF creation from scratch. You can define the layout, styles, and content dynamically, making it highly flexible for generating complex documents with various elements like text, images, and shapes.

  • puppeteer:

    puppeteer leverages a headless browser to render web pages and capture them as PDFs. This means it can handle dynamic content, JavaScript, and CSS, ensuring that the generated PDF closely resembles the web page.
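
To make the "from scratch" approach concrete, here is a minimal sketch using pdfkit. The `invoiceLines` and `writeInvoice` helpers and the item shape (`name`, `price`) are hypothetical names chosen for illustration; only `PDFDocument`, `pipe`, `fontSize`, `text`, `moveDown`, and `end` come from pdfkit itself.

```javascript
const fs = require('fs');

// Hypothetical helper: turns invoice items into the text lines to draw.
function invoiceLines(items) {
  const lines = items.map((it) => `${it.name}: ${it.price.toFixed(2)}`);
  const total = items.reduce((sum, it) => sum + it.price, 0);
  lines.push(`Total: ${total.toFixed(2)}`);
  return lines;
}

// Draws the lines with pdfkit. The library is required lazily so that the
// helper above can be used without pdfkit installed.
function writeInvoice(items, outPath) {
  const PDFDocument = require('pdfkit');
  const doc = new PDFDocument({ size: 'A4', margin: 50 });
  doc.pipe(fs.createWriteStream(outPath));
  doc.fontSize(20).text('Invoice', { align: 'center' });
  doc.moveDown();
  doc.fontSize(12);
  for (const line of invoiceLines(items)) {
    doc.text(line); // each call places the text below the previous line
  }
  doc.end(); // finalizes the document and ends the stream
}

// Usage sketch (requires `npm install pdfkit`):
// writeInvoice([{ name: 'Widget', price: 9.5 }], './invoice.pdf');
```

With html-pdf or puppeteer the same invoice would instead be an HTML template, and layout would be left to the rendering engine rather than to explicit drawing calls.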

Complexity and Customization

  • html-pdf:

    html-pdf is relatively simple to use, making it suitable for quick conversions without extensive customization. However, because it renders through PhantomJS, an older headless WebKit browser, newer CSS layout features and web fonts may not render as they would in a current browser.

  • pdfkit:

    pdfkit offers a high degree of customization, allowing developers to create intricate PDF documents with precise control over every aspect, including text positioning, styling, and graphics. This makes it ideal for applications requiring tailored PDF outputs.

  • puppeteer:

    puppeteer provides advanced capabilities for rendering complex web pages, including support for modern web technologies. It can handle dynamic content and layouts, making it a powerful choice for generating PDFs that reflect the latest web standards.

Performance

  • html-pdf:

    html-pdf is generally fast for simple HTML to PDF conversions. However, performance may vary depending on the complexity of the HTML content and the resources available on the server.

  • pdfkit:

    pdfkit performs well for generating PDFs from scratch, but the performance can be impacted by the complexity of the document being created, especially with large files or numerous graphical elements.

  • puppeteer:

    puppeteer can be resource-intensive because it runs a headless browser. It excels at rendering complex pages accurately, but each browser instance needs considerably more memory and CPU than a plain library call, so high-volume PDF generation usually calls for reusing browser instances and limiting concurrency.
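
One common way to soften puppeteer's startup cost is to launch the browser once and open a new page per document. The sketch below assumes that approach; `renderAll` is a hypothetical helper, and `browser` is expected to behave like the object returned by `puppeteer.launch()`.

```javascript
// Renders several HTML strings to PDFs using ONE browser instance.
// `browser` is passed in, so the caller decides when to launch and close it.
async function renderAll(browser, htmlDocs) {
  const results = [];
  for (const html of htmlDocs) {
    const page = await browser.newPage();
    try {
      // Wait until network activity settles so images and fonts are loaded.
      await page.setContent(html, { waitUntil: 'networkidle0' });
      results.push(await page.pdf({ format: 'A4', printBackground: true }));
    } finally {
      await page.close(); // free the tab even if rendering fails
    }
  }
  return results;
}

// Usage sketch (requires `npm install puppeteer`, inside an async function):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch();
// try {
//   const pdfs = await renderAll(browser, ['<h1>One</h1>', '<h1>Two</h1>']);
// } finally {
//   await browser.close();
// }
```

The loop renders documents one at a time on purpose; rendering in parallel is possible, but memory use then grows with the number of open pages.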

Use Cases

  • html-pdf:

    html-pdf is best suited for applications that need to generate simple PDFs from existing HTML templates, such as invoices, reports, or basic documents where layout complexity is minimal.

  • pdfkit:

    pdfkit is ideal for generating custom PDFs where the content is created programmatically, such as reports, brochures, or any documents requiring specific layouts and designs.

  • puppeteer:

    puppeteer is perfect for scenarios where you need to capture the exact visual representation of a web page, such as generating PDFs from web applications, capturing screenshots, or creating print-ready documents.

Learning Curve

  • html-pdf:

    html-pdf has a low learning curve, making it easy for developers to get started with PDF generation without needing extensive knowledge of PDF specifications.

  • pdfkit:

    pdfkit has a moderate learning curve due to its flexibility and the need to understand the API for creating complex documents. It may require more time to master compared to simpler libraries.

  • puppeteer:

    puppeteer has a steeper learning curve, especially for developers unfamiliar with browser automation. Understanding how to manipulate the browser context and handle asynchronous operations is essential.

How to Choose: html-pdf vs pdfkit vs puppeteer

  • html-pdf:

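    Choose html-pdf if you need a straightforward solution for converting HTML content into PDF documents. It is ideal for simple use cases where you want to generate PDFs from existing HTML templates without extensive customization. Keep in mind that it depends on PhantomJS and, per the stats above, was last published about 5 years ago, so weigh its maintenance status before adopting it in a new project.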

  • pdfkit:

    Select pdfkit if you require a library that allows for detailed control over the PDF creation process. It is suitable for generating PDFs from scratch, enabling complex layouts, and adding graphics, text, and images programmatically. This is a great choice for applications needing custom PDF generation features.

  • puppeteer:

    Opt for puppeteer if you need a powerful tool that can render web pages as PDFs. Puppeteer is a headless browser automation library that provides extensive capabilities for web scraping, testing, and generating PDFs from dynamic web content. It is perfect for scenarios where you need to capture the exact appearance of a web page in PDF format.

README for html-pdf

node-html-pdf

HTML to PDF converter that uses phantomjs

Example Business Card (and its source file)

Example Receipt

Changelog

Have a look at the releases page: https://github.com/marcbachmann/node-html-pdf/releases

Installation

Install the html-pdf utility via npm:

$ npm install -g html-pdf

Command-line example

$ html-pdf test/businesscard.html businesscard.pdf

Code example

var fs = require('fs');
var pdf = require('html-pdf');
var html = fs.readFileSync('./test/businesscard.html', 'utf8');
var options = { format: 'Letter' };

pdf.create(html, options).toFile('./businesscard.pdf', function(err, res) {
  if (err) return console.log(err);
  console.log(res); // { filename: '/app/businesscard.pdf' }
});

API

var pdf = require('html-pdf');
pdf.create(html).toFile([filepath, ]function(err, res){
  console.log(res.filename);
});

pdf.create(html).toStream(function(err, stream){
  stream.pipe(fs.createWriteStream('./foo.pdf'));
});

pdf.create(html).toBuffer(function(err, buffer){
  console.log('This is a buffer:', Buffer.isBuffer(buffer));
});


// for backwards compatibility
// alias to pdf.create(html[, options]).toBuffer(callback)
pdf.create(html [, options], function(err, buffer){});
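
The API above is callback-based. If the rest of your code uses async/await, a small wrapper can help. This is a sketch, not part of html-pdf: `createPdfBuffer` is a hypothetical helper, and `pdfLib` is assumed to be the module returned by `require('html-pdf')`.

```javascript
// Wraps the callback-style toBuffer call shown above in a Promise.
// `pdfLib` is passed in so the wrapper does not require the package itself.
function createPdfBuffer(pdfLib, html, options) {
  return new Promise((resolve, reject) => {
    pdfLib.create(html, options).toBuffer((err, buffer) => {
      if (err) reject(err);
      else resolve(buffer);
    });
  });
}

// Usage sketch (inside an async function):
// const pdf = require('html-pdf');
// const buffer = await createPdfBuffer(pdf, '<h1>Hello</h1>', { format: 'Letter' });
```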

Footers and Headers

html-pdf can read the header or footer either from the header and footer config objects or from the HTML source. You can set a default header and footer, or override it for a specific page by appending a page number (1-based index), first, or last to the id="pageHeader" or id="pageFooter" attribute of an HTML tag.

You can use any combination of those tags. The library looks for any element whose id starts with the pageHeader or pageFooter prefix.

<div id="pageHeader">Default header</div>
<div id="pageHeader-first">Header on first page</div>
<div id="pageHeader-2">Header on second page</div>
<div id="pageHeader-3">Header on third page</div>
<div id="pageHeader-last">Header on last page</div>
...
<div id="pageFooter">Default footer</div>
<div id="pageFooter-first">Footer on first page</div>
<div id="pageFooter-2">Footer on second page</div>
<div id="pageFooter-last">Footer on last page</div>
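
To reason about which element applies to a given page, the following sketch may help. It is not the library's code; it models an ASSUMED precedence (explicit page number, then first or last, then the default), and `pickSectionId` is a hypothetical name.

```javascript
// Picks the header/footer element id for a page, under an ASSUMED precedence:
// explicit page number, then "first"/"last", then the default element.
function pickSectionId(prefix, pageNum, numPages, availableIds) {
  const candidates = [`${prefix}-${pageNum}`];
  if (pageNum === 1) candidates.push(`${prefix}-first`);
  if (pageNum === numPages) candidates.push(`${prefix}-last`);
  candidates.push(prefix); // the default, e.g. id="pageHeader"
  return candidates.find((id) => availableIds.includes(id)) || null;
}

// The header ids from the example above, for a 5-page document:
const headerIds = [
  'pageHeader', 'pageHeader-first', 'pageHeader-2',
  'pageHeader-3', 'pageHeader-last'
];
const perPage = [1, 2, 3, 4, 5].map((n) =>
  pickSectionId('pageHeader', n, 5, headerIds)
);
```

Under that assumption, page 4 falls back to the default header because no `pageHeader-4` element exists.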

Options

config = {

  // Export options
  "directory": "/tmp",       // The directory the file gets written into if not using .toFile(filename, callback). default: '/tmp'

  // Papersize Options: http://phantomjs.org/api/webpage/property/paper-size.html
  "height": "10.5in",        // allowed units: mm, cm, in, px
  "width": "8in",            // allowed units: mm, cm, in, px
  // - or -
  "format": "Letter",        // allowed formats: A3, A4, A5, Legal, Letter, Tabloid
  "orientation": "portrait", // portrait or landscape

  // Page options
  "border": "0",             // default is 0, units: mm, cm, in, px
  // - or -
  "border": {
    "top": "2in",            // default is 0, units: mm, cm, in, px
    "right": "1in",
    "bottom": "2in",
    "left": "1.5in"
  },

  paginationOffset: 1,       // Override the initial pagination number
  "header": {
    "height": "45mm",
    "contents": '<div style="text-align: center;">Author: Marc Bachmann</div>'
  },
  "footer": {
    "height": "28mm",
    "contents": {
      first: 'Cover page',
      2: 'Second page', // Any page number is working. 1-based index
      default: '<span style="color: #444;">{{page}}</span>/<span>{{pages}}</span>', // fallback value
      last: 'Last Page'
    }
  },


  // Rendering options
  "base": "file:///home/www/your-asset-path/", // Base path that's used to load files (images, css, js) when they aren't referenced using a host

  // Zooming option, can be used to scale images if `options.type` is not pdf
  "zoomFactor": "1", // default is 1

  // File options
  "type": "pdf",           // allowed file types: png, jpeg, pdf
  "quality": "75",         // only used for types png & jpeg

  // Script options
  "phantomPath": "./node_modules/phantomjs/bin/phantomjs", // PhantomJS binary which should get downloaded automatically
  "phantomArgs": [], // array of strings used as phantomjs args e.g. ["--ignore-ssl-errors=yes"]
  "localUrlAccess": false, // Prevent local file:// access by passing '--local-url-access=false' to phantomjs
                           // For security reasons you should keep the default value if you render arbitrary html/js.
  "script": '/url',        // Absolute path to a custom phantomjs script, use the file in lib/scripts as example
  "timeout": 30000,        // Timeout that will cancel phantomjs, in milliseconds

  // Time we should wait after window load
  // accepted values are 'manual', some delay in milliseconds or undefined to wait for a render event
  "renderDelay": 1000,

  // HTTP Headers that are used for requests
  "httpHeaders": {
    // e.g.
    "Authorization": "Bearer ACEFAD8C-4B4D-4042-AB30-6C735F5BAC8B"
  },

  // To run Node application as Windows service
  "childProcessOptions": {
    "detached": true
  },

  // HTTP Cookies that are used for requests
  "httpCookies": [
    // e.g.
    {
      "name": "Valid-Cookie-Name", // required
      "value": "Valid-Cookie-Value", // required
      "domain": "localhost",
      "path": "/foo", // required
      "httponly": true,
      "secure": false,
      "expires": (new Date()).getTime() + (1000 * 60 * 60) // e.g. expires in 1 hour
    }
  ]

}

The full options object gets converted to JSON and is passed to the phantomjs script as the third argument.
There are more options concerning the paperSize, header, and footer inside the phantomjs script.
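
Because the options are converted to JSON, only JSON-serializable values reach the phantomjs script. The sketch below illustrates that consequence with a concrete options object built from keys listed above. `asSeenByPhantom` is a hypothetical helper that approximates the conversion with a JSON round trip, and `onDone` is a made-up key that exists only to show what gets dropped.

```javascript
// A concrete options object using keys from the reference above.
const options = {
  format: 'A4',
  orientation: 'portrait',
  border: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
  footer: {
    height: '15mm',
    contents: { default: '<span>{{page}}</span>/<span>{{pages}}</span>' }
  },
  timeout: 30000,
  // Hypothetical key, not an html-pdf option: functions do not survive JSON.
  onDone: function () {}
};

// Approximates what the script receives after the JSON conversion.
function asSeenByPhantom(opts) {
  return JSON.parse(JSON.stringify(opts));
}

const received = asSeenByPhantom(options);
console.log(Object.keys(received));
```

Computed values such as the `expires` timestamp in the cookie example are evaluated before serialization, so they arrive as plain numbers.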