cheerio vs domino vs jsdom vs parse5
Server-Side HTML Parsing and DOM Manipulation in Node.js

cheerio, domino, jsdom, and parse5 are essential tools for handling HTML and DOM structures within Node.js environments, each serving distinct architectural needs. cheerio provides a jQuery-like API for fast HTML manipulation without a full DOM, making it ideal for scraping. jsdom implements the full web DOM standards, allowing for browser-like behavior including script execution, which is crucial for testing. parse5 is a low-level, high-performance HTML parser and serializer that works with ASTs rather than a mutable DOM. domino offers a fast, server-side DOM implementation focused on speed for rendering tasks, though with less active maintenance than jsdom.

Stat Detail

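| Package | Downloads | Stars | Size | Issues | Publish | License |
| --- | --- | --- | --- | --- | --- | --- |
| cheerio | 16,306,305 | 30,182 | 1.01 MB | 34 | 2 months ago | MIT |
| domino | 857,326 | 788 | - | 37 | 6 years ago | BSD-2-Clause |
| jsdom | 0 | 21,524 | 3.4 MB | 422 | a month ago | MIT |
| parse5 | 0 | 3,882 | 337 kB | 35 | 8 months ago | MIT |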

Server-Side HTML Parsing and DOM Manipulation in Node.js

When working on the server side, handling HTML isn't as straightforward as in the browser. You don't have a built-in document object. Instead, you rely on libraries like cheerio, domino, jsdom, and parse5. Each solves the problem differently, trading off speed, standards compliance, and API familiarity. Let's break down how they handle common engineering tasks.

🏗️ Core Architecture: DOM vs. AST vs. jQuery-like

jsdom builds a full, living DOM tree that mimics the browser.

  • It closely implements the WHATWG DOM and HTML standards.
  • You get window, document, and even event listeners.
  • Heavy on memory but accurate.
// jsdom: Full DOM environment
const { JSDOM } = require("jsdom");
const dom = new JSDOM(`<!DOCTYPE html><p>Hello</p>`);
console.log(dom.window.document.querySelector("p").textContent); // "Hello"

cheerio loads HTML into a custom data structure manipulated via a jQuery-like API.

  • It is not a real DOM; it's a specialized tree for traversal.
  • No window object, no layout engine.
  • Very fast for selection and modification.
// cheerio: jQuery-like API
const cheerio = require("cheerio");
const $ = cheerio.load(`<p>Hello</p>`);
console.log($("p").text()); // "Hello"

domino implements a subset of the DOM optimized for server-side speed.

  • It creates a document object but skips browser-specific features like layout.
  • Faster than jsdom but less compliant with modern standards.
// domino: Lightweight DOM
const domino = require("domino");
const doc = domino.createDocument(`<p>Hello</p>`);
console.log(doc.querySelector("p").textContent); // "Hello"

parse5 parses HTML into an Abstract Syntax Tree (AST).

  • It does not provide a DOM API (no querySelector).
  • You traverse the AST nodes manually.
  • Best for parsing/serializing, not interactive manipulation.
// parse5: AST Parsing
const parse5 = require("parse5");
const ast = parse5.parse(`<p>Hello</p>`);
// You must traverse ast.childNodes manually to find the <p> tag
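
To make the manual traversal concrete, here is a minimal sketch of a depth-first walk that gathers text. The `tree` object is hand-written for illustration and only assumes the node shape used by parse5's default tree adapter (`nodeName`, `tagName`, `childNodes`, and `value` on text nodes); in real code it would come from `parse5.parse(html)`. The `collectText` helper is a hypothetical name, not part of parse5.

```javascript
// Hand-written tree mimicking the node shape of parse5's default tree adapter.
// In real code this object would be the return value of parse5.parse(html).
const tree = {
  nodeName: "#document",
  childNodes: [
    {
      nodeName: "p",
      tagName: "p",
      attrs: [],
      childNodes: [
        { nodeName: "#text", value: "Hello " },
        {
          nodeName: "em",
          tagName: "em",
          attrs: [],
          childNodes: [{ nodeName: "#text", value: "world" }],
        },
      ],
    },
  ],
};

// Hypothetical helper: depth-first walk that concatenates every text node.
function collectText(node) {
  if (node.nodeName === "#text") return node.value;
  return (node.childNodes || []).map(collectText).join("");
}

console.log(collectText(tree)); // "Hello world"
```

The same recursion pattern works for any query you would otherwise write as a selector; you just change the condition that decides which nodes to keep.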

⚡ Performance and Overhead

Speed often dictates choice in server environments where concurrency matters.

parse5 is generally the fastest for pure parsing.

  • Since it doesn't build a mutable DOM or attach methods to nodes, it has minimal overhead.
  • Use this for static analysis or conversion tasks.
// parse5: Fast serialization
const html = parse5.serialize(ast);

cheerio balances speed with developer convenience.

  • It is faster than jsdom because it builds a lightweight node tree instead of a spec-compliant DOM with a window.
  • Ideal for high-volume scraping where you need to extract data quickly.
// cheerio: Fast selection
const items = $(".item").map((i, el) => $(el).text()).get();

domino is optimized for SSR scenarios.

  • It was designed to be faster than jsdom for rendering HTML strings.
  • However, it has a smaller ecosystem and community than cheerio.
// domino: SSR manipulation
const div = doc.createElement("div");
div.innerHTML = "<span>Content</span>";

jsdom is the heaviest option.

  • It simulates browser quirks, which costs CPU and memory.
  • Only use it when you truly need browser fidelity.
// jsdom: Heavy initialization
const dom = new JSDOM(`...`, { runScripts: "dangerously" });

🛠️ API Familiarity and Developer Experience

How easily can your team work with the library?

cheerio wins on familiarity for frontend devs.

  • If you know jQuery, you know cheerio.
  • Chainable methods make code concise.
// cheerio: Chainable
$("ul").find("li").addClass("active").text("New Text");

jsdom feels like coding in the browser.

  • Standard DOM APIs (getElementById, addEventListener).
  • Great for testing frameworks like Jest or Mocha.
// jsdom: Standard DOM
const el = dom.window.document.getElementById("app");
el.addEventListener("click", () => {});

domino uses standard DOM APIs but with limitations.

  • Similar to jsdom but some newer APIs might be missing.
  • Good for teams needing a document object without the jsdom weight.
// domino: Standard DOM subset
const el = doc.getElementById("app");

parse5 requires low-level AST manipulation.

  • Steeper learning curve; no helper methods for selection.
  • Best for tooling authors, not general app developers.
// parse5: Manual traversal
function findNode(node, tagName) {
  if (node.tagName === tagName) return node;
  if (!node.childNodes) return null;
  for (const child of node.childNodes) {
    const found = findNode(child, tagName);
    if (found) return found;
  }
  return null; // no match in this subtree
}
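
A related wrinkle is attribute access: in the tree produced by parse5's default tree adapter, attributes are stored as an array of `{ name, value }` pairs rather than a map. The sketch below uses two hypothetical helpers, `findAll` and `getAttr`, on a hand-written tree of that shape; in real code the tree would come from `parse5.parse` or `parse5.parseFragment`.

```javascript
// Hypothetical helper: collect every element with the given tag name.
function findAll(node, tagName, out = []) {
  if (node.tagName === tagName) out.push(node);
  for (const child of node.childNodes || []) findAll(child, tagName, out);
  return out;
}

// Hypothetical helper: look up one attribute in the attrs array.
function getAttr(node, name) {
  const attr = (node.attrs || []).find((a) => a.name === name);
  return attr ? attr.value : null;
}

// Hand-written tree in the shape of parse5's default tree adapter output.
const tree = {
  nodeName: "#document-fragment",
  childNodes: [
    { nodeName: "a", tagName: "a", attrs: [{ name: "href", value: "/docs" }], childNodes: [] },
    {
      nodeName: "div",
      tagName: "div",
      attrs: [],
      childNodes: [
        { nodeName: "a", tagName: "a", attrs: [{ name: "href", value: "/blog" }], childNodes: [] },
      ],
    },
  ],
};

const hrefs = findAll(tree, "a").map((a) => getAttr(a, "href"));
console.log(hrefs); // [ '/docs', '/blog' ]
```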

🌐 JavaScript Execution and Browser Features

Do you need to run scripts inside the HTML?

jsdom supports script execution.

  • You can run <script> tags within the loaded HTML.
  • Essential for testing apps that rely on client-side JS initialization.
// jsdom: Run scripts
const dom = new JSDOM(`...`, { runScripts: "dangerously" });
// Scripts in HTML will execute

cheerio does not execute scripts.

  • It parses <script> tags like any other element; their contents are never executed.
  • Safer for scraping untrusted content.
// cheerio: No script execution
// <script>alert('xss')</script> is just text

domino does not execute scripts.

  • Focused purely on DOM structure for rendering.
  • Prevents side effects during server rendering.
// domino: No script execution
// Safe for SSR pipelines

parse5 does not execute scripts.

  • It is a parser, not an environment.
  • You get the raw structure, nothing runs.
// parse5: Static parse
// No runtime environment exists

🧪 Testing and Standards Compliance

For unit tests, accuracy matters more than raw speed.

jsdom is the industry standard for testing.

  • Available in Jest through the jest-environment-jsdom package (it was the default environment before Jest 27) and used by many other test runners.
  • Closely tracks browser behavior.
// jsdom: Testing
expect(dom.window.document.querySelector(".btn")).toBeTruthy();

cheerio is used for logic testing.

  • Good for testing scraping logic or HTML generation.
  • Not suitable for testing browser interactions.
// cheerio: Output testing
expect($(".error").length).toBe(1);

domino is rarely used for testing now.

  • Historically used for Angular Universal SSR tests.
  • jsdom is generally preferred for new test suites.
// domino: Legacy testing
// Mostly replaced by jsdom in modern stacks

parse5 is used for compiler testing.

  • Follows the WHATWG HTML parsing algorithm, so the tree matches what a browser would build.
  • Used by build plugins and by higher-level libraries; cheerio and jsdom both parse with it.
// parse5: Structure validation
// Check AST for specific node types

🀝 Similarities: Shared Ground

Despite their differences, these libraries share common goals.

1. 📄 HTML Input Handling

  • All accept HTML strings as input.
  • Handle malformed HTML gracefully (to varying degrees).
// All handle broken tags
const html = "<div><p>Unclosed";
// jsdom, cheerio, domino, parse5 will all parse this without crashing

2. 📤 Serialization

  • All can turn their internal representation back into an HTML string.
  • Useful for modifying HTML and saving it.
// jsdom
const out1 = dom.serialize();
// cheerio
const out2 = $.html();
// domino
const out3 = doc.documentElement.outerHTML;
// parse5
const out4 = parse5.serialize(ast);

3. 🔒 Security Considerations

  • None execute scripts by default; jsdom does so only when the runScripts option is set.
  • Important when processing user-generated content.
// All: Sanitization needed for XSS
// You must sanitize inputs before injecting into DOM/AST
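
One common baseline, when user input should appear as text rather than markup, is to escape it before building the HTML string. The helper below is a hypothetical sketch, not part of any of the four libraries. It is not a full sanitizer: if you need to allow some tags through, a dedicated sanitizing library is the safer route.

```javascript
// Hypothetical helper: escape the characters that are significant in
// HTML text content and quoted attribute values.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Untrusted input, for example a user comment.
const comment = `<script>alert('xss')</script>`;

const safe = `<p>${escapeHtml(comment)}</p>`;
console.log(safe);
// <p>&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;</p>
```

When inserting through an API instead of a string, prefer the text-setting calls (`.text()` in cheerio, `textContent` in jsdom or domino), which treat the value as text rather than parsing it as HTML.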

📊 Summary: Key Differences

| Feature | cheerio | domino | jsdom | parse5 |
| --- | --- | --- | --- | --- |
| API Style | jQuery-like | Standard DOM | Standard DOM | AST Methods |
| Script Execution | ❌ No | ❌ No | ✅ Yes (configurable) | ❌ No |
| Speed | ⚡ Very Fast | ⚡ Fast | 🐢 Slower | 🚀 Fastest |
| Primary Use | Scraping | SSR (Legacy) | Testing | Tooling/Build |
| Memory Usage | Low | Low | High | Very Low |

💡 The Big Picture

cheerio is your go-to for scraping 🕷️. It's fast, familiar, and gets the job done for data extraction without the bloat of a browser simulation.

jsdom is your browser-in-a-box 📦. Use it for testing frontend code or when you need to validate how HTML behaves with scripts and events.

parse5 is the engine under the hood ⚙️. Use it when you are building tools that need to process HTML at scale and don't need a mutable DOM.

domino is a specialized tool for SSR 🖥️. While fast, ensure it meets your standards compliance needs before adopting it for new architectures, as jsdom or custom solutions often replace it in modern stacks.

Final Thought: Don't reach for jsdom unless you need browser fidelity. For most server-side HTML tasks, cheerio or parse5 will save you resources and complexity.

How to Choose: cheerio vs domino vs jsdom vs parse5

  • cheerio:

    Choose cheerio when you need to scrape or manipulate static HTML quickly without the overhead of a full browser environment. It is the best fit for data extraction tasks where you want a familiar jQuery-style API and do not need to execute JavaScript or handle complex DOM events.

  • domino:

    Choose domino primarily for legacy server-side rendering (SSR) pipelines where a lightweight DOM is required for structural manipulation without script execution. However, verify its maintenance status for new projects, as it is often outpaced by jsdom in terms of standards compliance and community support.

  • jsdom:

    Choose jsdom when you need a near-complete browser environment in Node.js, such as for unit testing frontend components that interact with the DOM or when you need to simulate browser behavior. It is the standard choice for integration tests requiring window, document, and navigator objects.

  • parse5:

    Choose parse5 when performance is the absolute priority and you only need to parse or serialize HTML without manipulating a live DOM tree. It is ideal for build tools, linters, or transformers that need to process HTML streams efficiently without the memory cost of a full DOM implementation.

README for cheerio

cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

中文文档 (Chinese Readme)

import * as cheerio from 'cheerio';
const $ = cheerio.load('<h2 class="title">Hello world</h2>');

$('h2.title').text('Hello there!');
$('h2').addClass('welcome');

$.html();
//=> <html><head></head><body><h2 class="title welcome">Hello there!</h2></body></html>

Installation

Install Cheerio using a package manager like npm, yarn, or bun.

npm install cheerio
# or
bun add cheerio

Features

❀ Proven syntax: Cheerio implements a subset of core jQuery. Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery library, revealing its truly gorgeous API.

ϟ Blazingly fast: Cheerio works with a very simple, consistent DOM model. As a result parsing, manipulating, and rendering are incredibly efficient.

❁ Incredibly flexible: Cheerio wraps around parse5 for parsing HTML and can optionally use the forgiving htmlparser2. Cheerio can parse nearly any HTML or XML document. Cheerio works in both browser and server environments.

API

Loading

First you need to load in the HTML. This step in jQuery is implicit, since jQuery operates on the one, baked-in DOM. With Cheerio, we need to pass in the HTML document.

// ESM or TypeScript:
import * as cheerio from 'cheerio';

// In other environments:
const cheerio = require('cheerio');

const $ = cheerio.load('<ul id="fruits">...</ul>');

$.html();
//=> <html><head></head><body><ul id="fruits">...</ul></body></html>

Selectors

Once you've loaded the HTML, you can use jQuery-style selectors to find elements within the document.

$( selector, [context], [root] )

selector searches within the context scope which searches within the root scope. selector and context can be a string expression, DOM Element, array of DOM elements, or cheerio object. root, if provided, is typically the HTML document string.

This selector method is the starting point for traversing and manipulating the document. Like in jQuery, it's the primary method for selecting elements in the document.

$('.apple', '#fruits').text();
//=> Apple

$('ul .pear').attr('class');
//=> pear

$('li[class=orange]').html();
//=> Orange

Rendering

When you're ready to render the document, you can call the html method on the "root" selection:

$.root().html();
//=>  <html>
//      <head></head>
//      <body>
//        <ul id="fruits">
//          <li class="apple">Apple</li>
//          <li class="orange">Orange</li>
//          <li class="pear">Pear</li>
//        </ul>
//      </body>
//    </html>

If you want to render the outerHTML of a selection, you can use the outerHTML prop:

$('.pear').prop('outerHTML');
//=> <li class="pear">Pear</li>

You may also render the text content of a Cheerio object using the text method:

const $ = cheerio.load('This is <em>content</em>.');
$('body').text();
//=> This is content.

The "DOM Node" object

Cheerio collections are made up of objects that bear some resemblance to browser-based DOM nodes. You can expect them to define the following properties:

  • tagName
  • parentNode
  • previousSibling
  • nextSibling
  • nodeValue
  • firstChild
  • childNodes
  • lastChild

Screencasts

https://vimeo.com/31950192

This video tutorial is a follow-up to Nettut's "How to Scrape Web Pages with Node.js and jQuery", using cheerio instead of JSDOM + jQuery. This video shows how easy it is to use cheerio and how much faster cheerio is than JSDOM + jQuery.

Cheerio in the real world

Are you using cheerio in production? Add it to the wiki!

Sponsors

Does your company use Cheerio in production? Please consider sponsoring this project! Your help will allow maintainers to dedicate more time and resources to its development and support.

Headlining Sponsors

Tidelift Github AirBnB HasData

Other Sponsors

OnlineCasinosSpelen Nieuwe-Casinos.net

Backers

Become a backer to show your support for Cheerio and help us maintain and improve this open source project.

Vasy Kafidoff

License

MIT