parse5 vs cheerio vs domino vs jsdom
Server-Side HTML Parsing and DOM Manipulation in Node.js

cheerio, domino, jsdom, and parse5 are essential tools for handling HTML and DOM structures within Node.js environments, each serving distinct architectural needs. cheerio provides a jQuery-like API for fast HTML manipulation without a full DOM, making it ideal for scraping. jsdom implements the full web DOM standards, allowing for browser-like behavior including script execution, which is crucial for testing. parse5 is a low-level, high-performance HTML parser and serializer that works with ASTs rather than a mutable DOM. domino offers a fast, server-side DOM implementation focused on speed for rendering tasks, though with less active maintenance than jsdom.

Stat Detail

Package  Downloads   Stars   Size     Issues  Publish       License
parse5   46,161,398  3,884   337 kB   35      8 months ago  MIT
cheerio  0           30,227  1.01 MB  39      2 months ago  MIT
domino   0           788     -        37      6 years ago   BSD-2-Clause
jsdom    0           21,526  6.93 MB  415     4 days ago    MIT

Server-Side HTML Parsing and DOM Manipulation in Node.js

When working on the server side, handling HTML isn't as straightforward as in the browser. You don't have a built-in document object. Instead, you rely on libraries like cheerio, domino, jsdom, and parse5. Each solves the problem differently, trading off speed, standards compliance, and API familiarity. Let's break down how they handle common engineering tasks.

🏗️ Core Architecture: DOM vs. AST vs. jQuery-like

jsdom builds a full, living DOM tree that mimics the browser.

  • It closely implements the WHATWG DOM and HTML standards.
  • You get window, document, and even event listeners.
  • Heavy on memory but accurate.
// jsdom: Full DOM environment
const { JSDOM } = require("jsdom");
const dom = new JSDOM(`<!DOCTYPE html><p>Hello</p>`);
console.log(dom.window.document.querySelector("p").textContent); // "Hello"

cheerio loads HTML into a custom data structure manipulated via a jQuery-like API.

  • It is not a real DOM; it's a specialized tree for traversal.
  • No window object, no layout engine.
  • Very fast for selection and modification.
// cheerio: jQuery-like API
const cheerio = require("cheerio");
const $ = cheerio.load(`<p>Hello</p>`);
console.log($("p").text()); // "Hello"

domino implements a subset of the DOM optimized for server-side speed.

  • It creates a document object but skips browser-specific features like layout.
  • Faster than jsdom but less compliant with modern standards.
// domino: Lightweight DOM
const domino = require("domino");
const doc = domino.createDocument(`<p>Hello</p>`);
console.log(doc.querySelector("p").textContent); // "Hello"

parse5 parses HTML into an Abstract Syntax Tree (AST).

  • It does not provide a DOM API (no querySelector).
  • You traverse the AST nodes manually.
  • Best for parsing/serializing, not interactive manipulation.
// parse5: AST Parsing
const parse5 = require("parse5");
const ast = parse5.parse(`<p>Hello</p>`);
// You must traverse ast.childNodes manually to find the <p> tag

⚡ Performance and Overhead

Speed often dictates choice in server environments where concurrency matters.

parse5 is generally the fastest for pure parsing.

  • Since it doesn't build a mutable DOM or attach methods to nodes, it has minimal overhead.
  • Use this for static analysis or conversion tasks.
// parse5: Fast serialization
const html = parse5.serialize(ast);

cheerio balances speed with developer convenience.

  • It is faster than jsdom because it builds a simple node tree and does not implement the full DOM, events, or script execution.
  • Ideal for high-volume scraping where you need to extract data quickly.
// cheerio: Fast selection
const items = $(".item").map((i, el) => $(el).text()).get();

domino is optimized for SSR scenarios.

  • It was designed to be faster than jsdom for rendering HTML strings.
  • However, it has a smaller ecosystem and less active maintenance than cheerio.
// domino: SSR manipulation
const div = doc.createElement("div");
div.innerHTML = "<span>Content</span>";

jsdom is the heaviest option.

  • It simulates browser quirks, which costs CPU and memory.
  • Only use it when you truly need browser fidelity.
// jsdom: Heavy initialization
const dom = new JSDOM(`...`, { runScripts: "dangerously" });
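
The speed ranking in this section is a generalization, so it is worth measuring against your own documents. Below is a minimal timing sketch, assuming a hypothetical timeParser helper (not part of any of these libraries). The stand-in parser only counts "<" characters so the snippet runs without dependencies; in practice you would pass in something like parse5.parse, cheerio.load, or (html) => new JSDOM(html).

```javascript
// Hypothetical helper: times `iterations` calls of any parse function.
function timeParser(label, parseFn, html, iterations = 100) {
  parseFn(html); // one untimed call so first-use setup cost is not measured
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    parseFn(html);
  }
  const elapsedNs = process.hrtime.bigint() - start;
  const totalMs = Number(elapsedNs) / 1e6;
  return { label, iterations, totalMs, avgMs: totalMs / iterations };
}

// Stand-in "parser" (an assumption for illustration): counts "<" characters.
const countTags = (html) => html.split("<").length - 1;

// Synthetic input: a list with 1000 items.
const sampleHtml = "<ul>" + '<li class="item">x</li>'.repeat(1000) + "</ul>";

const result = timeParser("stand-in", countTags, sampleHtml, 50);
console.log(`${result.label}: ${result.avgMs.toFixed(4)} ms per call`);
```

Timings from a loop like this vary with document size and Node.js version, so treat them as a rough comparison rather than a benchmark.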

🛠️ API Familiarity and Developer Experience

How easily can your team work with the library?

cheerio wins on familiarity for frontend devs.

  • If you know jQuery, you know cheerio.
  • Chainable methods make code concise.
// cheerio: Chainable
$("ul").find("li").addClass("active").text("New Text");

jsdom feels like coding in the browser.

  • Standard DOM APIs (getElementById, addEventListener).
  • Great for testing frameworks like Jest or Mocha.
// jsdom: Standard DOM
const el = dom.window.document.getElementById("app");
el.addEventListener("click", () => {});

domino uses standard DOM APIs but with limitations.

  • Similar to jsdom but some newer APIs might be missing.
  • Good for teams needing a document object without the jsdom weight.
// domino: Standard DOM subset
const el = doc.getElementById("app");

parse5 requires low-level AST manipulation.

  • Steeper learning curve; no helper methods for selection.
  • Best for tooling authors, not general app developers.
// parse5: Manual traversal
function findNode(node, tagName) {
  // Elements carry tagName; text and comment nodes do not
  if (node.tagName === tagName) return node;
  if (!node.childNodes) return null;
  for (const child of node.childNodes) {
    const found = findNode(child, tagName);
    if (found) return found;
  }
  return null; // no match anywhere in this subtree
}
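
To make the traversal concrete, here is a sketch that collects every matching element and reads its text. The tree is hand-built to mirror the node shape of parse5's default tree adapter (nodeName, tagName, attrs, childNodes, and value on #text nodes), so it runs without the library. findAll and textOf are hypothetical helper names, not parse5 APIs.

```javascript
// Hand-built stand-in for roughly what parse5 would produce for
// "<ul><li>One</li><li>Two</li></ul>" (html/head/body wrappers omitted).
const tree = {
  nodeName: "#document",
  childNodes: [
    {
      nodeName: "ul",
      tagName: "ul",
      attrs: [],
      childNodes: [
        { nodeName: "li", tagName: "li", attrs: [], childNodes: [{ nodeName: "#text", value: "One" }] },
        { nodeName: "li", tagName: "li", attrs: [], childNodes: [{ nodeName: "#text", value: "Two" }] },
      ],
    },
  ],
};

// Hypothetical helper: depth-first search returning every element with the given tag.
function findAll(node, tagName, results = []) {
  if (node.tagName === tagName) results.push(node);
  for (const child of node.childNodes || []) {
    findAll(child, tagName, results);
  }
  return results;
}

// Hypothetical helper: concatenates all text nodes under a node.
function textOf(node) {
  if (node.nodeName === "#text") return node.value;
  return (node.childNodes || []).map(textOf).join("");
}

const labels = findAll(tree, "li").map(textOf);
console.log(labels); // [ 'One', 'Two' ]
```

This is the kind of boilerplate that cheerio's selectors or a DOM's querySelectorAll would otherwise handle for you.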

🌐 JavaScript Execution and Browser Features

Do you need to run scripts inside the HTML?

jsdom supports script execution.

  • You can run <script> tags within the loaded HTML.
  • Essential for testing apps that rely on client-side JS initialization.
// jsdom: Run scripts
const dom = new JSDOM(`...`, { runScripts: "dangerously" });
// Scripts in HTML will execute

cheerio does not execute scripts.

  • It parses <script> elements like any other node, but their contents are never executed.
  • Safer for scraping untrusted content.
// cheerio: No script execution
// <script>alert('xss')</script> is parsed as inert markup and never runs

domino does not execute scripts.

  • Focused purely on DOM structure for rendering.
  • Prevents side effects during server rendering.
// domino: No script execution
// Safe for SSR pipelines

parse5 does not execute scripts.

  • It is a parser, not an environment.
  • You get the raw structure, nothing runs.
// parse5: Static parse
// No runtime environment exists

🧪 Testing and Standards Compliance

For unit tests, accuracy matters more than raw speed.

jsdom is the industry standard for testing.

  • Available as a Jest test environment (opt-in via testEnvironment: "jsdom") and supported by many other test runners.
  • Closely tracks browser behavior.
// jsdom: Testing
expect(dom.window.document.querySelector(".btn")).toBeTruthy();

cheerio is used for logic testing.

  • Good for testing scraping logic or HTML generation.
  • Not suitable for testing browser interactions.
// cheerio: Output testing
expect($(".error").length).toBe(1);

domino is rarely used for testing now.

  • Historically used in Angular Universal's server-side rendering.
  • jsdom is generally preferred for new test suites.
// domino: Legacy testing
// Mostly replaced by jsdom in modern stacks

parse5 is used for compiler testing.

  • Validates HTML structure compliance.
  • Used in HTML tooling such as linters and build plugins.
// parse5: Structure validation
// Check AST for specific node types

🀝 Similarities: Shared Ground

Despite their differences, these libraries share common goals.

1. 📄 HTML Input Handling

  • All accept HTML strings as input.
  • Handle malformed HTML gracefully (to varying degrees).
// All handle broken tags
const html = "<div><p>Unclosed";
// jsdom, cheerio, domino, parse5 will all parse this without crashing

2. 📤 Serialization

  • All can turn their internal representation back into an HTML string.
  • Useful for modifying HTML and saving it.
// jsdom
const out1 = dom.serialize();
// cheerio
const out2 = $.html();
// domino
const out3 = doc.documentElement.outerHTML;
// parse5
const out4 = parse5.serialize(ast);

3. 🔒 Security Considerations

  • None execute scripts by default; jsdom does so only when runScripts is enabled.
  • Important when processing user-generated content.
// All: Sanitization needed for XSS
// You must sanitize inputs before injecting into DOM/AST
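
One common mitigation is to escape user-supplied text before interpolating it into an HTML string that any of these libraries will parse. Below is a minimal sketch, assuming a hypothetical escapeHtml helper. It covers text and quoted-attribute contexts only, and it is not a substitute for a dedicated sanitizer when some markup must be allowed through.

```javascript
// Characters that can change HTML structure, mapped to their entity forms.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: makes arbitrary text safe to embed as element
// content or inside a quoted attribute value.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const userInput = "<script>alert('xss')</script>";
const safeHtml = `<p>${escapeHtml(userInput)}</p>`;
console.log(safeHtml);
// <p>&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;</p>
```

Escaping at the point of interpolation keeps the parsed tree free of unintended elements, whichever parser consumes the string afterwards.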

📊 Summary: Key Differences

Feature           cheerio       domino        jsdom                  parse5
API Style         jQuery-like   Standard DOM  Standard DOM           AST Methods
Script Execution  ❌ No         ❌ No         ✅ Yes (configurable)  ❌ No
Speed             ⚡ Very Fast  ⚡ Fast       🐢 Slower              🚀 Fastest
Primary Use       Scraping      SSR (Legacy)  Testing                Tooling/Build
Memory Usage      Low           Low           High                   Very Low

💡 The Big Picture

cheerio is your go-to for scraping 🕷️. It's fast, familiar, and gets the job done for data extraction without the bloat of a browser simulation.

jsdom is your browser-in-a-box 📦. Use it for testing frontend code or when you need to validate how HTML behaves with scripts and events.

parse5 is the engine under the hood ⚙️. Use it when you are building tools that need to process HTML at scale and don't need a mutable DOM.

domino is a specialized tool for SSR 🖥️. While fast, ensure it meets your standards compliance needs before adopting it for new architectures, as jsdom or custom solutions often replace it in modern stacks.

Final Thought: Don't reach for jsdom unless you need browser fidelity. For most server-side HTML tasks, cheerio or parse5 will save you resources and complexity.

How to Choose: parse5 vs cheerio vs domino vs jsdom

  • parse5:

    Choose parse5 when performance is the absolute priority and you only need to parse or serialize HTML without manipulating a live DOM tree. It is ideal for build tools, linters, or transformers that need to process HTML streams efficiently without the memory cost of a full DOM implementation.

  • cheerio:

    Choose cheerio when you need to scrape or manipulate static HTML quickly without the overhead of a full browser environment. It is the best fit for data extraction tasks where you want a familiar jQuery-style API and do not need to execute JavaScript or handle complex DOM events.

  • domino:

    Choose domino primarily for legacy server-side rendering (SSR) pipelines where a lightweight DOM is required for structural manipulation without script execution. However, verify its maintenance status for new projects, as it is often outpaced by jsdom in terms of standards compliance and community support.

  • jsdom:

    Choose jsdom when you need a near-complete browser environment in Node.js, such as for unit testing frontend components that interact with the DOM or when you need to simulate browser behavior. It is the standard choice for integration tests requiring window, document, and navigator objects.
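
The guidance above can be condensed into a small decision function. This is only a sketch: suggestLibrary and its flag names are hypothetical and simply restate the rules of thumb in this section, not an official recommendation from any of the projects.

```javascript
// Hypothetical helper encoding the rules of thumb above; flag names are illustrative.
function suggestLibrary({
  runsScripts = false,      // must <script> content actually execute?
  needsStandardDom = false, // do you need document/window-style DOM APIs?
  legacySsr = false,        // is this an existing lightweight SSR pipeline?
  jqueryStyle = false,      // do you want $(...) style selection and chaining?
} = {}) {
  if (runsScripts) return "jsdom"; // the only option here that can run scripts
  if (needsStandardDom) return legacySsr ? "domino" : "jsdom";
  if (jqueryStyle) return "cheerio"; // scraping and static manipulation
  return "parse5"; // parse and serialize only
}

console.log(suggestLibrary({ jqueryStyle: true })); // cheerio
console.log(suggestLibrary({ runsScripts: true })); // jsdom
console.log(suggestLibrary()); // parse5
```

Real projects usually weigh more factors (maintenance status, bundle size, team familiarity), so treat the function as a summary of the trade-offs rather than a rule.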

README for parse5

parse5

HTML parser and serializer.

npm install --save parse5
