cheerio, domino, jsdom, and parse5 are essential tools for handling HTML and DOM structures within Node.js environments, each serving distinct architectural needs. cheerio provides a jQuery-like API for fast HTML manipulation without a full DOM, making it ideal for scraping. jsdom implements the full web DOM standards, allowing for browser-like behavior including script execution, which is crucial for testing. parse5 is a low-level, high-performance HTML parser and serializer that works with ASTs rather than a mutable DOM. domino offers a fast, server-side DOM implementation focused on speed for rendering tasks, though with less active maintenance than jsdom.
When working on the server side, handling HTML isn't as straightforward as in the browser. You don't have a built-in document object. Instead, you rely on libraries like cheerio, domino, jsdom, and parse5. Each solves the problem differently, trading off speed, standards compliance, and API familiarity. Let's break down how they handle common engineering tasks.
jsdom builds a full, living DOM tree that mimics the browser.
It provides window, document, and even event listeners.
// jsdom: Full DOM environment
const { JSDOM } = require("jsdom");
const dom = new JSDOM(`<!DOCTYPE html><p>Hello</p>`);
console.log(dom.window.document.querySelector("p").textContent); // "Hello"
cheerio loads HTML into a custom data structure manipulated via a jQuery-like API.
There is no window object and no layout engine.
// cheerio: jQuery-like API
const cheerio = require("cheerio");
const $ = cheerio.load(`<p>Hello</p>`);
console.log($("p").text()); // "Hello"
domino implements a subset of the DOM optimized for server-side speed.
It provides a document object but skips browser-specific features like layout. It is faster than jsdom but less compliant with modern standards.
// domino: Lightweight DOM
const domino = require("domino");
const doc = domino.createDocument(`<p>Hello</p>`);
console.log(doc.querySelector("p").textContent); // "Hello"
parse5 parses HTML into an Abstract Syntax Tree (AST).
It offers no DOM query methods (such as querySelector).
// parse5: AST Parsing
const parse5 = require("parse5");
const ast = parse5.parse(`<p>Hello</p>`);
// You must traverse ast.childNodes manually to find the <p> tag
Speed often dictates choice in server environments where concurrency matters.
parse5 is generally the fastest for pure parsing.
// parse5: Fast serialization
const html = parse5.serialize(ast);
cheerio balances speed with developer convenience.
It is faster than jsdom because it skips DOM compliance checks.
// cheerio: Fast selection
const items = $(".item").map((i, el) => $(el).text()).get();
domino is optimized for SSR scenarios.
It is faster than jsdom for rendering HTML strings.
// domino: SSR manipulation
const div = doc.createElement("div");
div.innerHTML = "<span>Content</span>";
jsdom is the heaviest option.
// jsdom: Heavy initialization
const dom = new JSDOM(`...`, { runScripts: "dangerously" });
How easily can your team work with the library?
cheerio wins on familiarity for frontend devs.
Anyone who knows jQuery can pick up cheerio.
// cheerio: Chainable
$("ul").find("li").addClass("active").text("New Text");
jsdom feels like coding in the browser.
It uses standard Web APIs (getElementById, addEventListener).
// jsdom: Standard DOM
const el = dom.window.document.getElementById("app");
el.addEventListener("click", () => {});
domino uses standard DOM APIs but with limitations.
It is similar to jsdom, but some newer APIs might be missing. It gives you a document object without the jsdom weight.
// domino: Standard DOM subset
const el = doc.getElementById("app");
parse5 requires low-level AST manipulation.
// parse5: Manual traversal
// Depth-first search; returns the first node with the given tag name, or null.
function findNode(node, tagName) {
  if (node.tagName === tagName) return node;
  if (!node.childNodes) return null;
  for (const child of node.childNodes) {
    const found = findNode(child, tagName);
    if (found) return found;
  }
  return null;
}
Do you need to run scripts inside the HTML?
jsdom supports script execution.
It can run <script> tags within the loaded HTML.
// jsdom: Run scripts
const dom = new JSDOM(`...`, { runScripts: "dangerously" });
// Scripts in HTML will execute
cheerio does not execute scripts.
It treats <script> tags as plain text or ignores them.
// cheerio: No script execution
// <script>alert('xss')</script> is just text
domino does not execute scripts.
// domino: No script execution
// Safe for SSR pipelines
parse5 does not execute scripts.
// parse5: Static parse
// No runtime environment exists
For unit tests, accuracy matters more than raw speed.
jsdom is the industry standard for testing.
// jsdom: Testing
expect(dom.window.document.querySelector(".btn")).toBeTruthy();
cheerio is used for logic testing.
// cheerio: Output testing
expect($(".error").length).toBe(1);
domino is rarely used for testing now.
jsdom is generally preferred for new test suites.
// domino: Legacy testing
// Mostly replaced by jsdom in modern stacks
parse5 is used for compiler testing.
// parse5: Structure validation
// Check AST for specific node types
Despite their differences, these libraries share common goals.
// All handle broken tags
const html = "<div><p>Unclosed";
// jsdom, cheerio, domino, parse5 will all parse this without crashing
// jsdom
const out1 = dom.serialize();
// cheerio
const out2 = $.html();
// domino
const out3 = doc.documentElement.outerHTML;
// parse5
const out4 = parse5.serialize(ast);
None of them execute scripts by default (jsdom does so only with config).
// All: Sanitization needed for XSS
// You must sanitize inputs before injecting into DOM/AST
| Feature | cheerio | domino | jsdom | parse5 |
|---|---|---|---|---|
| API Style | jQuery-like | Standard DOM | Standard DOM | AST Methods |
| Script Execution | ❌ No | ❌ No | ✅ Yes (configurable) | ❌ No |
| Speed | ⚡ Very Fast | ⚡ Fast | 🐢 Slower | 🚀 Fastest |
| Primary Use | Scraping | SSR (Legacy) | Testing | Tooling/Build |
| Memory Usage | Low | Low | High | Very Low |
cheerio is your go-to for scraping 🕷️. It's fast, familiar, and gets the job done for data extraction without the bloat of a browser simulation.
jsdom is your browser-in-a-box 📦. Use it for testing frontend code or when you need to validate how HTML behaves with scripts and events.
parse5 is the engine under the hood ⚙️. Use it when you are building tools that need to process HTML at scale and don't need a mutable DOM.
domino is a specialized tool for SSR 🖥️. While fast, ensure it meets your standards compliance needs before adopting it for new architectures, as jsdom or custom solutions often replace it in modern stacks.
Final Thought: Don't reach for jsdom unless you need browser fidelity. For most server-side HTML tasks, cheerio or parse5 will save you resources and complexity.
Choose cheerio when you need to scrape or manipulate static HTML quickly without the overhead of a full browser environment. It is the best fit for data extraction tasks where you want a familiar jQuery-style API and do not need to execute JavaScript or handle complex DOM events.
Choose domino primarily for legacy server-side rendering (SSR) pipelines where a lightweight DOM is required for structural manipulation without script execution. However, verify its maintenance status for new projects, as it is often outpaced by jsdom in terms of standards compliance and community support.
Choose jsdom when you need a near-complete browser environment in Node.js, such as for unit testing frontend components that interact with the DOM or when you need to simulate browser behavior. It is the standard choice for integration tests requiring window, document, and navigator objects.
Choose parse5 when performance is the absolute priority and you only need to parse or serialize HTML without manipulating a live DOM tree. It is ideal for build tools, linters, or transformers that need to process HTML streams efficiently without the memory cost of a full DOM implementation.
import * as cheerio from 'cheerio';
const $ = cheerio.load('<h2 class="title">Hello world</h2>');
$('h2.title').text('Hello there!');
$('h2').addClass('welcome');
$.html();
//=> <html><head></head><body><h2 class="title welcome">Hello there!</h2></body></html>
Install Cheerio using a package manager like npm, yarn, or bun.
npm install cheerio
# or
bun add cheerio
❤ Proven syntax: Cheerio implements a subset of core jQuery. Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery library, revealing its truly gorgeous API.
ϟ Blazingly fast: Cheerio works with a very simple, consistent DOM model. As a result, parsing, manipulating, and rendering are incredibly efficient.
❁ Incredibly flexible: Cheerio wraps around parse5 for parsing HTML and can optionally use the forgiving htmlparser2. Cheerio can parse nearly any HTML or XML document. Cheerio works in both browser and server environments.
First you need to load in the HTML. This step in jQuery is implicit, since jQuery operates on the one, baked-in DOM. With Cheerio, we need to pass in the HTML document.
// ESM or TypeScript:
import * as cheerio from 'cheerio';
// In other environments:
const cheerio = require('cheerio');
const $ = cheerio.load('<ul id="fruits">...</ul>');
$.html();
//=> <html><head></head><body><ul id="fruits">...</ul></body></html>
Once you've loaded the HTML, you can use jQuery-style selectors to find elements within the document.
The selector signature is $( selector, [context], [root] ). selector searches within the context scope, which searches within the root scope. selector and context can be a string expression, a DOM element, an array of DOM elements, or a cheerio object. root, if provided, is typically the HTML document string.
This selector method is the starting point for traversing and manipulating the document. Like in jQuery, it's the primary method for selecting elements in the document.
$('.apple', '#fruits').text();
//=> Apple
$('ul .pear').attr('class');
//=> pear
$('li[class=orange]').html();
//=> Orange
When you're ready to render the document, you can call the html method on the
"root" selection:
$.root().html();
//=> <html>
// <head></head>
// <body>
// <ul id="fruits">
// <li class="apple">Apple</li>
// <li class="orange">Orange</li>
// <li class="pear">Pear</li>
// </ul>
// </body>
// </html>
If you want to render the outerHTML of a selection, you can use the outerHTML prop:
$('.pear').prop('outerHTML');
//=> <li class="pear">Pear</li>
You may also render the text content of a Cheerio object using the text
method:
const $ = cheerio.load('This is <em>content</em>.');
$('body').text();
//=> This is content.
Cheerio collections are made up of objects that bear some resemblance to browser-based DOM nodes. You can expect them to define the following properties:
tagName, parentNode, previousSibling, nextSibling, nodeValue, firstChild, childNodes, lastChild
This video tutorial is a follow-up to Nettut's "How to Scrape Web Pages with Node.js and jQuery", using cheerio instead of JSDOM + jQuery. This video shows how easy it is to use cheerio and how much faster cheerio is than JSDOM + jQuery.
Are you using cheerio in production? Add it to the wiki!
Does your company use Cheerio in production? Please consider sponsoring this project! Your help will allow maintainers to dedicate more time and resources to its development and support.
Become a backer to show your support for Cheerio and help us maintain and improve this open source project.
MIT