antlr4, jison, nearley, and pegjs are parser generator libraries that help developers define formal grammars and automatically generate parsers to process structured text (like domain-specific languages, configuration files, or custom query syntaxes). Each uses a different parsing strategy: ANTLR4 uses adaptive LL(*), known as ALL(*), backed by powerful tooling; Jison implements LALR(1), similar to Bison; Nearley uses the Earley algorithm, which can handle ambiguous grammars; and PEG.js implements Parsing Expression Grammars, which are unambiguous by design. These tools let frontend developers build robust language processors directly in JavaScript without hand-writing complex recursive descent parsers.
When you need to parse structured text in the browser or Node.js (a custom query language, a configuration format, or a DSL), writing a recursive descent parser by hand quickly becomes messy and error-prone. That's where parser generators shine. But not all parser generators are built the same. Let's break down how antlr4, jison, nearley, and pegjs differ in approach, trade-offs, and real-world suitability.
antlr4 uses adaptive LL(*) parsing, or ALL(*): a top-down method that can look ahead arbitrarily far to decide which rule to apply. This makes it powerful for complex grammars, especially those with lots of optional clauses or nested structures. ANTLR4 also rewrites directly left-recursive rules automatically, which is why the expr rule below is legal even though the parser is top-down, and alternatives listed earlier bind more tightly. ANTLR includes built-in support for error recovery, so your parser doesn't just crash on bad input.
```antlr
// ANTLR4 grammar snippet (in .g4 file)
expr: expr ('*' | '/') expr
    | expr ('+' | '-') expr
    | INT
    ;
```
jison is based on LALR(1) parsing — a bottom-up technique inherited from Unix tools like Yacc and Bison. It’s fast and deterministic but struggles with ambiguous grammars unless you manually resolve shift/reduce conflicts. You write grammar rules in a .jison file that closely resembles classic Yacc syntax.
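```
/* Jison grammar (lexer section omitted; assumes NUMBER, '+', and EOF tokens) */
/* %left resolves the shift/reduce conflict in e '+' e */
%left '+'
%%
expressions: e EOF { return $1; };
e: e '+' e { $$ = $1 + $3; }
 | NUMBER { $$ = Number(yytext); }
 ;
```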
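To make the shift/reduce decision concrete, here is a minimal sketch of an operator-precedence evaluator. It is not Jison's table-driven LALR(1) machinery; it only mimics the choice a precedence declaration settles: when an operator is already on the stack and another one arrives, either reduce first or shift. The names `PRECEDENCE`, `APPLY`, and `evaluate` are hypothetical, and the input is assumed to be a well-formed, pre-tokenized array.

```javascript
// Hypothetical precedence table; every operator is treated as left-associative.
const PRECEDENCE = { "+": 1, "-": 1, "*": 2, "/": 2 };
const APPLY = {
  "+": (a, b) => a + b,
  "-": (a, b) => a - b,
  "*": (a, b) => a * b,
  "/": (a, b) => a / b,
};

// Assumes tokens alternate number, operator, number, ... with no errors.
function evaluate(tokens) {
  const values = [];
  const operators = [];
  // "Reduce": pop one operator and its two operands, push the result.
  const reduce = () => {
    const op = operators.pop();
    const right = values.pop();
    const left = values.pop();
    values.push(APPLY[op](left, right));
  };
  for (const token of tokens) {
    if (typeof token === "number") {
      values.push(token);
      continue;
    }
    // Conflict point: reduce what is stacked, or shift the new operator?
    // Left associativity means: reduce while the stacked operator binds
    // at least as tightly as the incoming one.
    while (
      operators.length > 0 &&
      PRECEDENCE[operators[operators.length - 1]] >= PRECEDENCE[token]
    ) {
      reduce();
    }
    operators.push(token); // "shift"
  }
  while (operators.length > 0) reduce();
  return values[0];
}

console.log(evaluate([1, "+", 2, "*", 3])); // 7
console.log(evaluate([8, "-", 3, "-", 2])); // 3
```

Changing `>=` to `>` in the loop would make every operator right-associative, so `8 - 3 - 2` would evaluate to 7. That is the kind of silent change in meaning a wrong precedence declaration can cause.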
nearley uses Earley’s algorithm, which is a general-purpose chart parser capable of handling any context-free grammar, including left-recursive and ambiguous ones. If multiple valid parses exist, Nearley returns all of them — useful for natural language or fuzzy inputs.
```
# Nearley grammar (source .ne file, compiled with nearleyc)
# The builtin number.ne provides int, decimal, and related rules
@builtin "number.ne"
expression -> expression "+" expression {% ([a, _, b]) => a + b %}
            | int {% id %}
```
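To see what "returns all of them" means for a rule shaped like `expression -> expression "+" expression`, the sketch below enumerates every parse by brute force. It is not the Earley algorithm and not Nearley's API; `allParses` is a hypothetical helper, and it uses `-` instead of `+` only so that the two parses produce visibly different values.

```javascript
// Enumerate every parse of tokens[lo..hi) under the ambiguous rule
//   expr -> expr "-" expr | number
// Each result records the bracketing (tree) and the value it evaluates to.
function allParses(tokens, lo = 0, hi = tokens.length) {
  if (hi - lo === 1) {
    const token = tokens[lo];
    return typeof token === "number"
      ? [{ tree: String(token), value: token }]
      : [];
  }
  const results = [];
  // Try every "-" in the range as the top-level operator.
  for (let i = lo + 1; i < hi - 1; i++) {
    if (tokens[i] !== "-") continue;
    for (const left of allParses(tokens, lo, i)) {
      for (const right of allParses(tokens, i + 1, hi)) {
        results.push({
          tree: `(${left.tree} - ${right.tree})`,
          value: left.value - right.value,
        });
      }
    }
  }
  return results;
}

const parses = allParses([1, "-", 2, "-", 3]);
for (const p of parses) console.log(p.tree, "=", p.value);
// (1 - (2 - 3)) = 2
// ((1 - 2) - 3) = -4
```

With an ambiguous grammar, the caller has to pick among results like these or restructure the grammar so that only one bracketing is possible.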
pegjs implements Parsing Expression Grammars (PEGs), which are unambiguous by design. Unlike CFGs used by the others, PEGs use ordered choice: the first matching rule wins. This eliminates ambiguity but means grammar order matters critically.
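```pegjs
// PEG.js grammar
Expression
  = head:Term tail:(_ "+" _ Term)* {
      // PEG.js rejects left recursion, so repeat ("+" Term) and fold left
      return tail.reduce((sum, group) => sum + group[3], head);
    }

Term "term"
  = [0-9]+ { return parseInt(text(), 10); }

_ "whitespace"
  = [ \t]*
```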
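Ordered choice is easy to demonstrate without a generator. The following is a minimal sketch using two hand-written combinators, `literal` and `choice`. Both are hypothetical helpers for illustration, not part of the PEG.js API. It shows that the same alternatives give different results depending on their order.

```javascript
// A parser is a function (input, pos) -> { value, pos } on success, or null.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos)
    ? { value: text, pos: pos + text.length }
    : null;

// Ordered choice: try alternatives left to right, commit to the first success.
const choice = (...parsers) => (input, pos) => {
  for (const parser of parsers) {
    const result = parser(input, pos);
    if (result !== null) return result;
  }
  return null;
};

const shortFirst = choice(literal("in"), literal("instanceof"));
const longFirst = choice(literal("instanceof"), literal("in"));

const a = shortFirst("instanceof", 0);
const b = longFirst("instanceof", 0);
console.log(a.value); // "in" (the longer alternative is never tried)
console.log(b.value); // "instanceof"
```

This is why PEG grammars usually list longer or more specific alternatives first: a shorter alternative that is a prefix of a later one will shadow it.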
antlr4 requires you to write grammars in a separate .g4 file and run the Java-based ANTLR tool (directly, or through a wrapper such as antlr4ts) to generate JavaScript code. This adds build complexity but gives you access to ANTLR's excellent grammar IDE plugins (for VS Code, IntelliJ) that highlight errors, visualize parse trees, and test inputs interactively.
jison lets you define grammars inline in JavaScript or in .jison files. However, error messages are often cryptic (“unexpected token”) and there’s little tooling to help debug shift/reduce conflicts. You’ll spend time tweaking precedence rules manually.
nearley ships with a simple CLI (nearleyc) to compile .ne files. Its error reporting is basic, but the ability to get multiple parse results helps diagnose ambiguity. The syntax is clean and feels native to JavaScript developers.
pegjs offers the smoothest DX for small projects: you can require('pegjs').generate(grammarString) at runtime (great for prototyping), or precompile for production. Error messages include expected tokens, and the online playground lets you test grammars instantly.
antlr4 produces relatively large parsers due to its generalized LL(*) engine and listener/visitor infrastructure. However, it’s highly optimized for correctness and includes customizable error listeners and token streams that support rewinding — essential for IDEs or linters.
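jison generates compact, fast parsers because LALR(1) tables are efficient. By default, though, the generated parser throws on the first syntax error and returns no partial result; any recovery has to be written by hand with Yacc-style error rules. Not ideal for user-facing tools where typos are common.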
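The difference between failing hard and recovering can be sketched in a few lines. The example below uses panic-mode recovery, a general technique: on a malformed statement it records the error and skips to the next `;` instead of throwing. It is not how Jison or ANTLR implement recovery, and the `name=number;` format and the `parseAssignments` function are assumptions made up for this illustration.

```javascript
// Parses statements of the form "name=number;".
// Malformed statements are reported and skipped rather than aborting the parse.
function parseAssignments(source) {
  const assignments = {};
  const errors = [];
  let pos = 0;
  while (pos < source.length) {
    const end = source.indexOf(";", pos);
    const stop = end === -1 ? source.length : end;
    const statement = source.slice(pos, stop).trim();
    const match = /^([a-z]+)\s*=\s*([0-9]+)$/.exec(statement);
    if (match) {
      assignments[match[1]] = Number(match[2]);
    } else if (statement !== "") {
      errors.push({ offset: pos, text: statement });
    }
    pos = stop + 1; // resynchronize just past the ";"
  }
  return { assignments, errors };
}

const report = parseAssignments("a=1; b=; c=3;");
console.log(report.assignments); // { a: 1, c: 3 }
console.log(report.errors.length); // 1
```

A fail-fast parser would stop at `b=` and never report `c=3`. For editors and linters, the partial result plus the error list is usually the more useful output.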
nearley is generally slower than the others because Earley's algorithm is O(n³) in the worst case, although it does better on unambiguous grammars. For small inputs (under 1KB), it's perfectly usable. Its real strength is streaming: you can feed input chunk by chunk with parser.feed() as the user types and inspect parser.results after each chunk.
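The chunk-by-chunk pattern looks roughly like the sketch below. This is a hand-written stand-in, not Nearley's implementation: `StreamingSumParser` is a hypothetical class that only understands digits and `+`, but it has the same shape as a streaming parser in that it keeps state between `feed` calls and can say whether the input so far is complete.

```javascript
// Hypothetical stand-in for a streaming parser: accepts input in chunks and
// keeps enough state to resume exactly where the previous chunk stopped.
class StreamingSumParser {
  constructor() {
    this.total = 0; // sum of numbers that are already complete
    this.pending = ""; // digits of a number that may continue in the next chunk
  }

  feed(chunk) {
    for (const ch of chunk) {
      if (ch >= "0" && ch <= "9") {
        this.pending += ch;
      } else if (ch === "+") {
        if (this.pending === "") throw new Error('Unexpected "+"');
        this.total += Number(this.pending);
        this.pending = "";
      } else {
        throw new Error(`Unexpected character "${ch}"`);
      }
    }
    return this; // allow chained feed() calls
  }

  // undefined while the input ends in "+" (or is empty), else the sum so far.
  get result() {
    return this.pending === "" ? undefined : this.total + Number(this.pending);
  }
}

const parser = new StreamingSumParser();
parser.feed("1+2").feed("3+");
const incomplete = parser.result; // undefined: input ends with a dangling "+"
parser.feed("4");
const complete = parser.result; // 1 + 23 + 4
console.log(incomplete, complete); // undefined 28
```

Note that the chunk boundary falls in the middle of the number 23. A streaming parser must not treat the end of a chunk as the end of a token.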
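pegjs is fast and lightweight. Since PEGs are deterministic and greedy, parsing is linear in most practical cases. But beware: PEG.js does not support left recursion. A left-recursive rule would loop forever, so the generator rejects the grammar and you have to rewrite such rules using repetition. At the time of writing, the maintained fork peggy reports left recursion as an error too.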
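The usual workaround for left recursion is to replace the recursion with repetition and fold the results to the left. The sketch below does this by hand for a subtraction rule; `parseTerm` and `parseSum` are hypothetical functions written for this illustration, not code generated by PEG.js.

```javascript
// Left-recursive form (a top-down parser would call itself forever):
//   Sum <- Sum "-" Term / Term
// Equivalent iterative form:
//   Sum <- Term ("-" Term)*

function parseTerm(input, pos) {
  const match = /^[0-9]+/.exec(input.slice(pos));
  return match
    ? { value: parseInt(match[0], 10), pos: pos + match[0].length }
    : null;
}

function parseSum(input, pos = 0) {
  let left = parseTerm(input, pos);
  if (left === null) return null;
  // Repetition replaces the recursion; folding left keeps "-" left-associative.
  while (input[left.pos] === "-") {
    const right = parseTerm(input, left.pos + 1);
    if (right === null) break; // leave a dangling "-" unconsumed
    left = { value: left.value - right.value, pos: right.pos };
  }
  return left;
}

const sum = parseSum("10-4-3");
console.log(sum.value); // 3, i.e. (10 - 4) - 3
```

Folding from the left matters: a right-recursive rewrite such as `Sum <- Term "-" Sum / Term` also avoids the infinite loop, but it would compute 10 - (4 - 3) = 9.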
antlr4 has official JavaScript runtimes and supports lexer modes, semantic predicates, and custom actions. You can share the same .g4 grammar across Java, Python, C#, and JS — great for full-stack teams.
jison is essentially a one-person legacy project. While it works, it hasn’t seen major updates in years. Avoid it for new projects unless you’re maintaining an old Bison port.
nearley integrates cleanly with streaming data and works with separate lexers such as moo for fast tokenizing. Its community is small but active, and it's used in real projects such as math expression parsers.
pegjs is mature and stable, but it is no longer actively developed. Peggy is a community-maintained fork that continues the project with bug fixes and better error reporting. For new work, consider peggy instead, though pegjs still works fine for simple cases.
| Scenario | Best Choice |
|---|---|
| Large, complex grammar with team collaboration | antlr4 — for its tooling, error recovery, and cross-language support |
| Porting a C/C++ Yacc/Bison grammar to JS | jison — but only if you can’t refactor to a modern alternative |
| Live editing, ambiguous input, or incremental parsing | nearley — unmatched for handling uncertainty |
| Simple, deterministic format (e.g., config files, queries) | pegjs (or peggy) — clean syntax, fast runtime, easy embedding |
For most frontend teams building a custom DSL or config parser, start with pegjs (or migrate to peggy). It’s the easiest to embed, debug, and deploy.
If you’re building a code editor, REPL, or natural-language interface where users make mistakes, go with nearley — its tolerance for ambiguity is a superpower.
Only reach for antlr4 if you’re building something large-scale (like a full programming language) and need enterprise-grade tooling and maintainability.
Avoid jison in new projects — its era has passed, and better alternatives exist.
Remember: the right parser isn’t about raw power — it’s about matching the tool to your grammar’s personality and your team’s workflow.
Choose nearley when your input might be ambiguous or incomplete (e.g., live syntax checking, natural language fragments, or incremental parsing). Its Earley-based approach handles left recursion gracefully and provides multiple parse results when ambiguity exists. Ideal for editors, REPLs, or experimental DSLs where flexibility trumps performance.
Choose antlr4 if you need industrial-strength parsing with strong IDE support, error recovery, and multi-language output (though you’ll use the JavaScript runtime). It’s best suited for large grammars where maintainability, debugging tooling, and grammar reuse matter more than minimal bundle size. Be prepared to work with Java-based tooling during development.
Choose pegjs for straightforward, unambiguous grammars where you want tight integration between parsing and semantic actions. Its PEG foundation guarantees one parse tree, and the syntax is clean and JavaScript-friendly. Great for JSON-like formats, simple query languages, or config parsers where you control both grammar and input structure.
Choose jison if you’re porting an existing Bison/Yacc grammar to JavaScript or need deterministic LALR(1) parsing with familiar syntax. It’s a solid choice for well-defined, non-ambiguous grammars like expression evaluators or config file parsers, but note that its ecosystem has slowed and it lacks modern developer experience features like good error messages or streaming support.
nearley is a simple, fast and powerful parsing toolkit.
nearley is a streaming parser with support for catching errors gracefully and providing all parsings for ambiguous grammars. It is compatible with a variety of lexers (we recommend moo). It comes with tools for creating tests, railroad diagrams and fuzzers from your grammars, and has support for a variety of editors and platforms. It works in both node and the browser.
Unlike most other parser generators, nearley can handle any grammar you can define in BNF (and more!). In particular, while most existing JS parsers such as PEGjs and Jison choke on certain grammars (e.g. left recursive ones), nearley handles them easily and efficiently by using the Earley parsing algorithm.
nearley is used by a wide variety of projects.
nearley is an npm staff pick.
Please visit our website https://nearley.js.org to get started! You will find a tutorial, detailed reference documents, and links to several real-world examples to get inspired.
Please read this document before working on nearley. If you are interested in contributing but unsure where to start, take a look at the issues labeled "up for grabs" on the issue tracker, or message a maintainer (@kach or @tjvr on Github).
nearley is MIT licensed.
A big thanks to Nathan Dinsmore for teaching me how to Earley, Aria Stewart for helping structure nearley into a mature module, and Robin Windels for bootstrapping the grammar. Additionally, Jacob Edelman wrote an experimental JavaScript parser with nearley and contributed ideas for EBNF support. Joshua T. Corbin refactored the compiler to be much, much prettier. Bojidar Marinov implemented postprocessors-in-other-languages. Shachar Itzhaky fixed a subtle bug with nullables.
If you are citing nearley in academic work, please use the following BibTeX entry.
```bibtex
@misc{nearley,
    author = "Kartik Chandra and Tim Radvan",
    title  = "{nearley}: a parsing toolkit for {JavaScript}",
    year   = {2014},
    doi    = {10.5281/zenodo.3897993},
    url    = {https://github.com/kach/nearley}
}
```