entities vs he vs html-entities
A selection guide to HTML entity encoding and decoding libraries

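entities, he, and html-entities are JavaScript libraries for working with HTML entities (such as &amp;, &lt;, and &mdash;). Each can decode entities back into the original characters (decode) and encode special characters as entities (encode). They are widely used in rich text editors, template engines, XSS mitigation, and content sanitization. The libraries differ noticeably in API design, feature coverage, default behavior, and configurability, and those differences affect both development effort and runtime behavior.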

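Package statistics

| Package | Downloads | Stars | Size | Issues | Last published | License |
| --- | --- | --- | --- | --- | --- | --- |
| entities | 146,309,287 | 375 | 392 kB | 8 | 1 month ago | BSD-2-Clause |
| he | 0 | 3,666 | - | 23 | 7 years ago | MIT |
| html-entities | 0 | 687 | 132 kB | 6 | 1 year ago | MIT |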

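HTML entity handling, three ways: entities vs he vs html-entities in depth

In front-end development, handling user input, rendering rich text, or guarding against XSS often requires converting special characters to HTML entities (for example, < to &lt;) or restoring entities to the original characters (for example, &copy; to ©). entities, he, and html-entities are the three mainstream options, but they differ considerably in design philosophy, API style, and behavioral details. This article compares them from a practical standpoint to help you choose.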

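🔤 Core functionality: basic encoding and decoding

All three provide encode and decode functions, but the call style and the defaults differ.

entities uses a functional style: every operation is a top-level function. Its generic encode and decode default to the XML entity set, so HTML work is usually done with the HTML-specific functions:

import { encodeNonAsciiHTML, decodeHTML } from 'entities';

// Encodes XML special characters and non-ASCII characters as HTML named entities
const encoded = encodeNonAsciiHTML('<div>© 2024</div>');
// '&lt;div&gt;&copy; 2024&lt;/div&gt;'

const decoded = decodeHTML('&lt;div&gt;&copy; 2024&lt;/div&gt;');
// '<div>© 2024</div>'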

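he also exposes encode and decode. By default, encode emits hexadecimal numeric references; named references require the useNamedReferences option:

import he from 'he';

const encoded = he.encode('<div>© 2024</div>', { useNamedReferences: true });
// '&lt;div&gt;&copy; 2024&lt;/div&gt;'

const decoded = he.decode('&lt;div&gt;&copy; 2024&lt;/div&gt;');
// '<div>© 2024</div>'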

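html-entities 1.x required creating an instance of an entity class such as Html5Entities first. Since 2.0 it exposes top-level encode and decode functions configured through options:

import { encode, decode } from 'html-entities';

// The default mode only escapes <, >, &, " and '; 'nonAsciiPrintable' also encodes ©
const encoded = encode('<div>© 2024</div>', { mode: 'nonAsciiPrintable' });
// '&lt;div&gt;&copy; 2024&lt;/div&gt;'

const decoded = decode('&lt;div&gt;&copy; 2024&lt;/div&gt;');
// '<div>© 2024</div>'

💡 Note: html-entities 1.x shipped the XmlEntities, Html4Entities, Html5Entities, and AllHtmlEntities classes, one per specification. In 2.x the same entity sets are selected with the level option.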

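⚙️ Default behavior: how are edge cases handled?

The three libraries behave quite differently when the input contains incomplete or non-standard entities.

Consider the malformed input 'foo &amp bar' (missing semicolon).

In entities the outcome depends on the function. The strict variants (decodeXML, decodeHTMLStrict, and the generic decode at its default XML level) do not treat it as an entity, whereas decodeHTML applies HTML's legacy rules and decodes it:

import { decode, decodeHTML } from 'entities';
decode('foo &amp bar'); // 'foo &amp bar' (left as-is)
decodeHTML('foo &amp bar'); // 'foo & bar'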

he mimics browser behavior and, outside strict mode, repairs the reference:

import he from 'he';
he.decode('foo &amp bar'); // 'foo & bar' (semicolon inferred)

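html-entities is lenient by default as well (scope: 'body'); passing scope: 'strict' makes decode ignore references that lack a semicolon:

import { decode } from 'html-entities';
decode('foo &amp bar'); // 'foo & bar'

This difference matters most when handling rich text pasted by users. Content that comes from Word or legacy systems often contains non-standard entities, so a lenient decoder (he, html-entities with its default scope, or decodeHTML from entities) is the more reliable choice there.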
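
The strict and lenient behaviors described above can be sketched without any of the three libraries. The snippet below is a minimal illustration only: NAMED is a tiny, hypothetical entity table, and decodeStrict and decodeLenient are made-up helper names. Real decoders use the full HTML entity list, match the longest known name, and only allow the semicolon to be omitted for a fixed set of legacy names.

```javascript
// Hypothetical, deliberately tiny entity table; real libraries ship the full list.
const NAMED = new Map([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["copy", "\u00A9"],
]);

// Strict: a reference only counts when it is terminated by a semicolon.
function decodeStrict(input) {
  return input.replace(/&([a-z]+);/gi, (match, name) =>
    NAMED.has(name) ? NAMED.get(name) : match
  );
}

// Lenient: the semicolon is optional, so "&amp bar" is repaired as well.
function decodeLenient(input) {
  return input.replace(/&([a-z]+);?/gi, (match, name) =>
    NAMED.has(name) ? NAMED.get(name) : match
  );
}

console.log(decodeStrict("foo &amp bar"));  // foo &amp bar
console.log(decodeLenient("foo &amp bar")); // foo & bar
```

The lenient variant is what lets pasted legacy content render as intended; the strict variant is the safer choice when a literal "&amp" in the input must survive untouched.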

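🛠️ Configurability: fine-grained control over conversion

All three accept configuration, but the granularity and mechanism differ.

Controlling what gets encoded

Suppose you only want to encode <, >, &, and ", leaving other characters (such as ©) untouched.

entities provides escapeUTF8 for this; it also escapes the apostrophe:

import { escapeUTF8 } from 'entities';
escapeUTF8('© <script>');
// '© &lt;script&gt;' (only the XML special characters are escaped)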

he offers escape for the same purpose. encode itself always encodes non-ASCII characters, even with encodeEverything set to false, so useNamedReferences and encodeEverything do not help here:

import he from 'he';
he.escape('© <script>');
// '© &lt;script&gt;'

html-entities handles this through the mode option of encode. The default mode, specialChars, already leaves non-ASCII characters alone:

import { encode } from 'html-entities';
encode('© <script>', { mode: 'specialChars' }); // '© &lt;script&gt;'
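
When only the four characters named above need encoding, the transformation is small enough to sketch by hand. The following is an illustrative, dependency-free version; escapeBasic and BASIC_ENTITIES are hypothetical names, not part of any of the three libraries.

```javascript
// Lookup table for the four characters this sketch encodes.
const BASIC_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
};

// Replace each special character in a single pass; everything else,
// including non-ASCII characters such as ©, is passed through unchanged.
function escapeBasic(input) {
  return input.replace(/[&<>"]/g, (ch) => BASIC_ENTITIES[ch]);
}

console.log(escapeBasic("© <script>")); // © &lt;script&gt;
```

A single regular expression pass matters here: escaping & in a separate, later step would double-encode the ampersands introduced by the earlier replacements.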

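Decoding attribute values versus text content

HTML treats a named reference that lacks its semicolon differently depending on context: in text it is decoded, but inside an attribute value it is left alone when the next character is = or alphanumeric. he exposes this through the isAttributeValue option:

import he from 'he';
// Attribute value context
he.decode('foo&ampbar', { isAttributeValue: true }); // 'foo&ampbar' (unchanged)
// Text context (the default)
he.decode('foo&ampbar', { isAttributeValue: false }); // 'foo&bar'

Well-formed references such as &#39; decode identically in both contexts. Recent versions of the other two libraries make a similar distinction: entities exports decodeHTMLAttribute, and html-entities accepts scope: 'attribute'.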
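
The HTML parsing rule behind an attribute-value option can be sketched in a few lines: in an attribute value, a named reference without a semicolon is kept as literal text when the next character is = or alphanumeric. This is a simplified illustration under stated assumptions: LEGACY is a hypothetical four-entry subset of the legacy names, decodeLegacy is a made-up helper, and the isAttributeValue parameter merely borrows the name of he's option.

```javascript
// Hypothetical subset of the legacy names that HTML accepts without a semicolon.
const LEGACY = new Map([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["copy", "\u00A9"],
]);

function decodeLegacy(input, { isAttributeValue = false } = {}) {
  // Capture the name, an optional semicolon, and one following "=" or alphanumeric.
  return input.replace(
    /&(amp|lt|gt|copy)(;?)([=a-zA-Z0-9]?)/g,
    (match, name, semicolon, next) => {
      // Attribute context: an unterminated reference followed by "=" or an
      // alphanumeric character is treated as literal text.
      if (isAttributeValue && semicolon === "" && next !== "") return match;
      return LEGACY.get(name) + next;
    }
  );
}

console.log(decodeLegacy("?a=1&copy=2"));                             // ?a=1©=2
console.log(decodeLegacy("?a=1&copy=2", { isAttributeValue: true })); // ?a=1&copy=2
```

The query-string case shows why the distinction exists: decoding an href value with text-context rules would silently corrupt a parameter named copy.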

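🧪 Special cases: numeric references and Unicode

For decimal (&#8212;) or hexadecimal (&#x2014;) references:

  • entities decodes numeric references at both the XML and HTML levels.
  • he always decodes valid numeric references, matching browser behavior.
  • html-entities supports them fully as well, at every level.

// All three libraries handle these correctly
'&#8212;' → '—' (em dash)
'&#x2014;' → '—'

More care is needed when the number is invalid, such as &#999999999;, which lies beyond the Unicode range:

  • entities substitutes the replacement character U+FFFD.
  • he substitutes U+FFFD by default and throws a parse error when its strict option is set.
  • html-entities has varied between versions; current 2.x releases appear to substitute U+FFFD as well.

When processing untrusted input, validate before decoding, or pin the library version and test these cases, to avoid surprises.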
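
As a rough sketch of what validating a numeric reference involves, the helper below parses a single reference and rejects code points that cannot be represented. decodeNumericReference is a hypothetical name, and the sketch assumes a simplified version of the HTML rules: it maps out-of-range values, surrogates, and zero to U+FFFD, and omits the spec's remapping of the 0x80 to 0x9F range.

```javascript
const REPLACEMENT_CHARACTER = "\uFFFD";

// Returns the decoded character, U+FFFD for an invalid code point,
// or null when the input is not a well-formed numeric reference.
function decodeNumericReference(reference) {
  const parts = /^&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));$/.exec(reference);
  if (parts === null) return null;

  const codePoint =
    parts[1] !== undefined ? parseInt(parts[1], 16) : parseInt(parts[2], 10);

  const isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff;
  if (codePoint === 0 || codePoint > 0x10ffff || isSurrogate) {
    return REPLACEMENT_CHARACTER;
  }
  return String.fromCodePoint(codePoint);
}

console.log(decodeNumericReference("&#8212;"));                   // —
console.log(decodeNumericReference("&#x2014;"));                  // —
console.log(decodeNumericReference("&#999999999;") === "\uFFFD"); // true
```

Checking the range before calling String.fromCodePoint matters because that function throws a RangeError for values above 0x10FFFF.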

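📦 Architecture and use cases at a glance

| Dimension | entities | he | html-entities |
| --- | --- | --- | --- |
| API style | Functional, concise | Functional, rich options | Functional in 2.x (class-based, instantiated in 1.x) |
| Default leniency | Depends on the function: decodeHTML is lenient, decodeXML and the strict variants are not | Lenient (mimics browsers) | Lenient by default; strict via scope: 'strict' |
| Bundle size (rough ranking) | Smallest | Medium | Larger |
| Typical use | Modern SPAs, lightweight encoding | Parsing UGC, high-tolerance scenarios | SSR, long-running server-side tasks |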

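✅ Best-practice recommendations

  • Prefer entities for front-end projects: if you control the input source (an internal CMS, for example) and want minimal dependencies, its simplicity and speed are clear advantages.
  • Choose he for pasted user content: when content may contain malformed entities (copied from email or Word, for example), he's browser-compatible decoding reduces display glitches.
  • Use html-entities to reuse one configuration on the server: if the same encoding rules are needed in several places (handling both HTML and XML, for example), a shared options object, or a shared instance in 1.x, is easy to wrap.

Whichever library you choose, remember that entity encoding is not a cure-all for XSS. When inserting dynamic content into HTML, prefer textContent or other safe DOM APIs over manual encoding.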
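
One way to see why encoding alone is not enough: entity encoding neutralizes markup characters, but it does nothing about values that are dangerous without any markup characters at all. The sketch below is illustrative only; escapeHtml and isSafeHref are hypothetical helpers, and the allow-list of http and https is an assumption about what a given application should permit.

```javascript
// A conventional five-character HTML escaper.
function escapeHtml(input) {
  const table = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return input.replace(/[&<>"']/g, (ch) => table[ch]);
}

// The payload contains nothing to escape, so it reaches the href intact.
const userUrl = "javascript:alert(1)";
const markup = `<a href="${escapeHtml(userUrl)}">profile</a>`;
console.log(markup); // <a href="javascript:alert(1)">profile</a>

// Hypothetical allow-list check: resolve the value and accept only http(s).
function isSafeHref(value) {
  try {
    const url = new URL(value, "https://example.invalid/");
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}

console.log(isSafeHref(userUrl));     // false
console.log(isSafeHref("/users/42")); // true
```

Encoding is a per-context tool: it protects text and quoted attribute values, while URLs, inline scripts, and styles each need their own validation.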

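How to choose: entities vs he vs html-entities

  • entities:

    Choose entities if you need a lightweight, fast library focused on core entity conversion. It offers generic decode and encode functions alongside dedicated XML and HTML variants, and lets you select the entity level and the encoding or decoding mode. It suits modern front-end projects that are sensitive to bundle size or need explicit control over parsing behavior.

  • he:

    Choose he if you want HTML entity decoding that is as close as possible to native browser behavior, especially for edge cases such as incomplete or malformed entities. Its decode method is lenient by default, correctly restores many non-standard inputs, and accepts decoding options such as isAttributeValue. It suits content from untrusted sources that needs high-tolerance parsing, such as displaying user-generated content (UGC).

  • html-entities:

    Choose html-entities if you need encoders and decoders tailored to different contexts (HTML4, HTML5, XML, and so on). In 1.x these were preconfigured classes such as Html5Entities and XmlEntities with an object-oriented API; 2.x selects them through the level option of its encode and decode functions. It suits server-side rendering or long-running tasks that reuse the same configuration. The 1.x class API is comparatively verbose, which makes it a weaker fit for lightweight front-end code.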

README for entities

entities

Encode & decode HTML & XML entities with ease & speed.

Features

  • 😇 Tried and true: entities is used by many popular libraries; eg. htmlparser2, the official AWS SDK and commonmark use it to process HTML entities.
  • ⚡️ Fast: entities is the fastest library for decoding HTML entities (as of September 2025); see performance.
  • 🎛 Configurable: Get an output tailored for your needs. You are fine with UTF8? That'll save you some bytes. Prefer to only have ASCII characters? We can do that as well!

How to…

…install entities

npm install entities

…use entities

const entities = require("entities");

// Encoding
entities.escapeUTF8("&#38; ü"); // "&amp;#38; ü"
entities.encodeXML("&#38; ü"); // "&amp;#38; &#xfc;"
entities.encodeHTML("&#38; ü"); // "&amp;&num;38&semi; &uuml;"

// Decoding
entities.decodeXML("asdf &amp; &#xFF; &#xFC; &apos;"); // "asdf & ÿ ü '"
entities.decodeHTML("asdf &amp; &yuml; &uuml; &apos;"); // "asdf & ÿ ü '"

Performance

Benchmarked in September 2025 with Node v24.6.0 on Apple M2 using tinybench. Higher ops/s is better; avg (μs) is the mean time per operation. See scripts/benchmark.ts to reproduce.

Decoding

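| Library | Version | ops/s | avg (μs) | ±% | slower |
| --- | --- | --- | --- | --- | --- |
| entities | 7.0.0 | 5,838,416 | 175.57 | 0.06 | |
| html-entities | 2.6.0 | 2,919,637 | 347.77 | 0.33 | 50.0% |
| he | 1.2.0 | 2,318,438 | 446.48 | 0.70 | 60.3% |
| parse-entities | 4.0.2 | 852,855 | 1,199.51 | 0.36 | 85.4% |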

Encoding

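| Library | Version | ops/s | avg (μs) | ±% | slower |
| --- | --- | --- | --- | --- | --- |
| entities | 7.0.0 | 2,770,115 | 368.09 | 0.11 | |
| html-entities | 2.6.0 | 1,491,963 | 679.96 | 0.58 | 46.2% |
| he | 1.2.0 | 481,278 | 2,118.25 | 0.61 | 82.6% |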

Escaping

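| Library | Version | ops/s | avg (μs) | ±% | slower |
| --- | --- | --- | --- | --- | --- |
| entities | 7.0.0 | 4,616,468 | 223.84 | 0.17 | |
| he | 1.2.0 | 3,659,301 | 280.76 | 0.58 | 20.7% |
| html-entities | 2.6.0 | 3,555,301 | 296.63 | 0.84 | 23.0% |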

Note: Micro-benchmarks may vary across machines and Node versions.


FAQ

What methods should I actually use to encode my documents?

If your target supports UTF-8, the escapeUTF8 method is going to be your best choice. Otherwise, use either encodeHTML or encodeXML based on whether you're dealing with an HTML or an XML document.

You can have a look at the options for the encode and decode methods to see everything you can configure.

When should I use strict decoding?

When strict decoding, entities not terminated with a semicolon will be ignored. This is helpful for decoding entities in legacy environments.

Why should I use entities instead of alternative modules?

As of September 2025, entities is faster than other modules. Still, this is not a differentiated space and other modules can catch up.

More importantly, you might already have entities in your dependency graph (as a dependency of eg. cheerio, or htmlparser2), and including it directly might not even increase your bundle size. The same is true for other entity libraries, so have a look through your node_modules directory!

Does entities support tree shaking?

Yes! entities ships as both a CommonJS and an ES module. Note that for best results, you should not use the encode and decode functions, as they wrap around a number of other functions, all of which will remain in the bundle. Instead, use the functions that you need directly.


Acknowledgements

This library wouldn't be possible without the work of these individuals. Thanks to

  • @mathiasbynens for his explanations about character encodings, and his library he, which was one of the inspirations for entities
  • @inikulin for his work on optimized tries for decoding HTML entities for the parse5 project
  • @mdevils for taking on the challenge of producing a quick entity library with his html-entities library. entities would be quite a bit slower if there wasn't any competition. Right now entities is on top, but we'll see how long that lasts!

License: BSD-2-Clause

Security contact information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

entities for enterprise

Available as part of the Tidelift Subscription

The maintainers of entities and thousands of other packages are working with Tidelift to deliver commercial support and maintenance for the open source dependencies you use to build your applications. Save time, reduce risk, and improve code health, while paying the maintainers of the exact dependencies you use. Learn more.