HTML Unescape Online — Decode HTML Entities

Decode named entities (&lt;, &amp;, &nbsp;) and numeric references (&#160;, &#xA0;) back to their original characters, entirely in your browser.

What is HTML Unescaping?

HTML unescaping reverses HTML entity encoding. It scans your input for named entities (&lt;, &amp;, &nbsp;), decimal numeric references (&#8364;) and hexadecimal references (&#x20AC;) and replaces each one with its corresponding character.
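The scan described above can be sketched as a single regular-expression pass. This is a minimal illustration, not the tool's actual source: the function name unescapeHtml, the NAMED_ENTITIES table, and the inclusion of apos among the core five are assumptions made for the example.

```javascript
// Hypothetical lookup table. The entity names come from this page;
// treating "apos" as one of the core five is an assumption.
const NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  nbsp: "\u00A0", copy: "\u00A9", reg: "\u00AE", trade: "\u2122",
  mdash: "\u2014", ndash: "\u2013", hellip: "\u2026",
};

// Matches &#xHEX; or &#DEC; or &name; and captures the part between & and ;
const ENTITY_PATTERN = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g;

function unescapeHtml(input) {
  return input.replace(ENTITY_PATTERN, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Assumption: out-of-range and lone-surrogate references are left as-is.
      if (codePoint > 0x10ffff || (codePoint >= 0xd800 && codePoint <= 0xdfff)) {
        return match;
      }
      return String.fromCodePoint(codePoint);
    }
    // Unknown names are returned unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(unescapeHtml("Copyright &copy; 2024 &mdash; &#8364;100"));
```

Because each match is replaced exactly once and the scan never revisits its own output, a sequence such as &amp;lt; comes out as &lt; rather than <.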

You need this when consuming HTML-encoded data from RSS feeds, scraped pages, legacy CMSes, or older APIs that wrap text in entities even though the transport already supports UTF-8. The OpenFormatter HTML unescape tool runs entirely in your browser — paste in encoded text, get plain Unicode out.

How to unescape HTML online — 4 steps

  1. Paste the encoded HTML. Drop a string containing entities like &lt;, &quot;, or &#8364; into the Input panel.
  2. Click Unescape. The decoder recognises named entities, decimal references, and hexadecimal references in a single pass, with &amp; handled last to avoid double-decoding.
  3. Inspect the plain text. Every &lt; becomes <, every &#8364; becomes €, and Unicode is fully restored.
  4. Use the result safely. Treat the output as plain text. To display it on a page use textContent, not innerHTML — the decoded characters now have HTML meaning again.

Sample input and output

Encoded input

&lt;div class=&quot;alert&quot;&gt;
  &lt;p&gt;Hello, &quot;World&quot; &amp; welcome!&lt;/p&gt;
  &lt;a href=&#39;/path&#39;&gt;Click&nbsp;here&lt;/a&gt;
  Copyright &copy; 2024 &mdash; &#8364;100
&lt;/div&gt;

Decoded output

<div class="alert">
  <p>Hello, "World" & welcome!</p>
  <a href='/path'>Click here</a>
  Copyright © 2024 — €100
</div>

Named + Numeric

Decodes the core five entities, common HTML named entities (nbsp, copy, mdash, hellip), and every decimal or hexadecimal numeric reference.

Full Unicode

Numeric references for codepoints above U+FFFF (emoji, supplementary planes) are decoded with String.fromCodePoint and emitted as correct surrogate pairs.

Browser-Only

Decoding happens locally in JavaScript. Strings from logs, feeds, or internal APIs never leave your machine.

Common use cases

  • Decoding HTML-encoded values returned by legacy CMS APIs (WordPress, Drupal exports)
  • Unwrapping RSS/Atom titles and descriptions that contain &amp; and &lt;
  • Cleaning scraped HTML body text for full-text search indexing
  • Restoring readable strings from HTML-escaped database columns
  • Reading stack traces or log lines that were HTML-encoded for a web view
  • Converting encoded email subject lines to plain text for matching
  • Decoding e-commerce product titles and descriptions exported from third-party feeds
  • Pre-processing user-generated content from old forum software before re-display

Unescape vs DOMParser vs textarea trick

Three browser approaches decode entities. The textarea trick (assigning the string to a textarea's innerHTML, then reading its value) works but hands your input to the HTML parser, which is risky if the input is attacker-controlled. DOMParser is safer but slow on large strings because it builds a real document tree. A pure regex-based unescape, like this tool, is typically the fastest of the three and never invokes the HTML parser, which makes it predictable and safe even on hostile input. The trade-off: a regex unescape only decodes the named entities in its lookup table. Any other character has to arrive as a numeric reference, and unrecognised names are left as they are.

Need to escape instead?

Use the HTML Escape tool to encode characters, or browse the full set of escape and unescape utilities.

Frequently Asked Questions

What types of entity references does this decode?

Three kinds. Named entities like `&amp;`, `&nbsp;`, `&copy;`, `&mdash;`. Decimal numeric references such as `&#8364;` for the euro sign. Hexadecimal numeric references such as `&#x20AC;` for the same character. The decimal and hexadecimal forms of a codepoint decode to the same Unicode character.

Why does the order of unescaping matter?

You must decode `&amp;` LAST. If you decoded it first, the encoded sequence `&amp;lt;` would become `&lt;` and then the next pass would decode that to a literal `<` — the original ampersand is lost. Decoding `&lt;`, `&gt;`, the named entities, and numeric references first, then `&amp;`, prevents this double-decode bug.
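The ordering problem shows up when decoding is written as a chain of sequential replacements. The sketch below uses only three entities to keep the effect visible; the function names decodeAmpFirst and decodeAmpLast are hypothetical.

```javascript
// Wrong order: &amp; is decoded first, so its output gets decoded again.
function decodeAmpFirst(s) {
  return s.replace(/&amp;/g, "&").replace(/&lt;/g, "<").replace(/&gt;/g, ">");
}

// Right order: &amp; is decoded last, so nothing it produces is re-read.
function decodeAmpLast(s) {
  return s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}

// The author wanted the reader to see the literal text "&lt;".
const escapedOnce = "Write &amp;lt; to show a less-than sign";

console.log(decodeAmpFirst(escapedOnce)); // Write < to show a less-than sign
console.log(decodeAmpLast(escapedOnce));  // Write &lt; to show a less-than sign
```

A decoder that handles every entity in one regex scan avoids the issue by construction, since replaced text is never scanned a second time.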

Will the decoded output be safe to inject into innerHTML?

No. After decoding you have raw `<`, `>`, `&` characters that the browser would parse as markup. Treat the output as plain text — assign it to `el.textContent`, not `el.innerHTML`. If you do need HTML, run the result through a sanitiser like DOMPurify.
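As a small illustration of the textContent advice, the helper below writes a decoded string into an element as text. The name showDecodedText is hypothetical, and the element is assumed to be any DOM node (or any object with a textContent property).

```javascript
// Writes decoded text into an element without letting the browser parse it.
function showDecodedText(element, decodedText) {
  // textContent stores the string as a text node, so "<" and "&"
  // are displayed literally instead of being treated as markup.
  element.textContent = decodedText;
  return element;
}
```

In a page this might be called as showDecodedText(document.querySelector("#output"), decoded), where the #output selector is only an example.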

Can it handle double-encoded strings?

Yes, run it twice. A double-encoded string like `&amp;lt;` becomes `&lt;` after the first pass, then `<` after the second. Double-encoding usually indicates a bug upstream where data was escaped twice; fixing the source is preferable to compensating with multiple decodes.
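A rough sketch of repeated decoding, using a reduced decoder so that each pass is easy to follow. The names decodeOnce and decodeTimes are hypothetical, and only four entities are handled here. The input is a `<` that was escaped to `&lt;` and then escaped again to `&amp;lt;`.

```javascript
// Reduced entity table for illustration only.
const BASIC_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"' };

// One decoding pass over the whole string.
function decodeOnce(s) {
  return s.replace(/&(amp|lt|gt|quot);/g, (match, name) => BASIC_ENTITIES[name]);
}

// Applies decodeOnce a fixed number of times.
function decodeTimes(s, passes) {
  let out = s;
  for (let i = 0; i < passes; i++) {
    out = decodeOnce(out);
  }
  return out;
}

const doubleEncoded = "1 &amp;lt; 2";
console.log(decodeTimes(doubleEncoded, 1)); // 1 &lt; 2
console.log(decodeTimes(doubleEncoded, 2)); // 1 < 2
```

The pass count is explicit on purpose: decoding until the string stops changing would also unescape text that was meant to stay escaped.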

What happens to unknown named entities?

The tool decodes the most common HTML named entities — the core five plus nbsp, copy, reg, trade, mdash, ndash, hellip — and falls back to numeric reference decoding (`&#NNN;` and `&#xHH;`) which covers every Unicode codepoint. Unrecognised named entities are left untouched in the output.

Is this the same as DOMParser-based decoding?

Functionally similar but safer. `new DOMParser().parseFromString(s, "text/html").documentElement.textContent` would also decode entities — but it builds a real DOM tree from your input, which executes some HTML parsing rules and can be slow on large strings. This tool is a pure string operation.

How are emoji and supplementary-plane characters decoded?

Numeric references for codepoints above U+FFFF (such as `&#128512;` for 😀) are decoded using `String.fromCodePoint`, which correctly emits the surrogate pair for JavaScript strings. The output is a single visual character even though it occupies two UTF-16 code units.
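The surrogate-pair behaviour can be checked directly. The helper decodeNumericReference below is a hypothetical name used for this sketch; it handles a single decimal or hexadecimal reference and nothing else.

```javascript
// Decodes one numeric reference such as "&#128512;" or "&#x1F600;".
function decodeNumericReference(ref) {
  const m = /^&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));$/.exec(ref);
  if (!m) return ref; // not a numeric reference: leave unchanged
  const codePoint = m[1] !== undefined ? parseInt(m[1], 16) : parseInt(m[2], 10);
  // String.fromCodePoint throws a RangeError above U+10FFFF, so bail out first.
  if (codePoint > 0x10ffff) return ref;
  return String.fromCodePoint(codePoint);
}

const grin = decodeNumericReference("&#128512;"); // U+1F600

console.log(grin.length);                     // 2 (UTF-16 code units)
console.log([...grin].length);                // 1 (code points)
console.log(grin.charCodeAt(0).toString(16)); // d83d (high surrogate)
console.log(grin.charCodeAt(1).toString(16)); // de00 (low surrogate)
```

The older String.fromCharCode would not work here: it keeps only the low 16 bits of its argument, so 128512 would come out as U+F600 instead of the emoji.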

Should I run this on data fetched from a JSON API?

Only if the API explicitly HTML-encoded its values — some legacy CMS APIs do, returning `&amp;` instead of `&`. Modern JSON APIs send raw Unicode and JSON-escape only the JSON-required characters. Decoding raw JSON values as HTML can corrupt strings that legitimately contain `&` followed by letters.
