Convert HTML Entities to UTF-8

In short

Decode HTML entities to UTF-8 with per-character byte breakdown. Named, decimal, hex. Free, offline, client-side, instant, secure.

Runs in your browser
Nothing uploaded
Free, no sign-up

Decode named (&copy;), decimal (&#169;), and hex (&#xA9;) entities to full Unicode. The per-character breakdown shows the 1/2/3/4-byte UTF-8 classification and flags supplementary-plane emoji.

🛡 100% Private: no server uploads, ever
⚡ Instant: runs in your browser
💧 No Watermarks: clean output, always
🆓 Free Forever: no accounts, no limits

How to Use Convert HTML Entities to UTF-8

Paste text with entities - named (&copy;, &mdash;), decimal (&#169;), or hex (&#xA9;). The tool hands the input to a detached textarea, so you get the browser's full HTML5 entity table (2,200+ names) without shipping any dictionary.
Read the decoded output - full Unicode is preserved: emoji, CJK, mathematical symbols, combining marks, every code point renders natively. No stripping, no placeholders.
Study the breakdown - each code point gets a row: the character, a 1B/2B/3B/4B byte-length tag, the U+XXXX code point, and the raw UTF-8 hex bytes (e.g., € = U+20AC = E2 82 AC). Supplementary-plane characters (emoji, rare CJK) are labeled.
Check the stats - total entities found, split by type (named/decimal/hex), plus character count, total UTF-8 byte count, and the BMP vs supplementary-plane split. A quick way to see whether your input has emoji that will inflate a UTF-8 payload.
Copy or download - Copy writes the decoded text to the clipboard; Download saves decoded-utf8.txt. Ctrl+Enter (⌘+Enter) triggers a recompute. 200 ms debounce on input keeps typing smooth.
Trust the safety - decoding uses a detached <textarea>'s .value. That's the same decoder every browser ships, but it never runs your input as HTML. <script>alert(1)</script> stays as text.
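
The named/decimal/hex split from the stats step can be approximated with three regular expressions. This is a minimal sketch, not the tool's actual code: the function name countEntities is hypothetical, and the named pattern counts anything shaped like an entity, including unknown names such as &bogus; that a browser would leave undecoded.

```javascript
// Hypothetical helper: counts entity-shaped sequences by syntax only.
// Assumption: a "named" entity is & + letter + letters/digits + ;
// (this also matches unknown names the browser would not decode).
function countEntities(input) {
  const named = (input.match(/&[A-Za-z][A-Za-z0-9]*;/g) || []).length;
  const decimal = (input.match(/&#[0-9]+;/g) || []).length;
  const hex = (input.match(/&#[xX][0-9A-Fa-f]+;/g) || []).length;
  return { named, decimal, hex, total: named + decimal + hex };
}

console.log(countEntities("&copy; &#169; &#xA9; &mdash;"));
```

The three patterns cannot overlap: the named pattern requires a letter right after the ampersand, while both numeric forms require a # there.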

Frequently Asked Questions

What’s the difference between this and the ASCII-category tool?

Both decode entities. The ASCII version has an option to replace non-ASCII decodes with [U+XXXX] placeholders (for strict ASCII pipelines). This UTF-8 version keeps the full Unicode output and adds per-character UTF-8 byte analysis – which character took 1, 2, 3, or 4 bytes.

Why would I care about UTF-8 byte length?

UTF-8 is variable-width: ASCII is 1 byte, Latin extended (©, é) is 2, most Asian scripts and symbols (€, 中) are 3, emoji and supplementary-plane characters are 4. If you’re sizing a database field, a tweet, or a fixed-width protocol, the character count lies – the byte count tells the truth.
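
To see the gap between character count and byte count, here is a small sketch using the standard TextEncoder API, which always encodes to UTF-8. The helper names utf8ByteLength and codePointCount are illustrative, not part of the tool.

```javascript
// Byte length of a string once encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Number of Unicode code points (not UTF-16 code units).
function codePointCount(text) {
  return Array.from(text).length;
}

// "A", "é", "€", "中", "😀" written as escapes to avoid encoding surprises.
const sample = "A\u00E9\u20AC\u4E2D\u{1F600}";
console.log(codePointCount(sample)); // 5 code points
console.log(utf8ByteLength(sample)); // 1 + 2 + 3 + 3 + 4 = 13 bytes
```

A column sized for 5 characters would need room for 13 bytes to hold this sample.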

Does it support emoji and supplementary-plane characters?

Yes. &#x1F600; → 😀 (U+1F600, 4-byte UTF-8 F0 9F 98 80). JavaScript stores supplementary-plane characters as surrogate pairs internally (two UTF-16 code units), but the breakdown uses code-point iteration, so the emoji shows as one row with byte length 4, not two rows.
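
Code-point iteration can be sketched as follows. In JavaScript, for...of walks a string by code point, so a surrogate pair arrives as one item. The function name breakdownRows and the row fields are assumptions for illustration; the tool's real row format is not shown on this page.

```javascript
// Builds one row per code point. for...of iterates strings by code point,
// so a surrogate pair (two UTF-16 units) yields a single row.
function breakdownRows(text) {
  const rows = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    rows.push({
      char: ch,
      codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
      utf16Units: ch.length,          // 2 for supplementary-plane characters
      supplementary: cp > 0xFFFF,     // true outside the BMP
    });
  }
  return rows;
}

const emoji = "\u{1F600}";
console.log(emoji.length);                // 2 UTF-16 code units
console.log(breakdownRows(emoji).length); // 1 row
```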

How are the UTF-8 bytes computed?

From the Unicode code point using the standard encoding rule: 0x00-0x7F → 1 byte (0xxxxxxx), 0x80-0x7FF → 2 bytes (110xxxxx 10xxxxxx), 0x800-0xFFFF → 3 bytes (1110xxxx 10xxxxxx 10xxxxxx), 0x10000-0x10FFFF → 4 bytes (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx). No fancy libraries – just bit-shifting and masking.
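
The encoding rule above can be written out directly. This is a sketch under stated assumptions rather than the tool's source: encodeCodePoint and toHex are hypothetical names, and rejecting surrogates (U+D800 to U+DFFF) is a choice made here because they have no valid UTF-8 form.

```javascript
// Encodes one Unicode code point to its UTF-8 bytes by shifting and masking.
function encodeCodePoint(cp) {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("code point out of range");
  }
  // Assumption of this sketch: lone surrogates are rejected.
  if (cp >= 0xD800 && cp <= 0xDFFF) {
    throw new RangeError("surrogates cannot be encoded as UTF-8");
  }
  if (cp <= 0x7F) return [cp];
  if (cp <= 0x7FF) return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  if (cp <= 0xFFFF) {
    return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
  }
  return [
    0xF0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3F),
    0x80 | ((cp >> 6) & 0x3F),
    0x80 | (cp & 0x3F),
  ];
}

// Formats bytes as space-separated uppercase hex, e.g. "E2 82 AC".
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

console.log(toHex(encodeCodePoint(0x20AC)));  // E2 82 AC
console.log(toHex(encodeCodePoint(0x1F600))); // F0 9F 98 80
```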

Is the decoding XSS-safe?

Yes. We write the raw input to a detached <textarea>’s innerHTML, then read its .value. The textarea element never parses its contents as HTML – it just entity-decodes. <script> tags survive the round-trip as literal text, with no execution.

What about unknown or malformed entities?

&bogus; survives verbatim in the output. The browser’s decoder passes unknown sequences through unchanged, so you never lose data – you just see the same text you typed.

Can it decode double-encoded HTML?

One pass per decode. &amp;copy; becomes &copy;. Paste that back in and decode again to get ©. This is deliberate – auto-looping would break inputs where a literal &amp; is the intended end state.
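
Single-pass behaviour can be illustrated without a browser. The tool itself relies on the browser's textarea decoder; the sketch below swaps in a regex replace as a stand-in, with a tiny hypothetical NAMED table in place of the full HTML5 list. One simplification is assumed: out-of-range numeric references are left verbatim here, which may differ from what a browser does.

```javascript
// Tiny illustrative table; the real HTML5 table has 2,200+ names.
const NAMED = { amp: "&", lt: "<", gt: ">", quot: '"', copy: "\u00A9", mdash: "\u2014" };

// Decodes each entity exactly once. String.prototype.replace never rescans
// text it has already substituted, so "&amp;copy;" stops at "&copy;".
function decodeOnce(input) {
  const pattern = /&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return input.replace(pattern, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const cp = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Sketch assumption: invalid code points pass through unchanged.
      if (cp === 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return match;
      return String.fromCodePoint(cp);
    }
    // Unknown names such as &bogus; pass through unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

const once = decodeOnce("&amp;copy;"); // "&copy;"
const twice = decodeOnce(once);        // "©"
console.log(once, twice);
```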

Is my input sent to a server?

No. Zero network requests. The browser’s built-in entity decoder plus a few JavaScript functions. Open DevTools → Network and watch nothing fire after the page loads. Safe for scraped pages, customer records, internal tooling.

Does it work offline?

Yes. The whole tool is under 20 KB of HTML+CSS+JS. Once loaded, disconnect Wi-Fi and keep decoding. Bookmark and use on air-gapped boxes.

How large an input can it handle?

Typical inputs decode in under 50 ms. 100 KB of entity-heavy HTML decodes in roughly 25 ms. The breakdown panel caps at 60 rows to keep DOM work fast; the output and stats always reflect the full decode.

Keep going

Embed this tool

Add this free tool to your website. Copy and paste the code:

<iframe src="https://alltoolsverse.com/tools/convert-html-entities-to-utf8/?embed=1" width="100%" height="760" loading="lazy" style="max-width:900px;border:1px solid #e2e8f0;border-radius:12px" title="Convert HTML Entities to UTF-8"></iframe>
<p>Free tool: <a href="https://alltoolsverse.com/tools/convert-html-entities-to-utf8/">Convert HTML Entities to UTF-8</a> by All Tools Verse</p>

Related Tools

Convert UTF-8 to HTML Entities →

Escape HTML Entities →

Binary to UTF-8 Decoder →

Convert Arbitrary Base to UTF-8 →

Base64 to UTF-8 Decoder →

Convert Bytes to UTF-8 →

Code Points to UTF-8 Converter Free →

Convert Data URI to UTF-8 →

Convert Decimal to UTF-8 →

Convert Hexadecimal to UTF-8 →

Convert Octal to UTF-8 →

Convert UTF-16 to UTF-8 →