Convert Data URI to UTF-8

In short

Online Data URI to UTF-8 decoder with a byte-breakdown panel for emoji and CJK. Client-side, instant, secure - no uploads.

Runs in your browser
Nothing uploaded
Free, no sign-up

Decode a data: URI and view the UTF-8 text with an optional per-codepoint byte breakdown. Handles Base64 and URL-encoded payloads and correctly surfaces emoji, CJK, and multi-byte characters.

🛡 100% Private: No server uploads, ever
⚡ Instant: Runs in your browser
💧 No Watermarks: Clean output, always
🆓 Free Forever: No accounts, no limits

How to Use Convert Data URI to UTF-8

Paste the full URI into the input. It must start with data: and contain a comma separating the header from the payload.
The tool auto-detects whether the payload is Base64 or percent-encoded from the URI header, then decodes the raw bytes.
Decoding is always UTF-8. If the URI declares a different charset, the stats line flags it with a note such as "⚠ declared charset: windows-1252 (decoded as UTF-8)". To decode with the declared charset instead, use the generic Data URI to ASCII decoder.
Enable the byte breakdown to see each code point's UTF-8 byte sequence. A single emoji like 😀 shows up as 4 bytes (F0 9F 98 80); an accented Latin letter like é shows 2 bytes (C3 A9). This explains why some strings take more bytes than their visible length suggests.
Optional: Strip UTF-8 BOM. If the decoded text starts with U+FEFF (EF BB BF bytes), enabling this removes it so the result is clean.
Read the stats. You get byte-class counts like 1-byte:5 3-byte:2 - handy when sizing payloads or debugging unexpected sizes.
Copy or download. Copy places the decoded text on your clipboard. Download saves a data-uri-utf8-*.txt file.
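
The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the names decodeDataUri and percentDecodeToBytes are hypothetical, and the only assumptions about the runtime are the standard atob, TextEncoder, and TextDecoder globals.

```javascript
// Hypothetical helper: turn percent-escapes into raw bytes. Any other
// character is passed through as its own UTF-8 bytes.
function percentDecodeToBytes(payload) {
  const encoder = new TextEncoder();
  const out = [];
  for (let i = 0; i < payload.length; ) {
    const hex = payload.slice(i + 1, i + 3);
    if (payload[i] === "%" && /^[0-9a-fA-F]{2}$/.test(hex)) {
      out.push(parseInt(hex, 16));
      i += 3;
    } else {
      const ch = String.fromCodePoint(payload.codePointAt(i));
      out.push(...encoder.encode(ch));
      i += ch.length;
    }
  }
  return Uint8Array.from(out);
}

// Hypothetical helper: split header from payload, pick the decoder from the
// header, then decode the raw bytes as UTF-8 regardless of declared charset.
function decodeDataUri(uri, { stripBom = false } = {}) {
  if (!uri.startsWith("data:")) throw new Error("URI must start with data:");
  const comma = uri.indexOf(",");
  if (comma === -1) throw new Error("URI must contain a comma");

  const header = uri.slice(5, comma); // e.g. "text/plain;charset=utf-8;base64"
  const payload = uri.slice(comma + 1);
  const params = header.split(";").map((p) => p.trim().toLowerCase());
  const isBase64 = params.includes("base64");
  const charsetParam = params.find((p) => p.startsWith("charset="));
  const declaredCharset = charsetParam ? charsetParam.slice("charset=".length) : null;

  const bytes = isBase64
    ? Uint8Array.from(atob(payload), (c) => c.charCodeAt(0)) // one char per byte
    : percentDecodeToBytes(payload);

  // ignoreBOM: true keeps a leading U+FEFF so that stripping stays optional.
  let text = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
  if (stripBom && text.charCodeAt(0) === 0xfeff) text = text.slice(1);

  return { text, bytes, isBase64, declaredCharset };
}
```

For example, decodeDataUri("data:text/plain;charset=utf-8;base64,8J+YgA==").text is "😀", and the returned declaredCharset is what a caller would compare against "utf-8" before showing a warning.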

Frequently Asked Questions

How is this different from the generic Data URI to ASCII decoder?

The generic decoder honours whatever charset the URI declares (UTF-8, Windows-1252, ISO-8859-1, etc.) and converts the bytes using that charset. This tool always decodes as UTF-8 and additionally exposes a per-codepoint UTF-8 byte breakdown. Pick this one when you know the payload is UTF-8 and want to see the byte-level structure; pick the generic one when the source charset is legacy or unknown.

Why does my UTF-8 emoji take 4 bytes but show as 1 character?

Because UTF-8 packs larger Unicode code points into more bytes. U+1F600 (😀) sits in the supplementary planes above U+FFFF, which requires UTF-8’s 4-byte form (F0 9F 98 80). The decoded string has 1 code point, but any string-length measurement that counts code units (JavaScript’s .length) will say 2, and the UTF-8 byte count is 4. The breakdown panel exposes all three numbers.
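
To make the 4-byte form concrete, here is a small sketch that packs a supplementary-plane code point into its UTF-8 bytes by hand. The function names encodeFourByte and toHex are illustrative only; in practice TextEncoder does this for you.

```javascript
// Pack a code point in U+10000..U+10FFFF into UTF-8's 4-byte form:
// 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (21 payload bits in total).
function encodeFourByte(cp) {
  if (cp < 0x10000 || cp > 0x10ffff) {
    throw new RangeError("code point is outside the 4-byte range");
  }
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

// Format bytes the way the breakdown panel shows them.
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

const grinning = toHex(encodeFourByte(0x1f600)); // "F0 9F 98 80"
```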

What are the 1/2/3/4-byte UTF-8 classes?

UTF-8 uses variable-width encoding: U+0000-U+007F is 1 byte (plain ASCII), U+0080-U+07FF is 2 bytes (Latin extended, Greek, Cyrillic, Arabic), U+0800-U+FFFF is 3 bytes (CJK, most BMP characters), U+10000-U+10FFFF is 4 bytes (emoji, supplementary scripts). The stats panel breaks down your payload by these classes.
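
The class boundaries translate directly into code. The sketch below is an assumption about how such a stats line could be produced (utf8ByteLength, byteClassCounts, and formatStats are made-up names), not a description of the tool's internals.

```javascript
// Number of UTF-8 bytes needed for one code point, per the ranges above.
function utf8ByteLength(cp) {
  if (cp <= 0x7f) return 1;
  if (cp <= 0x7ff) return 2;
  if (cp <= 0xffff) return 3;
  return 4;
}

// Count code points per byte class. Iterating a string with for...of walks
// code points, so a surrogate pair counts once.
function byteClassCounts(text) {
  const counts = { 1: 0, 2: 0, 3: 0, 4: 0 };
  for (const ch of text) {
    counts[utf8ByteLength(ch.codePointAt(0))] += 1;
  }
  return counts;
}

// Render only the non-empty classes, e.g. "1-byte:5 3-byte:2".
function formatStats(counts) {
  return Object.entries(counts)
    .filter(([, n]) => n > 0)
    .map(([size, n]) => `${size}-byte:${n}`)
    .join(" ");
}

const stats = formatStats(byteClassCounts("hello日本")); // "1-byte:5 3-byte:2"
```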

What do U+FFFD replacement characters in the output mean?

The decoder hit bytes that aren’t valid UTF-8. Most common cause: the source was actually some other charset (e.g., Windows-1252) but this tool always assumes UTF-8. If you see lots of U+FFFD, try the generic Data URI to ASCII decoder with the original charset, or check whether the URI was truncated mid-sequence.
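
A short sketch of where U+FFFD comes from, using the standard TextDecoder. The helpers isValidUtf8 and countReplacementChars are hypothetical names for illustration. Note that counting U+FFFD cannot tell a decoding failure apart from a replacement character that was genuinely present in the source, which is why the strict check is shown as well.

```javascript
// "café" encoded as Windows-1252: é is the single byte 0xE9, which is not a
// complete UTF-8 sequence on its own.
const legacyBytes = Uint8Array.of(0x63, 0x61, 0x66, 0xe9);

// The default (non-fatal) decoder substitutes U+FFFD for the bad byte.
const lenientText = new TextDecoder("utf-8").decode(legacyBytes); // "caf\uFFFD"

// With fatal: true the decoder throws a TypeError instead of substituting.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

function countReplacementChars(text) {
  return Array.from(text).filter((ch) => ch === "\uFFFD").length;
}
```

A payload truncated mid-sequence behaves the same way: the first three bytes of 😀 (F0 9F 98) fail the strict check because the final continuation byte is missing.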

How do I know if my URI has a UTF-8 BOM?

After decoding, check if the output starts with U+FEFF (a zero-width no-break space that serves as a byte-order mark). Tick the “Strip UTF-8 BOM” box to remove it automatically. In Base64, the BOM appears as the prefix 77u/ (which decodes to EF BB BF).
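
The BOM can be checked either on the raw bytes or on the decoded text. The sketch below uses a made-up payload (a BOM followed by "hi") and a hypothetical helper hasUtf8Bom. One detail worth knowing: the standard TextDecoder removes a leading BOM by default, and passing ignoreBOM: true is what keeps it in the output.

```javascript
// True when the byte sequence starts with EF BB BF.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf;
}

// "77u/" is the Base64 form of EF BB BF; "aGk=" is "hi".
const bomBytes = Uint8Array.from(atob("77u/aGk="), (c) => c.charCodeAt(0));

const bomKept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bomBytes); // "\uFEFFhi"
const bomStripped = new TextDecoder("utf-8").decode(bomBytes); // "hi"

const startsWithBom = bomKept.charCodeAt(0) === 0xfeff; // true
```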

Does this handle URL-safe Base64 inside a data URI?

Yes. RFC 2397 technically specifies the standard Base64 alphabet (+ and /), but URL-safe Base64 (- and _) leaks into data URIs that were built for web contexts. This tool normalises both alphabets and auto-pads missing = characters.
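
Normalising the two alphabets is a small transformation. The following is one plausible way to do it, with normalizeBase64 as an illustrative name; the tool's own implementation may differ.

```javascript
// Map the URL-safe alphabet onto the standard one and restore "=" padding so
// that atob accepts the result.
function normalizeBase64(payload) {
  const standard = payload
    .replace(/-/g, "+")
    .replace(/_/g, "/")
    .replace(/=+$/, "");
  const missing = (4 - (standard.length % 4)) % 4;
  // A remainder of 1 can never be valid Base64 (6 leftover bits, under a byte).
  if (missing === 3) throw new Error("invalid Base64 length");
  return standard + "=".repeat(missing);
}

const normalized = normalizeBase64("8J-YgA"); // "8J+YgA=="
const emojiBytes = Uint8Array.from(atob(normalized), (c) => c.charCodeAt(0));
const emojiText = new TextDecoder("utf-8").decode(emojiBytes); // "😀"
```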

Why is the byte count different from my string’s length?

Because UTF-8 bytes, UTF-16 code units (what JavaScript’s .length counts), and Unicode code points (Array.from(s).length) are three different things. For ASCII-only text they match. For anything else they diverge: “é” is 2 UTF-8 bytes but 1 code point and 1 code unit; “😀” is 4 UTF-8 bytes, 1 code point, but 2 UTF-16 code units. The breakdown panel shows all the math.
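
The three measurements can be taken side by side with standard APIs. The function name measure is just for this example.

```javascript
// Report the three different "lengths" of a string.
function measure(s) {
  return {
    utf8Bytes: new TextEncoder().encode(s).length, // bytes on the wire
    utf16Units: s.length,                          // what .length counts
    codePoints: Array.from(s).length,              // Unicode code points
  };
}

const ascii = measure("abc");      // { utf8Bytes: 3, utf16Units: 3, codePoints: 3 }
const eAcute = measure("\u00e9");  // { utf8Bytes: 2, utf16Units: 1, codePoints: 1 }
const emoji = measure("😀");       // { utf8Bytes: 4, utf16Units: 2, codePoints: 1 }
```

The é here is written as the precomposed code point U+00E9. The decomposed form (e followed by U+0301) would report 3 bytes, 2 code units, and 2 code points.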

What happens with URL-encoded payloads and UTF-8?

Percent-encoded multi-byte UTF-8 is handled correctly. data:,%E6%97%A5%E6%9C%AC (6 percent-escapes, 18 chars of payload) decodes to 日本 (2 code points, 6 UTF-8 bytes). decodeURIComponent always interprets percent-escapes as UTF-8, so this works without any extra configuration.
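
The numbers in that example can be reproduced directly. The sketch also shows a caveat: decodeURIComponent throws a URIError when the escapes do not form valid UTF-8, so a wrapper like the hypothetical tryPercentDecode below is one way to handle that case.

```javascript
const encodedPayload = "%E6%97%A5%E6%9C%AC"; // the part after "data:,"

const escapeCount = (encodedPayload.match(/%[0-9A-Fa-f]{2}/g) ?? []).length; // 6
const payloadChars = encodedPayload.length;                                  // 18
const decodedText = decodeURIComponent(encodedPayload);                      // "日本"
const decodedCodePoints = Array.from(decodedText).length;                    // 2
const decodedBytes = new TextEncoder().encode(decodedText).length;           // 6

// Returns null instead of throwing when the escapes are not valid UTF-8,
// for example a lone "%E9" taken from a Windows-1252 source.
function tryPercentDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}
```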

Can I decode a Windows-1252 or Latin-1 data URI with this tool?

You can, but you’ll get U+FFFD replacement characters wherever the bytes don’t form valid UTF-8. The tool flags that in stats. For lossless legacy-charset decoding, use the generic Data URI to ASCII decoder which honours the declared charset. This tool is UTF-8-only by design so you can see the byte structure clearly.

Is it free, offline, and private?

Yes. Decoding uses the browser’s native atob, decodeURIComponent, and TextDecoder. Nothing is uploaded, nothing is logged, no account is needed. Load the page once and the tool works offline indefinitely.

Embed this tool

Add this free tool to your website. Copy and paste the code:

<iframe src="https://alltoolsverse.com/tools/convert-data-uri-to-utf8/?embed=1" width="100%" height="760" loading="lazy" style="max-width:900px;border:1px solid #e2e8f0;border-radius:12px" title="Convert Data URI to UTF-8"></iframe>
<p>Free tool: <a href="https://alltoolsverse.com/tools/convert-data-uri-to-utf8/">Convert Data URI to UTF-8</a> by All Tools Verse</p>

Related Tools

Convert UTF-8 to Data URI →

Binary to UTF-8 Decoder →

Convert Arbitrary Base to UTF-8 →

Base64 to UTF-8 Decoder →

Convert Bytes to UTF-8 →

Code Points to UTF-8 Converter Free →

Convert Decimal to UTF-8 →

Convert Hexadecimal to UTF-8 →

Convert HTML Entities to UTF-8 →

Convert Octal to UTF-8 →

Convert UTF-16 to UTF-8 →

Convert UTF-32 to UTF-8 →