Convert UTF-8 to Data URI

Convert UTF-8 text to data: URI (percent or base64), 9 MIME types + custom, charset toggle. Bidirectional. Free, client-side, instant, secure.

Convert UTF-8 text to a data: URI using percent-encoded or base64 body, with 9 common MIME types (plus custom) and optional charset declaration. Swap to decode any data URI shape back to text. Multi-byte characters and emoji round-trip correctly.

How to Use Convert UTF-8 to Data URI

  1. Paste UTF-8 text - plain text, HTML, CSS, JSON, SVG, anything.
  2. Pick a MIME type from the 9 presets or choose Custom to enter your own (e.g. text/yaml).
  3. Pick encoding: Percent-encoded is human-readable and compact for mostly-ASCII text; Base64 is uniform and safe for any binary-ish content but adds ~33% size overhead.
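  4. Toggle ;charset=utf-8 - optional, but worth keeping on whenever the text contains non-ASCII characters. RFC 2397 falls back to text/plain;charset=US-ASCII when the media type is omitted, and consumers differ in how they treat a text type with no charset, so an explicit declaration removes the ambiguity.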
  5. Swap to decode: paste any data URI shape (percent or base64, with or without charset) and recover the text.
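
The encoding steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function name toDataUri and its options object are hypothetical, and only standard browser/Node globals (TextEncoder, btoa, encodeURIComponent) are used.

```javascript
// Hypothetical helper: the name and option shape are illustrative only.
function toDataUri(text, { mime = "text/plain", base64 = false, charset = true } = {}) {
  const header = "data:" + mime + (charset ? ";charset=utf-8" : "");
  if (!base64) {
    // encodeURIComponent converts the string to UTF-8 and escapes every byte
    // outside A-Z a-z 0-9 - _ . ! ~ * ' ( )
    return header + "," + encodeURIComponent(text);
  }
  // Base64 path: string -> UTF-8 bytes -> one Latin-1 character per byte -> btoa
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return header + ";base64," + btoa(binary);
}

console.log(toDataUri("Hello, world!"));
// data:text/plain;charset=utf-8,Hello%2C%20world!
console.log(toDataUri("hi", { base64: true, charset: false }));
// data:text/plain;base64,aGk=
```

Building the byte string in a loop, rather than spreading the whole array into String.fromCharCode, avoids argument-count limits on large inputs.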

Frequently Asked Questions

What’s a data URI?

A URI scheme that embeds the content directly in the URL. Format: data:<mime>[;charset=...][;base64],<body>. RFC 2397 defines it. Used to embed small images, fonts, SVG, or text inline without a separate HTTP request – useful for self-contained HTML/email/standalone pages.

Percent vs base64 – which to choose?

Percent-encoding leaves unreserved ASCII (letters, digits, and a few punctuation marks) unchanged and escapes every other byte as %XX – readable, typically ~1.3-3× the original size for mixed content. Base64 maps every 3 bytes to 4 characters from a fixed 64-character alphabet, so the output is always about 4/3 of the input size (~33% overhead). Base64 is better for binary or heavily non-ASCII content, because each non-ASCII byte costs 3 characters in percent form (%XX); percent is better for mostly-ASCII text.
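
To make the trade-off concrete, here is a rough sketch that measures the body length each encoding would produce. The helper name encodedSizes is hypothetical; the base64 figure is computed from the 3-bytes-to-4-characters rule rather than by actually encoding.

```javascript
// Hypothetical helper comparing body sizes for the two encodings.
function encodedSizes(text) {
  const byteLength = new TextEncoder().encode(text).length;
  return {
    bytes: byteLength,
    percent: encodeURIComponent(text).length,
    // Base64 emits 4 characters per 3-byte group, padded up to a whole group.
    base64: 4 * Math.ceil(byteLength / 3),
  };
}

console.log(encodedSizes("hello world")); // { bytes: 11, percent: 13, base64: 16 }
console.log(encodedSizes("こんにちは"));   // { bytes: 15, percent: 45, base64: 20 }
```

The ASCII sample is shorter in percent form, while the all-non-ASCII sample triples in size and is clearly better served by base64.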

Why does emoji work here?

Both encoders work from the UTF-8 byte sequence of the text. In percent mode, encodeURIComponent produces %F0%9F%8C%8D for 🌍 (4 bytes); it performs the UTF-8 conversion itself and only throws on lone surrogates. In base64 mode, the bytes from TextEncoder are bridged to a Latin-1 string via String.fromCharCode and then packed through btoa. A naive implementation that passes the string straight to btoa breaks, because btoa throws on any character above U+00FF, which includes every emoji.
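
A short demonstration of the difference, assuming a runtime that provides TextEncoder and btoa as globals (current browsers and recent Node versions do). The variable names are illustrative.

```javascript
const text = "🌍";

// One code point, four UTF-8 bytes.
const bytes = new TextEncoder().encode(text);
const hex = Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0"));
console.log(hex.join(" ")); // F0 9F 8C 8D

// encodeURIComponent does the UTF-8 conversion on its own.
console.log(encodeURIComponent(text)); // %F0%9F%8C%8D

// Passing the string straight to btoa throws: it only accepts characters up to U+00FF.
let naiveFailed = false;
try {
  btoa(text);
} catch (err) {
  naiveFailed = true;
}
console.log(naiveFailed); // true

// Bridging: one Latin-1 character per byte, then btoa.
// (Spreading is fine for short inputs; use a loop for large ones.)
const bridged = btoa(String.fromCharCode(...bytes));
console.log(bridged); // 8J+MjQ==
```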

Do I need the charset?

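Only for non-ASCII text. Pure-ASCII content reads the same with or without it, but RFC 2397 falls back to US-ASCII when no media type is given, and consumers differ in how they decode a text type with no declared charset, so multi-byte characters and emoji are safest with ;charset=utf-8 present. Toggle off to shorten the URI by 14 characters.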

What MIME types does it support?

9 presets cover the common cases: text/plain, text/html, text/css, application/json, image/svg+xml, application/javascript, text/csv, application/xml, text/markdown. Custom mode accepts any MIME string for niche formats like text/yaml or vendor-specific types.

How does the decoder handle the format?

It splits on the first comma to isolate header and body, parses semicolon-separated header parts for MIME / base64 flag / charset, then runs atob + fatal-mode TextDecoder (base64) or decodeURIComponent (percent). Throws on missing data: prefix, missing comma, invalid base64, or non-UTF-8 bytes.
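
The decoding steps described above might look roughly like this. It is a sketch under stated assumptions, not the tool's source: the function name fromDataUri, the returned object shape, and the error messages are all hypothetical, and the text/plain fallback for an empty MIME type is taken from RFC 2397.

```javascript
// Hypothetical decoder following the described steps.
function fromDataUri(uri) {
  if (!uri.startsWith("data:")) throw new Error("missing data: prefix");
  const comma = uri.indexOf(",");
  if (comma === -1) throw new Error("missing comma");

  // Header: first part is the MIME type, the rest are parameters or the base64 flag.
  const [mimePart, ...params] = uri.slice(5, comma).split(";");
  const body = uri.slice(comma + 1);
  const isBase64 = params.some((p) => p.trim().toLowerCase() === "base64");
  const charsetParam = params.find((p) => p.trim().toLowerCase().startsWith("charset="));

  let text;
  if (isBase64) {
    const binary = atob(body); // throws on invalid base64
    const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
    // fatal: true makes malformed UTF-8 throw instead of yielding U+FFFD
    text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } else {
    text = decodeURIComponent(body); // throws URIError on malformed escapes
  }
  return {
    mime: mimePart.trim() || "text/plain",
    charset: charsetParam ? charsetParam.trim().slice("charset=".length) : null,
    text,
  };
}

console.log(fromDataUri("data:text/plain;charset=utf-8;base64,8J+MjQ==").text); // 🌍
console.log(fromDataUri("data:,Hello%2C%20world!").text); // Hello, world!
```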

Is the URI size limited?

Older browsers had tight limits: Internet Explorer capped ordinary URLs at about 2 KB (2,083 characters), and IE8 limited data URIs to 32 KB. Current Chrome, Firefox, and Safari accept multi-megabyte data URIs in src / href attributes. Email clients are stricter. Keep it under ~16 KB for broad compatibility, under ~2 MB for browser use.
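
As a rough guard, the two thresholds above can be turned into a quick check. The constant names, the function name classifyUriSize, and the returned labels are hypothetical, and the thresholds are rules of thumb rather than hard limits.

```javascript
// Rule-of-thumb thresholds from the guidance above (assumed, not enforced by any spec).
const BROAD_COMPAT_LIMIT = 16 * 1024;   // ~16 KB
const BROWSER_LIMIT = 2 * 1024 * 1024;  // ~2 MB

function classifyUriSize(uri) {
  // An encoded data URI is pure ASCII, so its character count equals its byte count.
  const size = uri.length;
  if (size < BROAD_COMPAT_LIMIT) return "broadly compatible";
  if (size < BROWSER_LIMIT) return "browser only";
  return "too large";
}

console.log(classifyUriSize("data:text/plain,hi")); // broadly compatible
```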

Is the data URI cacheable?

Not independently – a data URI is part of its parent document, so it is cached only along with that document and is transferred again every time the parent is re-downloaded. For frequently-used assets, an external file with HTTP caching beats inlining.

Is base64 secure?

Not at all. Base64 is encoding, not encryption – anyone can decode it instantly. Don’t put secrets in a data URI; treat the body as fully public to anyone reading the URL.

Can I use this for images?

Only text-based images like SVG. For raster images (PNG/JPEG/WebP), you need a binary-to-base64 tool that reads file bytes – this tool starts from UTF-8 text.