Convert UTF-8 to Bytes

In short

Convert UTF-8 text to bytes in decimal, hex, or binary with prefix and separator options. Bidirectional, emoji-safe. Free, client-side, instant, secure.

  • Runs in your browser
  • Nothing uploaded
  • Free, no sign-up

Convert UTF-8 text to bytes in 3 formats (decimal / hex / binary) with optional prefix (0x / \x / %), 4 separators, and per-character vs per-byte grouping. Bidirectional, strict UTF-8 validation on reverse.


How to Use Convert UTF-8 to Bytes

  1. Paste or type text. The tool runs TextEncoder to produce the raw UTF-8 byte sequence.
  2. Pick a format: hex (most common, compact), decimal (matches byte arrays in some languages), binary (8-bit pattern).
  3. For hex output, pick a prefix style: none (just F0), 0x for language literals, \x for C/Python escapes, % for URL-encoding.
  4. Choose a separator and grouping. Byte groups each byte separately; Char concatenates the bytes belonging to each codepoint into one token (useful for seeing per-character cost).
  5. Swap direction to decode tokens back to text. The format is auto-detected (binary if 8 chars of 0/1, hex if hex letters or 2 chars, else decimal).
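
Steps 1 and 2 can be sketched in JavaScript as follows. This is a minimal sketch, not the tool's actual source: formatBytes and its parameters are hypothetical names chosen for illustration; only TextEncoder comes from the steps above.

```javascript
// Hypothetical helper (not the tool's real API): encode text as UTF-8 and
// render each byte in the chosen format, joined by the chosen separator.
function formatBytes(text, format = "hex", separator = " ") {
  const radixByFormat = { decimal: 10, hex: 16, binary: 2 };
  const widthByFormat = { decimal: 1, hex: 2, binary: 8 };
  if (!(format in radixByFormat)) {
    throw new RangeError(`unknown format: ${format}`);
  }
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, (b) =>
    b.toString(radixByFormat[format]).toUpperCase().padStart(widthByFormat[format], "0")
  ).join(separator);
}

console.log(formatBytes("é", "hex"));     // C3 A9
console.log(formatBytes("é", "decimal")); // 195 169
console.log(formatBytes("é", "binary"));  // 11000011 10101001
```

Hex and binary tokens are zero-padded to a fixed width so every byte is unambiguous; decimal tokens are left unpadded, which is how byte arrays are usually written.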

Frequently Asked Questions

How does UTF-8 encode multi-byte characters?

Codepoints U+0000-U+007F use 1 byte (ASCII), U+0080-U+07FF use 2 bytes, U+0800-U+FFFF use 3 bytes (excluding the surrogate range U+D800-U+DFFF, which UTF-8 does not encode), and U+10000-U+10FFFF use 4 bytes. The first byte’s high bits signal the width: 0xxxxxxx / 110xxxxx / 1110xxxx / 11110xxx; every continuation byte has the form 10xxxxxx.
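
The ranges and bit patterns above translate directly into a small hand-written encoder. The sketch below is for illustration only; encodeCodePoint is a hypothetical name, and in practice TextEncoder does this work.

```javascript
// Illustrative manual UTF-8 encoder for one code point (hypothetical helper).
// Returns an array of byte values.
function encodeCodePoint(cp) {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
    throw new RangeError("not an encodable Unicode code point");
  }
  if (cp <= 0x7f) {
    return [cp]; // 0xxxxxxx
  }
  if (cp <= 0x7ff) {
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)]; // 110xxxxx 10xxxxxx
  }
  if (cp <= 0xffff) {
    // 1110xxxx 10xxxxxx 10xxxxxx
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

console.log(encodeCodePoint(0x41));    // [ 65 ]                 -> 41
console.log(encodeCodePoint(0xe9));    // [ 195, 169 ]           -> C3 A9
console.log(encodeCodePoint(0x1f30d)); // [ 240, 159, 140, 141 ] -> F0 9F 8C 8D
```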

Why is 🌍 four bytes but A is one?

UTF-8 is variable-width. A (U+0041) fits in 7 bits → 1 byte. 🌍 (U+1F30D) needs 17 bits → 4 bytes. The tool’s stats show how many characters fall into each byte width.
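
A per-width distribution like the one the stats describe could be computed along these lines. This is a sketch under assumptions: widthDistribution is a hypothetical name, and the tool's own stats may be calculated differently.

```javascript
// Hypothetical helper: count how many characters (code points) encode to
// 1, 2, 3, or 4 UTF-8 bytes.
function widthDistribution(text) {
  const encoder = new TextEncoder();
  const counts = { 1: 0, 2: 0, 3: 0, 4: 0 };
  // for...of walks the string by code point, so an emoji stored as a
  // surrogate pair is counted once rather than twice.
  for (const ch of text) {
    counts[encoder.encode(ch).length] += 1;
  }
  return counts;
}

// "Aé€🌍" has one character of each width (1, 2, 3, and 4 bytes).
console.log(widthDistribution("Aé€🌍"));
```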

Which prefix should I use?

0x matches most language hex literals (JS, Python, C, Go). \x matches C/C++/Python/Rust string-byte escapes. % matches URL percent-encoding. None = plain space-separated bytes for network analyzers.
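
The three prefix styles can be illustrated with one small function. This is only a sketch; hexTokens is a hypothetical name, and the separators used in the calls are choices made for this example.

```javascript
// Hypothetical helper: one prefixed, upper-case hex token per UTF-8 byte.
function hexTokens(text, prefix = "") {
  return Array.from(new TextEncoder().encode(text), (b) =>
    prefix + b.toString(16).toUpperCase().padStart(2, "0")
  );
}

const sample = "\u00E9\u{1F30D}"; // "é🌍"
console.log(hexTokens(sample, "0x").join(", ")); // 0xC3, 0xA9, 0xF0, 0x9F, 0x8C, 0x8D
console.log(hexTokens(sample, "\\x").join(""));  // \xC3\xA9\xF0\x9F\x8C\x8D
console.log(hexTokens(sample, "%").join(""));    // %C3%A9%F0%9F%8C%8D

// For characters that must be escaped in a URL, the % form matches
// the built-in encodeURIComponent.
console.log(hexTokens(sample, "%").join("") === encodeURIComponent(sample)); // true
```

The comparison with encodeURIComponent holds only for characters it escapes; it leaves unreserved ASCII such as A untouched, whereas this sketch would emit %41.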

What does “Group by char” do?

Default Byte mode produces one token per UTF-8 byte. Char mode concatenates the bytes belonging to each codepoint into a single token, so the output naturally shows per-character cost: A → 41; é → C3A9; 🌍 → F09F8C8D.
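
Char grouping might be implemented roughly like this. It is a sketch, not the tool's code, and groupByChar is a hypothetical name.

```javascript
// Hypothetical helper: one hex token per character (code point), formed by
// concatenating that character's UTF-8 bytes.
function groupByChar(text, separator = " ") {
  const encoder = new TextEncoder();
  const tokens = [];
  for (const ch of text) { // iterates by code point, not UTF-16 code unit
    const hex = Array.from(encoder.encode(ch), (b) =>
      b.toString(16).toUpperCase().padStart(2, "0")
    ).join("");
    tokens.push(hex);
  }
  return tokens.join(separator);
}

console.log(groupByChar("Aé🌍")); // 41 C3A9 F09F8C8D
```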

How does decoding work?

The decoder auto-detects the format from the first token (binary if 8 chars of 0/1, hex if hex letters or 2 chars, else decimal), strips any prefix, parses each token as a byte value 0-255, and runs TextDecoder('utf-8', {fatal: true}), so invalid UTF-8 throws an error rather than silently substituting U+FFFD.
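
The decoding steps above can be sketched as follows. Several details are assumptions made for this example rather than documented behaviour: the name decodeTokens, splitting on whitespace, commas, or semicolons, and stripping the prefix before detecting the format.

```javascript
// Hypothetical decoder sketch: tokens -> bytes -> strictly validated UTF-8 text.
function decodeTokens(input) {
  // Assumption: tokens are separated by whitespace, commas, or semicolons.
  const tokens = input
    .trim()
    .split(/[\s,;]+/)
    .map((t) => t.replace(/^(0x|\\x|%)/i, "")); // strip 0x, \x, or % prefix

  // Detect the format from the first token, following the stated heuristic.
  const first = tokens[0];
  let radix = 10;
  if (/^[01]{8}$/.test(first)) radix = 2;
  else if (/[a-f]/i.test(first) || first.length === 2) radix = 16;

  const shape =
    radix === 2 ? /^[01]{1,8}$/ : radix === 16 ? /^[0-9a-f]{1,2}$/i : /^\d{1,3}$/;

  const bytes = tokens.map((t) => {
    const value = shape.test(t) ? parseInt(t, radix) : NaN;
    if (Number.isNaN(value) || value > 255) {
      throw new RangeError(`not a byte value: "${t}"`);
    }
    return value;
  });

  // fatal: true makes decode() throw on invalid UTF-8 instead of
  // substituting U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array(bytes));
}

console.log(decodeTokens("F0 9F 8C 8D"));       // 🌍
console.log(decodeTokens("0xC3, 0xA9"));        // é
console.log(decodeTokens("11000011 10101001")); // é
console.log(decodeTokens("240 159 140 141"));   // 🌍
```

One consequence of the heuristic as stated: a two-character first token such as 65 is read as hex (the letter e), not as decimal 65 (the letter A).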

Does ASCII get 1 byte per character?

Yes. UTF-8 is byte-for-byte identical to ASCII for codepoints U+0000-U+007F. This is by design (Ken Thompson and Rob Pike’s original UTF-8 design) and is a major reason UTF-8 became the web’s standard encoding.

Can I use this for file sizes?

Yes. The byte count is exactly the storage size of the text in a plain UTF-8 file. Actual files may add a 3-byte BOM (EF BB BF), and formats other than plain text carry their own headers or metadata; this tool measures only the text content.
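
The difference between character count and stored size is easy to check in code. The sketch below assumes a plain text file with an optional BOM; utf8FileSize is a hypothetical name.

```javascript
// Hypothetical helper: size in bytes of text saved as plain UTF-8,
// optionally with a leading BOM (EF BB BF, 3 bytes).
function utf8FileSize(text, withBom = false) {
  const contentBytes = new TextEncoder().encode(text).length;
  return contentBytes + (withBom ? 3 : 0);
}

const text = "A\u00E9\u{1F30D}"; // "Aé🌍"
console.log(text.length);              // 4  (UTF-16 code units)
console.log([...text].length);         // 3  (code points)
console.log(utf8FileSize(text));       // 7  (1 + 2 + 4 bytes)
console.log(utf8FileSize(text, true)); // 10 (with BOM)
```

Note that a string's length property counts UTF-16 code units, so it matches the UTF-8 byte count only for pure ASCII text.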

Is text uploaded?

No. TextEncoder / TextDecoder run in the browser.

What’s the input cap?

200,000 characters. The cap keeps the UI responsive.

How does this compare to the UTF-8 to Binary tool?

The Binary tool is dedicated to 8-bit binary output with UTF-8 prefix-bit highlighting in its per-character grid. This Bytes tool exposes all three formats (decimal/hex/binary) plus prefix options for output-format flexibility.

Related Tools

Convert Bytes to UTF-8

Decode decimal/hex/binary byte values to UTF-8 text - emoji, CJK,…

Binary to UTF-8 Decoder

Binary to UTF-8 Text Decoder handles emoji, CJK, accents, strips BOM, counts replacement chars.…

Convert Arbitrary Base to UTF-8

Decode numeric tokens in any base (2-36) as UTF-8 bytes - multi-byte emoji and…

Base64 to UTF-8 Decoder

Decode Base64 to UTF-8 text - handles emoji, CJK, BOM-stripping, URL-safe variants. Free, client-side,…

Code Points to UTF-8 Converter Free

Free online Unicode code points to UTF-8 converter. Shows actual UTF-8 byte sequences per…

Convert Data URI to UTF-8

Online Data URI to UTF-8 decoder with byte-breakdown panel for emoji and CJK. Client-side,…

Convert Decimal to UTF-8

Online decimal to UTF-8 text decoder. Byte-mode (raw UTF-8 bytes) and codepoint-mode. Client-side, instant,…

Convert Hexadecimal to UTF-8

Decode hex to UTF-8 text with byte-structural breakdown. Handles ASCII, Latin, CJK, emoji. Batch…

Convert HTML Entities to UTF-8

Decode HTML entities to UTF-8 with per-character byte breakdown. Named, decimal, hex. Free, offline,…

Convert Octal to UTF-8

Decode octal byte sequences to UTF-8 text, encode UTF-8 to octal. C-escape support, multi-byte.…

Convert UTF-16 to UTF-8

Convert UTF-16 code units to UTF-8 text and bytes. 3 formats, BE/LE, BOM, surrogate…

Convert UTF-32 to UTF-8

Convert UTF-32 code points to UTF-8 text and bytes. 3 formats, BE/LE, BOM, strict…
