Convert Unicode to Bytes

In short

Convert Unicode to UTF-8 bytes in hex, decimal, or binary. Per-byte grid, reverse direction. Free, offline, client-side, instant, secure.

Runs in your browser
Nothing uploaded
Free, no sign-up

Convert Unicode text to raw UTF-8 bytes in three representations (hex, decimal, binary) with optional 0x / \x / 0b prefixes for direct paste into C, Python, or Rust source. Per-byte grid maps each byte back to its source character. Reverse direction auto-detects the format.

🛡 100% Private: no server uploads, ever
⚡ Instant: runs in your browser
💧 No Watermarks: clean output, always
🆓 Free Forever: no accounts, no limits

How to Use Convert Unicode to Bytes

Paste your text. Anything Unicode - ASCII, accented Latin, CJK, emoji, mathematical symbols. The encoder uses TextEncoder.encode() to produce proper UTF-8 bytes (variable length: 1-4 bytes per character).
Pick the byte format. Hex (most common for debugging - 48 65 6C 6C 6F), decimal (good for human readability - 72 101 108 108 111), or binary (educational, raw bit view - 01001000 01100101). Switch formats and the same bytes are presented differently.
Choose a separator. Space (most readable, default), comma (CSV-compatible), newline (one byte per line for tall display), or none (continuous stream for hex pasting into a single value).
Add a prefix per byte. 0x for C/JavaScript/Python hex literals (0x48 0x65). \x for C/Python string literals (\x48\x65 - combine with no separator). 0b for binary literals. None for raw bytes. Useful for pasting straight into source code as an array.
Toggle uppercase hex. Default uppercase (4F) matches Windows hex editors and macOS Hex Fiend. Lowercase (4f) matches Unix tools (hexdump, xxd) and most C/Rust source code conventions. The hex values are mathematically identical.
Read the per-byte grid. One row per UTF-8 byte showing index, the character that byte belongs to (first byte gets the char glyph, continuation bytes get ↳), hex, decimal, and binary. Multi-byte characters appear as 2-4 consecutive rows. Useful for understanding exactly how UTF-8 encodes a specific character.
Swap to decode. ⇄ flips to Bytes → Unicode. Auto-detects the format: hex (contains A-F or tokens are 2 characters long), decimal (digits only), or binary (only 0/1, in multiples of 8). Common prefixes (0x, \x, 0b) are stripped automatically. A strict UTF-8 decoder rejects invalid byte sequences.
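The encoding and formatting steps above can be sketched in a few lines. This is a minimal illustration, not the tool's actual source: the function name formatBytes and its option names are hypothetical, and only TextEncoder.encode() comes from the description above.

```javascript
// Hypothetical helper (names are assumptions, not the tool's real code).
// Which prefix applies to which format, per the FAQ: 0x and \x for hex, 0b for binary.
const PREFIX_FORMATS = { "0x": "hex", "\\x": "hex", "0b": "binary" };

function formatBytes(text, { format = "hex", separator = " ", prefix = "", uppercase = true } = {}) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  // An incompatible prefix (for example \x with decimal) is skipped.
  const usePrefix = PREFIX_FORMATS[prefix] === format ? prefix : "";

  const tokens = Array.from(bytes, (b) => {
    if (format === "decimal") return String(b);
    if (format === "binary") return usePrefix + b.toString(2).padStart(8, "0");
    const hex = b.toString(16).padStart(2, "0");
    return usePrefix + (uppercase ? hex.toUpperCase() : hex);
  });
  return tokens.join(separator);
}

console.log(formatBytes("Hello"));                                    // 48 65 6C 6C 6F
console.log(formatBytes("Hello", { format: "decimal" }));             // 72 101 108 108 111
console.log(formatBytes("He", { format: "binary" }));                 // 01001000 01100101
console.log(formatBytes("He", { prefix: "\\x", separator: "" }));     // \x48\x65
```

Array.from with a mapping function is used instead of bytes.map because mapping a Uint8Array directly would coerce the string tokens back into numbers.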

Frequently Asked Questions

How does UTF-8 byte count vary with content?

UTF-8 is variable-length: ASCII (code points 0-127) = 1 byte each, Latin Extended / Greek / Cyrillic = 2 bytes, most CJK and other BMP characters = 3 bytes, supplementary planes (emoji, rare ideographs) = 4 bytes. So “Hello” is 5 bytes, “Héllo” is 6 (the é is 2 bytes), and “🌍 Hello” is 10 bytes (4 for the emoji + 1 for the space + 5 for the ASCII letters).
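The size classes follow directly from the code point ranges defined by UTF-8. A small sketch, with hypothetical function names, that counts bytes without encoding and can be cross-checked against TextEncoder:

```javascript
// Number of UTF-8 bytes needed for one Unicode code point.
function utf8Length(codePoint) {
  if (codePoint <= 0x7f) return 1;   // ASCII
  if (codePoint <= 0x7ff) return 2;  // Latin Extended, Greek, Cyrillic, ...
  if (codePoint <= 0xffff) return 3; // rest of the BMP, including most CJK
  return 4;                          // supplementary planes (emoji, rare ideographs)
}

// Total UTF-8 size of a string; for...of iterates by code point, not UTF-16 unit.
function utf8ByteCount(text) {
  let total = 0;
  for (const ch of text) total += utf8Length(ch.codePointAt(0));
  return total;
}

console.log(utf8ByteCount("Hello"));       // 5
console.log(utf8ByteCount("H\u00e9llo"));  // 6 (é is 2 bytes)
console.log(utf8ByteCount("\u{1F30D}"));   // 4 (one emoji)
```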

What’s the difference between the three formats?

Same bytes, different presentations. Hex (base 16) is the most compact at a fixed 2 characters per byte – universal in debugging tools and hex editors. Decimal (base 10) is easier for humans to read but takes up to 3 characters per byte. Binary (base 2) shows raw bits – best for teaching how UTF-8 encodes characters (lead byte patterns, continuation bytes), but verbose at 8 characters per byte.
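The lead byte and continuation byte patterns that the binary view makes visible can be read off with bit masks. A sketch with a hypothetical classifyByte function; it looks only at the bit pattern of a single byte and does not validate whole sequences:

```javascript
// Classify one byte by its high bits, as defined by the UTF-8 encoding scheme.
function classifyByte(b) {
  if ((b & 0x80) === 0x00) return "ascii";        // 0xxxxxxx
  if ((b & 0xc0) === 0x80) return "continuation"; // 10xxxxxx
  if ((b & 0xe0) === 0xc0) return "lead-2";       // 110xxxxx
  if ((b & 0xf0) === 0xe0) return "lead-3";       // 1110xxxx
  if ((b & 0xf8) === 0xf0) return "lead-4";       // 11110xxx
  return "invalid";                               // 11111xxx never occurs in UTF-8
}

// Print each byte of "é" and "😀" in binary next to its role.
for (const b of new TextEncoder().encode("\u00e9\u{1F600}")) {
  console.log(b.toString(2).padStart(8, "0"), classifyByte(b));
}
```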

What’s the difference between the 0x and \x prefixes?

Both indicate hex, but in different contexts. 0x is the C/JavaScript/Python/Rust prefix for hex numeric literals: int x = 0x48;. \x is the C/Python prefix for hex escapes inside string literals: "\x48\x65\x6C". Use \x with no separator to get a paste-able string literal: \x48\x65\x6C\x6C\x6F = “Hello” in C/Python.
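To make the two contexts concrete, the sketch below builds both forms from the same bytes: a comma-separated list of numeric literals and a single escaped string literal. The variable names are illustrative only.

```javascript
const bytes = new TextEncoder().encode("Hello");
const hex = Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0"));

// Numeric literals: one 0x-prefixed value per byte, ready for an array initializer.
const arrayLiteral = "{ " + hex.map((h) => "0x" + h).join(", ") + " }";

// String literal: one hex escape per byte, no separator between them.
const stringLiteral = '"' + hex.map((h) => "\\x" + h).join("") + '"';

console.log(arrayLiteral);  // { 0x48, 0x65, 0x6C, 0x6C, 0x6F }
console.log(stringLiteral); // "\x48\x65\x6C\x6C\x6F"
```

One caveat when pasting into Python: hex escapes denote raw bytes only inside a bytes literal (b"..."), so non-ASCII text should be pasted into that form rather than into a plain string.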

How does the reverse direction detect the format?

By scanning the first byte token: binary if it’s 8 characters of 0s and 1s, decimal if it’s all digits and short (≤3), hex if it contains A-F or is exactly 2 characters. Prefixes (0x, \x, 0b) are stripped before detection. Mixed-format input is not supported – each conversion treats the entire input as one format.
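A rough sketch of first-token detection followed by strict decoding. The function names, the tokenizer, and especially the order of the checks are assumptions: the rules above do not say how a two-digit, all-numeric token such as 72 is resolved, and this sketch treats it as hex. The only confirmed piece is the use of TextDecoder, here with the standard fatal option so malformed UTF-8 throws.

```javascript
// Remove a leading 0x or \x, or a 0b that is followed by exactly 8 bits.
// The lookaheads keep a bare hex byte such as "0B" from being mistaken for a prefix.
function stripPrefix(token) {
  return token
    .replace(/^(0x|\\x)(?=[0-9a-f])/i, "")
    .replace(/^0b(?=[01]{8}$)/i, "");
}

// Assumed check order: binary, then hex, then decimal.
function detectFormat(firstToken) {
  const t = stripPrefix(firstToken);
  if (/^[01]{8}$/.test(t)) return "binary";
  if (/[a-f]/i.test(t) || t.length === 2) return "hex";
  if (/^\d{1,3}$/.test(t)) return "decimal";
  throw new Error("Unrecognized byte format: " + firstToken);
}

function decodeBytes(input) {
  // Split on whitespace or commas, and before each \x so "\x48\x65" yields two tokens.
  const tokens = input.trim().split(/[\s,]+|(?=\\x)/).filter(Boolean);
  if (tokens.length === 0) return "";

  const format = detectFormat(tokens[0]);
  const radix = { hex: 16, decimal: 10, binary: 2 }[format];
  const valid = { hex: /^[0-9a-f]{1,2}$/i, decimal: /^\d{1,3}$/, binary: /^[01]{8}$/ }[format];

  const bytes = tokens.map((tok) => {
    const t = stripPrefix(tok);
    const value = valid.test(t) ? parseInt(t, radix) : NaN;
    if (!(value >= 0 && value <= 255)) throw new Error("Invalid " + format + " byte: " + tok);
    return value;
  });

  // fatal: true makes TextDecoder throw a TypeError on malformed UTF-8.
  return new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array(bytes));
}

console.log(decodeBytes("48 65 6C 6C 6F"));    // Hello
console.log(decodeBytes("0x48, 0x65"));        // He
console.log(decodeBytes("01001000 01100101")); // He
console.log(decodeBytes("104 105"));           // hi
```

Because the whole input is parsed with the format of the first token, decodeBytes("72 101") fails in this sketch: 72 is taken as hex and 101 is then rejected as a hex byte.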

What does the “↳” symbol mean in the grid?

It marks continuation bytes of a multi-byte UTF-8 character. The first byte gets the actual character glyph; subsequent bytes (in 2-4 byte sequences) get ↳ to visually show they’re part of the same character. Useful for understanding how an emoji like 😀 maps to 4 specific bytes: the first row shows “😀” and the next three show “↳”.
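The grid rows can be approximated by encoding one character at a time and labeling every byte after the first with ↳. The function name byteGrid and the row field names are hypothetical, and "character" is taken to mean a single code point (a multi-code-point emoji sequence would show several glyph rows).

```javascript
// Build one row per UTF-8 byte: index, owning character (or ↳), hex, decimal, binary.
function byteGrid(text) {
  const encoder = new TextEncoder();
  const rows = [];
  for (const ch of text) {            // iterates by code point, not UTF-16 unit
    const bytes = encoder.encode(ch);
    bytes.forEach((b, i) => {
      rows.push({
        index: rows.length,           // 0-based position in the byte stream
        char: i === 0 ? ch : "↳",     // glyph on the first byte, ↳ on continuations
        hex: b.toString(16).toUpperCase().padStart(2, "0"),
        decimal: b,
        binary: b.toString(2).padStart(8, "0"),
      });
    });
  }
  return rows;
}

console.table(byteGrid("\u{1F600}")); // 4 rows: 😀 F0, ↳ 9F, ↳ 98, ↳ 80
```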

Why is my output sometimes the same with different prefix settings?

Because not all prefixes apply to all formats. 0x only prefixes hex. \x only prefixes hex (it’s an escape syntax). 0b only prefixes binary. If you select \x with decimal format, the prefix is silently skipped because \x is a hex escape and has no decimal form. The dropdown could hide incompatible options, but keeping them available leaves every combination selectable.

What’s the input size limit?

200,000 characters. The per-byte grid caps at 256 displayed rows (since emoji-heavy text expands quickly – 50 emoji is already 200 bytes), but the conversion processes the full input. Copy and Download give you the complete output.

What’s the difference vs the “Convert String to Hex” tool?

That sibling tool focuses on hex output specifically with a per-character grid. This tool offers three byte representations (hex/decimal/binary) and a per-byte grid (one row per byte, not per character). Use the sibling for clean hex output, use this when you need decimal or binary, or when you want to see exactly how bytes line up with their source characters.

Is my text uploaded anywhere?

No. All UTF-8 encoding, byte formatting, and decoding run in your browser using TextEncoder and TextDecoder. Open DevTools → Network and confirm zero requests fire – even when you Convert or Download. Safe for tokens, credentials, or any sensitive text.

Does it work offline?

Yes. The total bundle is about 18 KB. Load once, disconnect, keep using. It is pure JavaScript built on the browser’s standard library, with no remote dependencies. Useful for debugging encoding issues on air-gapped systems or in offline environments.

Keep going

Related Tools

All Unicode tools →

Embed this tool

Add this free tool to your website. Copy and paste the code:

<iframe src="https://alltoolsverse.com/tools/convert-unicode-to-bytes/?embed=1" width="100%" height="760" loading="lazy" style="max-width:900px;border:1px solid #e2e8f0;border-radius:12px" title="Convert Unicode to Bytes"></iframe>
<p>Free tool: <a href="https://alltoolsverse.com/tools/convert-unicode-to-bytes/">Convert Unicode to Bytes</a> by All Tools Verse</p>

Related Tools

Center Unicode Text →

Check Spoofed Unicode Text →

Chunkify Unicode Text →

ASCII to Unicode Converter →

Convert Code Points to Unicode →

Convert Unicode to ASCII →

Convert Unicode to Base64 →

Convert Unicode to Binary →

Convert Unicode to Code Points →

Convert Unicode to Data URL →

Convert Unicode to Decimal →

Convert Unicode to Hex →
