Convert UTF-8 to Hex
Convert UTF-8 text to hexadecimal (5 formats, case, grouping). Bidirectional. Emoji-safe. Free, client-side, instant, secure.
Convert UTF-8 text to hexadecimal bytes in 5 formats - continuous, space-separated, colon-separated, 0x-prefixed, and C-style array. Bidirectional, multi-byte safe. Note: UTF-8 has no endianness, so there is no byte-order option to worry about.
How to Use Convert UTF-8 to Hex
- Paste UTF-8 text - ASCII, accented, CJK, emoji.
- Pick a format: space is human-readable; continuous is compact; colon matches MAC-address style; 0x-prefixed matches most language hex literals; C-style array is ready to paste into source code.
- Choose uppercase or lowercase for A-F.
- Optionally group bytes: 2 joins bytes into 4-digit words, 4 into 8-digit dwords, etc.
- Swap to decode: the parser strips braces, commas, 0x prefixes, colons, and whitespace; what's left must be hex digits. Bytes go through fatal-mode TextDecoder so invalid UTF-8 throws.
Frequently Asked Questions
How does UTF-8 to hex work?
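Each UTF-8 byte (0-255) becomes a 2-character hex token (00-FF). ASCII characters take 1 byte → 2 hex chars; most accented Latin letters take 2 bytes → 4 hex chars; most CJK characters take 3 bytes → 6 hex chars; most emoji take 4 bytes → 8 hex chars. Emoji built from several code points (skin-tone modifiers, ZWJ sequences, flags) take more, because each code point is encoded separately.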
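The byte counts above can be checked with a few lines of TypeScript. This is a minimal sketch, not the tool's actual source; the helper name utf8ToHexTokens is hypothetical, and it assumes a runtime that provides the standard TextEncoder (browsers, Node.js).

```typescript
// Hypothetical helper: encode text as UTF-8, then render each byte
// as a 2-digit uppercase hex token.
function utf8ToHexTokens(text: string): string[] {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0"));
}

// One sample per length class; escapes avoid any ambiguity about
// which code point is meant.
const samples: [string, string][] = [
  ["ASCII 'A'", "A"],
  ["Latin e-acute (U+00E9)", "\u00E9"],
  ["CJK (U+4E2D)", "\u4E2D"],
  ["Emoji (U+1F600)", "\u{1F600}"],
];

for (const [label, ch] of samples) {
  const tokens = utf8ToHexTokens(ch);
  console.log(`${label}: ${tokens.join(" ")} (${tokens.length} byte(s))`);
}
```

The four samples come out as 41, C3 A9, E4 B8 AD, and F0 9F 98 80 respectively.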
Why has the “Little-endian” option been removed?
Because it was incorrect. UTF-8 has no endianness: it is defined as a sequence of single bytes in a fixed order (see RFC 3629). Endianness only matters when a code unit spans more than one byte, as in UTF-16 (2-byte units) or UTF-32 (4-byte units). UTF-8’s code unit is a single byte, so there is nothing to reorder.
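To illustrate the difference, the sketch below serialises the same character as UTF-16 in both byte orders and as UTF-8. The helpers utf16Bytes and toHexString are hypothetical names introduced for this example only; nothing here is taken from the tool's code.

```typescript
// Hypothetical helper: write each 16-bit UTF-16 code unit as two bytes,
// in either big-endian or little-endian order.
function utf16Bytes(text: string, littleEndian: boolean): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one 16-bit code unit
    const hi = unit >> 8;
    const lo = unit & 0xff;
    if (littleEndian) {
      out.push(lo, hi);
    } else {
      out.push(hi, lo);
    }
  }
  return out;
}

// Hypothetical helper: space-separated uppercase hex.
function toHexString(bytes: ArrayLike<number>): string {
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

const eAcute = "\u00E9";
console.log("UTF-16BE:", toHexString(utf16Bytes(eAcute, false))); // 00 E9
console.log("UTF-16LE:", toHexString(utf16Bytes(eAcute, true)));  // E9 00
console.log("UTF-8:   ", toHexString(new TextEncoder().encode(eAcute))); // C3 A9
```

UTF-16 yields two valid serialisations of the same code unit, while UTF-8 yields exactly one byte sequence.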
What’s the difference between the 5 output formats?
Continuous: no separator (48656C6C6F) – compact for embedding. Space: human-readable (48 65 6C 6C 6F). Colon: MAC-address style (48:65:6C:6C:6F). 0x-prefixed: language-literal style (0x48 0x65 0x6C 0x6C 0x6F). C-array: ready to paste ({0x48, 0x65, 0x6C, 0x6C, 0x6F}).
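The five formats differ only in how the same byte tokens are joined. A rough sketch, assuming a hypothetical formatHex function and format names chosen for this example (the tool's internal names are not documented here):

```typescript
// Format names are illustrative assumptions, not the tool's identifiers.
type HexFormat = "continuous" | "space" | "colon" | "0x" | "c-array";

function formatHex(text: string, format: HexFormat, uppercase = true): string {
  const bytes = new TextEncoder().encode(text);
  // Case applies to the digits A-F only; the "0x" prefix stays lowercase.
  const tokens = Array.from(bytes, (b) => {
    const h = b.toString(16).padStart(2, "0");
    return uppercase ? h.toUpperCase() : h;
  });
  const prefixed = tokens.map((t) => "0x" + t);

  switch (format) {
    case "continuous":
      return tokens.join("");
    case "space":
      return tokens.join(" ");
    case "colon":
      return tokens.join(":");
    case "0x":
      return prefixed.join(" ");
    default:
      // "c-array"
      return "{" + prefixed.join(", ") + "}";
  }
}

const formats: HexFormat[] = ["continuous", "space", "colon", "0x", "c-array"];
for (const f of formats) {
  console.log(f.padEnd(10), formatHex("Hello", f));
}
```

Run against "Hello", this reproduces the five sample strings listed above.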
How does the reverse decoder handle different formats?
It strips the formatting it recognises (braces, commas, 0x prefixes, colons, spaces, newlines). What remains must be an even number of hex digits, which it groups into byte pairs and feeds to a fatal-mode TextDecoder.
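One way such a decoder could look is sketched below. The function name hexToUtf8, the error messages, and the exact stripping order are assumptions made for illustration; only the overall strip, validate, decode flow comes from the description above.

```typescript
// Hypothetical decoder: strip known formatting, validate, then decode strictly.
function hexToUtf8(input: string): string {
  const digits = input
    .replace(/0x/gi, "")       // drop 0x / 0X prefixes
    .replace(/[{}\s,:]/g, ""); // drop braces, whitespace, commas, colons

  if (!/^[0-9a-fA-F]*$/.test(digits)) {
    throw new Error("Non-hex characters remain after stripping");
  }
  if (digits.length % 2 !== 0) {
    throw new Error("Odd number of hex digits");
  }

  // Two hex digits per byte.
  const bytes = new Uint8Array(digits.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(digits.slice(i * 2, i * 2 + 2), 16);
  }

  // fatal: true makes invalid UTF-8 throw instead of yielding U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(hexToUtf8("{0x48, 0x65, 0x6C, 0x6C, 0x6F}")); // Hello
console.log(hexToUtf8("48:65:6c:6c:6f"));                 // Hello
```

Because stripping happens before validation, every output format of the encoder is accepted as decoder input without a format selector.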
Why does emoji work here when some other tools fail?
This tool uses TextEncoder to extract the UTF-8 byte sequence first. Simple implementations that loop over charCodeAt see UTF-16 code units (with surrogate pairs for emoji) instead of UTF-8 bytes, producing wrong output.
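The contrast is easy to reproduce. In this sketch, naiveHex stands in for the charCodeAt approach and utf8Hex for the TextEncoder approach; both names are hypothetical and exist only for this comparison.

```typescript
// Naive approach: hex of each UTF-16 code unit. For characters outside the
// Basic Multilingual Plane this yields surrogate halves, not UTF-8 bytes.
function naiveHex(text: string): string {
  const parts: string[] = [];
  for (let i = 0; i < text.length; i++) {
    parts.push(text.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0"));
  }
  return parts.join(" ");
}

// TextEncoder approach: hex of the actual UTF-8 byte sequence.
function utf8Hex(text: string): string {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

const grin = "\u{1F600}"; // grinning face emoji
console.log("charCodeAt: ", naiveHex(grin)); // D83D DE00 (surrogate pair)
console.log("TextEncoder:", utf8Hex(grin));  // F0 9F 98 80 (UTF-8 bytes)
```

For ASCII input the two approaches produce the same byte values, which is why the bug often goes unnoticed until non-ASCII text appears.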
What does “group by N” do?
Joins consecutive bytes into N-byte clusters before applying the separator. For example, “Hello” space-separated with group-by-2: 4865 6C6C 6F (4 chars + 4 chars + 2 chars). Useful for visualising 16-bit words or 32-bit dwords in protocol dumps.
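Grouping can be sketched as a chunking step between tokenising and joining. The groupHex function below is a hypothetical illustration with an assumed signature, not the tool's implementation.

```typescript
// Hypothetical helper: join every `groupSize` byte tokens into one cluster,
// then join the clusters with the separator.
function groupHex(text: string, groupSize: number, separator = " "): string {
  if (!Number.isInteger(groupSize) || groupSize < 1) {
    throw new RangeError("groupSize must be a positive integer");
  }
  const tokens = Array.from(new TextEncoder().encode(text), (b) =>
    b.toString(16).toUpperCase().padStart(2, "0")
  );

  const groups: string[] = [];
  for (let i = 0; i < tokens.length; i += groupSize) {
    // The last group may be shorter when the byte count is not a multiple of N.
    groups.push(tokens.slice(i, i + groupSize).join(""));
  }
  return groups.join(separator);
}

console.log(groupHex("Hello", 1)); // 48 65 6C 6C 6F
console.log(groupHex("Hello", 2)); // 4865 6C6C 6F
console.log(groupHex("Hello", 4)); // 48656C6C 6F
```

Grouping concatenates bytes in stream order and never swaps them, consistent with UTF-8 having no byte order.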
Are invalid UTF-8 bytes silently fixed?
No. The decoder uses fatal mode – invalid sequences (lone continuations, truncated leaders, encoded surrogates, overlong forms) throw an explicit error rather than producing U+FFFD replacement characters.
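The effect of fatal mode can be seen by decoding one example of each invalid category twice, once strictly and once leniently. This is an illustrative sketch: the helper names and the sample byte sequences are chosen for this example, and it assumes a runtime whose TextDecoder supports the fatal option.

```typescript
// Returns true when strict (fatal) decoding rejects the bytes.
function throwsOnDecode(bytes: number[]): boolean {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array(bytes));
    return false;
  } catch {
    return true;
  }
}

// Default (non-fatal) decoding substitutes U+FFFD for invalid input.
function decodeLenient(bytes: number[]): string {
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

const invalidSamples: [string, number[]][] = [
  ["lone continuation", [0x80]],
  ["truncated leader", [0xe4, 0xb8]],          // 3-byte sequence missing its last byte
  ["encoded surrogate", [0xed, 0xa0, 0x80]],   // would decode to U+D800
  ["overlong form", [0xc0, 0xaf]],             // 2-byte encoding of '/'
];

for (const [label, bytes] of invalidSamples) {
  const strict = throwsOnDecode(bytes) ? "throws" : "accepted";
  const replaced = decodeLenient(bytes).includes("\uFFFD");
  console.log(`${label}: fatal mode ${strict}; lenient mode inserts U+FFFD: ${replaced}`);
}
```

Failing loudly is the safer choice for a converter, since a silently inserted U+FFFD would hide the fact that the pasted hex was damaged.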
Is text uploaded anywhere?
No. TextEncoder / TextDecoder run in the browser.
What’s the input cap?
200,000 characters. Continuous hex output is exactly 2× the UTF-8 byte count; separators, 0x prefixes, and array punctuation add to that.
How does this compare to the UTF-8 to Bytes tool?
The Bytes tool exposes hex, decimal, and binary output plus optional prefixes. This Hex tool is hex-specific and adds 5 output formats (continuous / space / colon / 0x / C-array) for direct paste into different contexts.