Convert UTF-8 to Decimal
Convert UTF-8 text to decimal bytes or codepoints with padding and separators. Bidirectional. Free, client-side, instant, secure.
- Runs in your browser
- Nothing uploaded
- Free, no sign-up
Convert UTF-8 text to decimal numbers - either one decimal per UTF-8 byte (0-255) or one decimal per Unicode codepoint (0-1,114,111). Bidirectional with auto-detection: paste decimals back and the decoder picks byte vs codepoint mode by the maximum value.
How to Use Convert UTF-8 to Decimal
- Paste UTF-8 text.
- Pick Byte mode (one decimal per UTF-8 byte, range 0-255) or Codepoint mode (one decimal per character, range 0-1114111).
- Pick separator and padding. Padding=3 keeps byte columns aligned (065 105); padding=7 covers the maximum codepoint U+10FFFF.
- The grid shows both byte and codepoint decimals side-by-side so the difference is visible for multi-byte chars.
- Swap to decode: the parser auto-detects byte vs codepoint mode by whether any token exceeds 255.
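The encoding steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function name `toDecimal` and its option names (`mode`, `separator`, `padding`) are assumptions made for the example.

```javascript
// Hypothetical helper: the name and option shape are assumptions, not the tool's API.
function toDecimal(text, { mode = "byte", separator = " ", padding = 0 } = {}) {
  const values =
    mode === "byte"
      ? Array.from(new TextEncoder().encode(text)) // one value per UTF-8 byte (0-255)
      : Array.from(text, (ch) => ch.codePointAt(0)); // one value per code point (0-1114111)
  // padStart adds leading zeros up to the minimum width; padding = 0 leaves natural width.
  return values.map((v) => String(v).padStart(padding, "0")).join(separator);
}

console.log(toDecimal("Ai", { padding: 3 }));              // 065 105
console.log(toDecimal("\u{1F30D}"));                       // 240 159 140 141
console.log(toDecimal("\u{1F30D}", { mode: "codepoint" })); // 127757
```

Iterating the string with `Array.from` walks it by code point rather than by UTF-16 code unit, which is why the emoji yields a single value in codepoint mode.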
Frequently Asked Questions
What’s the difference between byte and codepoint mode?
A codepoint is the abstract Unicode identifier; a UTF-8 byte is one of the 1-4 bytes that serialize that codepoint. 🌍 is ONE codepoint (127,757 decimal = U+1F30D) but FOUR UTF-8 bytes (240 159 140 141 decimal). Byte mode outputs the bytes; codepoint mode outputs the codepoints.
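To see the two views side by side, a per-character breakdown can be sketched as follows. The function name `breakdown` and the shape of the returned rows are assumptions for illustration only; the tool's own grid may be built differently.

```javascript
// Hypothetical helper: returns one row per character with both representations.
function breakdown(text) {
  const encoder = new TextEncoder();
  return Array.from(text, (ch) => ({
    char: ch,
    codepoint: ch.codePointAt(0),           // single decimal per character
    bytes: Array.from(encoder.encode(ch)),  // 1-4 decimals per character
  }));
}

for (const row of breakdown("A\u{1F30D}")) {
  console.log(row.char, row.codepoint, row.bytes.join(" "));
}
// A 65 65
// 🌍 127757 240 159 140 141
```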
Why is 🌍 four numbers in byte mode but one in codepoint mode?
UTF-8 encodes each non-ASCII codepoint as two to four bytes. 🌍 (codepoint 127,757) is too large for one byte (max 255), so UTF-8 spreads its bits across four bytes: one lead byte followed by three continuation bytes. Codepoint mode skips this serialization and reports the codepoint directly.
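The bit-spreading can be worked through by hand. The sketch below follows the standard UTF-8 layout and is only an illustration; in the browser, `TextEncoder` does this for you. The function name `utf8Bytes` is hypothetical, and the code assumes its input is already a valid codepoint (no surrogates, at most 1,114,111).

```javascript
// Hypothetical helper: encodes one valid code point into its UTF-8 byte values.
function utf8Bytes(cp) {
  if (cp < 0x80) return [cp]; // 1 byte: 0xxxxxxx
  if (cp < 0x800) {
    // 2 bytes: 110xxxxx 10xxxxxx
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
  }
  if (cp < 0x10000) {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

console.log(utf8Bytes(127757).join(" ")); // 240 159 140 141
```

The first byte (240) is the lead byte that announces a four-byte sequence; the remaining three (159 140 141) each carry six payload bits.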
How does auto-detection work on reverse?
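If ANY token exceeds 255, the decoder treats all tokens as codepoints (since byte values can’t exceed 255). Otherwise it treats them as bytes and runs them through fatal-mode UTF-8 decode. This is mostly reliable, but inputs where every token is 255 or below are ambiguous. For pure ASCII the ambiguity is harmless, since byte and codepoint modes produce identical output. A codepoint list whose values all fall in 128-255 (for example 233 for é) is read as bytes, and a lone 233 is not valid UTF-8, so the decode fails.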
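A minimal sketch of this detection rule, assuming tokens are split on any run of non-digit characters (the tool's real tokenizer may differ). The name `fromDecimal` and the returned object shape are hypothetical.

```javascript
// Hypothetical helper: decodes a decimal list, choosing the mode by the largest token.
function fromDecimal(input) {
  // Assumption: any non-digit run acts as a separator; leading zeros are ignored.
  const tokens = input.split(/[^0-9]+/).filter(Boolean).map(Number);

  if (tokens.some((n) => n > 255)) {
    // At least one value cannot be a byte, so treat every token as a code point.
    const text = tokens.map((cp) => String.fromCodePoint(cp)).join("");
    return { mode: "codepoint", text };
  }

  // All values fit in a byte: decode as UTF-8 and throw on malformed sequences.
  const decoder = new TextDecoder("utf-8", { fatal: true });
  return { mode: "byte", text: decoder.decode(Uint8Array.from(tokens)) };
}

console.log(fromDecimal("240 159 140 141")); // { mode: 'byte', text: '🌍' }
console.log(fromDecimal("65 127757"));       // { mode: 'codepoint', text: 'A🌍' }
```

With `fatal: true`, `TextDecoder` throws a `TypeError` on invalid byte sequences instead of silently inserting U+FFFD, which is why an input such as `233` on its own is rejected in byte mode.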
What if my codepoint value is invalid?
Values above 1,114,111 (U+10FFFF) and surrogates (55296-57343 / U+D800-U+DFFF) throw an error that reports the token position. Surrogates are reserved exclusively for UTF-16 encoding and shouldn’t appear as standalone codepoints.
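The two checks can be sketched as below. The function name `validateCodepoints`, the error type, the message wording, and the 1-based position numbering are all assumptions for illustration; the tool's actual error text is not shown on this page.

```javascript
// Hypothetical helper: rejects out-of-range values and surrogates, naming the token.
function validateCodepoints(tokens) {
  tokens.forEach((cp, i) => {
    const position = i + 1; // assumption: positions are reported 1-based
    if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff) {
      throw new RangeError(`Token ${position}: ${cp} is outside 0-1114111`);
    }
    if (cp >= 0xd800 && cp <= 0xdfff) {
      throw new RangeError(`Token ${position}: ${cp} is a surrogate (U+D800-U+DFFF)`);
    }
  });
  return tokens;
}

validateCodepoints([65, 127757]); // passes
try {
  validateCodepoints([65, 55296]);
} catch (err) {
  console.log(err.message); // Token 2: 55296 is a surrogate (U+D800-U+DFFF)
}
```

The explicit surrogate check matters because `String.fromCodePoint` accepts surrogate values without complaint, so relying on it alone would let them through.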
What’s padding for?
Pads each decimal with leading zeros to a minimum width. Padding=3 keeps byte tokens aligned (065 105 032 240) for column-oriented displays. Padding=7 covers the maximum 7-digit codepoint (0000065 0127757). None = natural width.
Is byte mode the same as ASCII codes?
For ASCII characters (U+0000-U+007F), yes – byte mode produces the same numbers as classic ASCII code references. For non-ASCII characters, UTF-8 byte values diverge significantly from any single “code” representation.
How do I get classic ASCII-only output?
Byte mode and codepoint mode produce identical output for ASCII text. If you want a guaranteed ASCII-only output that errors on non-ASCII input, this tool doesn’t do that – use the UTF-8 to ASCII tool with the “strip” fold mode for that behaviour.
Is text uploaded?
No. TextEncoder / TextDecoder run in the browser.
What’s the input cap?
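200,000 characters. The decimal output can grow to several times the input size: each byte becomes up to three digits plus a separator (roughly 4× for ASCII text), and a multi-byte character produces two to four such tokens. The cap protects against tab freezes.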
How does this compare to the UTF-8 to Bytes tool?
The Bytes tool exposes hex, decimal, AND binary plus prefix options. This Decimal tool is decimal-specific and adds a codepoint-vs-byte mode toggle with auto-detect on reverse.