Convert Arbitrary Base to UTF-8
Decode numeric tokens in any base (2-36) as UTF-8 bytes - multi-byte emoji and non-Latin text included. Free, client-side, instant, offline.
Paste numeric byte tokens in any base from 2 to 36 and decode the whole sequence as UTF-8 — so multi-byte characters like 中 and emoji like 🍕 render correctly.
How to Use Convert Arbitrary Base to UTF-8
- Paste the byte tokens into the input. Each token must fit in one byte (0-255 after conversion). Separators can be spaces, commas, or newlines - the tokenizer accepts any mix.
- Set the source base (2 for binary, 8 for octal, 10 for decimal, 16 for hex, up to 36). Values outside 2-36 raise an inline error and the output keeps its last valid value.
- Watch the live preview - the decoder repaints within 150 ms of every keystroke so you can scan the result as you type.
- Multi-byte characters just work: e4 b8 ad in hex decodes to 中, and f0 9f 8d 95 decodes to 🍕. The full byte sequence is handed to TextDecoder('utf-8') in one call, so UTF-8 continuation bytes combine correctly.
- Check the stats line: base, total tokens, bytes decoded, characters produced, tokens skipped as invalid for the base, and U+FFFD replacements (what TextDecoder substitutes for a malformed UTF-8 sequence).
- Copy writes the decoded text to the clipboard. Download .txt saves a timestamped file named utf8-decoded-base<N>-<iso>.txt. Clear wipes the input and output.
- Press Ctrl+Enter (⌘+Enter on Mac) to force a decode and copy the result in a single shortcut - handy after changing the base.
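The steps above can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions, not the tool's actual source: the function name decodeTokens and the shape of the returned stats are hypothetical, and it assumes that tokens above 255 are skipped in the same way as tokens with digits that are invalid for the base.

```javascript
// Hypothetical sketch of the decode pipeline; decodeTokens is not the tool's real API.
function decodeTokens(input, base) {
  if (!Number.isInteger(base) || base < 2 || base > 36) {
    throw new RangeError("base must be an integer from 2 to 36");
  }
  // Digits allowed in this base, e.g. "01" for base 2, "0-9a-f" for base 16.
  const digits = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, base);
  const valid = new RegExp(`^[${digits}]+$`, "i");

  // Split on any run of whitespace or commas and drop empty pieces.
  const tokens = input.split(/[\s,]+/).filter(Boolean);

  const bytes = [];
  let skipped = 0;
  for (const token of tokens) {
    // parseInt alone is lenient ("12z" parses as 12 in base 10), so validate first.
    const value = valid.test(token) ? parseInt(token, base) : NaN;
    if (Number.isNaN(value) || value > 255) {
      skipped++; // assumption: out-of-range tokens are skipped, not clamped
      continue;
    }
    bytes.push(value);
  }

  // Decode the whole sequence in one call so continuation bytes combine.
  const text = new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
  return { text, tokens: tokens.length, bytes: bytes.length, skipped };
}

console.log(decodeTokens("e4 b8 ad", 16).text);      // 中
console.log(decodeTokens("f0,9f\n8d 95", 16).text);  // 🍕
```

Decoding byte by byte instead of in one call would turn each byte of 中 into a separate replacement character, which is why the sketch collects all bytes first.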
Frequently asked questions
Is my data secure when decoding?
Yes. The decode runs entirely in your browser – nothing is uploaded, cached, or tracked. After the page loads you can disconnect the network and the tool keeps working.
Is this decoder free?
Yes, 100% free with no cap on how much you can decode. No sign-up, no premium tier, no watermark.
Does this work offline?
Yes. HTML, CSS, and JavaScript are self-contained. Once the page has loaded, you can turn off Wi-Fi and keep decoding – ideal for air-gapped or locked-down environments.
Which bases are supported?
Every integer base from 2 to 36 inclusive. The most common are 2 (binary), 8 (octal), 10 (decimal), and 16 (hexadecimal). Base 36 uses the digits 0-9 and the letters a-z, the largest digit alphabet supported; in it, every byte fits in at most two characters (255 is written 73).
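To see how one byte looks across bases, the built-in Number.prototype.toString(radix) and parseInt(string, radix) accept the same 2-36 range. The helper name spellings below is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: list one byte value in several bases.
function spellings(byte, bases = [2, 8, 10, 16, 36]) {
  const out = {};
  for (const base of bases) {
    out[base] = byte.toString(base);
  }
  return out;
}

// 228 (the first byte of 中) is 11100100 in base 2, 344 in base 8,
// 228 in base 10, e4 in base 16, and 6c in base 36.
console.log(spellings(228));

// parseInt reverses the conversion when given the same base.
console.log(parseInt("6c", 36)); // 228
```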
What does “UTF-8 bytes” mean here?
Each token must be a byte in the range 0-255. The whole byte sequence is then decoded as UTF-8, which means a multi-byte codepoint (like 中 = e4 b8 ad) needs its three bytes in order – one token per byte.
How is this different from the ASCII version?
The ASCII variant treats each token as a code point (so 128512 base 10 → 😀). This tool treats each token as a byte, so emoji come from their multi-byte UTF-8 sequence instead of a single large number. Pick whichever matches your source data.
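The difference can be shown with the same emoji reached both ways. This is a small sketch using standard JavaScript built-ins; it does not reproduce either tool's code.

```javascript
// Code-point interpretation: one token holds the whole code point U+1F600.
const fromCodePoint = String.fromCodePoint(parseInt("128512", 10));

// Byte interpretation: four tokens hold the UTF-8 bytes f0 9f 98 80.
const byteTokens = ["240", "159", "152", "128"];
const bytes = Uint8Array.from(byteTokens, (t) => parseInt(t, 10));
const fromBytes = new TextDecoder("utf-8").decode(bytes);

console.log(fromCodePoint === fromBytes); // true, both are 😀
```

Feeding 128512 to the byte-oriented tool would fail the 0-255 check, and feeding 240 159 152 128 to a code-point tool would yield four unrelated Latin-1 range characters.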
What input separators are accepted?
Any mix of spaces, tabs, commas, or newlines. The tokenizer splits on [\s,]+, so CSV pastes, terminal output, and hex-dump copies all work without preprocessing.
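A split on runs of whitespace and commas might look like the sketch below. The function name tokenize is hypothetical, and dropping empty strings is an assumption about how leading or trailing separators are handled.

```javascript
// Hypothetical tokenizer: split on any run of whitespace or commas.
function tokenize(input) {
  // A leading or trailing separator yields an empty string, so filter those out.
  return input.split(/[\s,]+/).filter((token) => token !== "");
}

console.log(tokenize(" e4,b8\tad\n")); // [ 'e4', 'b8', 'ad' ]
console.log(tokenize("1, 2,,3"));      // [ '1', '2', '3' ]
```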
What happens with invalid byte sequences?
Bytes that do not form a valid UTF-8 sequence are replaced with the Unicode replacement character U+FFFD (�). The stats line reports how many replacements were inserted, so you can spot a corrupted paste fast.
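One way to obtain such a count is to scan the decoded text for U+FFFD, sketched below with hypothetical function names. Note the limitation of this approach: a replacement character that was legitimately encoded in the input (bytes ef bf bd) is counted too, and whether the tool distinguishes that case is not stated here. For a strict yes/no check, TextDecoder also accepts a fatal option that throws instead of substituting.

```javascript
// Hypothetical helper: decode leniently and count U+FFFD in the result.
function decodeWithReplacementCount(bytes) {
  const text = new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
  const replacements = (text.match(/\uFFFD/g) || []).length;
  return { text, replacements };
}

// Hypothetical helper: strict validation using the fatal option.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(Uint8Array.from(bytes));
    return true;
  } catch {
    return false; // decode throws a TypeError on malformed input
  }
}

// 0xff can never appear in valid UTF-8, so it becomes one replacement.
console.log(decodeWithReplacementCount([0x48, 0xff, 0x69])); // text "H\uFFFDi", replacements 1
console.log(isValidUtf8([0xe4, 0xb8, 0xad])); // true
console.log(isValidUtf8([0x48, 0xff, 0x69])); // false
```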
Can I decode emoji and non-Latin scripts?
Yes. UTF-8 covers the full Unicode range, including emoji, CJK characters, Devanagari, Arabic, and historic scripts. As long as your input preserves the correct byte order, the output renders natively.
How do I encode UTF-8 text back to an arbitrary base?
Use the sibling UTF-8 to Arbitrary Base converter. Both tools share the same separator rules, so the output of one pastes cleanly into the other for a round-trip check.
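The reverse direction can be sketched with TextEncoder, which always produces UTF-8. The names encodeText and decodeText are hypothetical, and the output format (space-separated, lowercase, no zero padding) is an assumption; the sibling tool may format its tokens differently.

```javascript
// Hypothetical encoder: text -> UTF-8 bytes -> tokens in the given base.
function encodeText(text, base) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(base)).join(" ");
}

// Minimal matching decoder for the round-trip check (no validation).
function decodeText(tokens, base) {
  const bytes = tokens.split(/[\s,]+/).filter(Boolean).map((t) => parseInt(t, base));
  return new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
}

console.log(encodeText("中", 16));  // e4 b8 ad
console.log(encodeText("🍕", 16)); // f0 9f 8d 95
console.log(decodeText(encodeText("中🍕", 36), 36) === "中🍕"); // true
```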