Convert Arbitrary Base to UTF-8

Decode numeric tokens in any base (2-36) as UTF-8 bytes - multi-byte emoji and non-Latin text included. Free, client-side, instant, offline.

Paste numeric byte tokens in any base from 2 to 36 and decode the whole sequence as UTF-8, so multi-byte characters like 中 and emoji like 🍕 render correctly.

How to Use Convert Arbitrary Base to UTF-8

  1. Paste the byte tokens into the input. Each token must fit in one byte (0-255 after conversion). Separators can be spaces, tabs, commas, or newlines - the tokenizer accepts any mix.
  2. Set the source base (2 for binary, 8 for octal, 10 for decimal, 16 for hex, up to 36). Values outside 2-36 raise an inline error and the output keeps its last valid value.
  3. Watch the live preview - the decoder repaints within 150 ms of every keystroke so you can scan the result as you type.
  4. Multi-byte characters just work - e4 b8 ad in hex decodes to 中, and f0 9f 8d 95 decodes to 🍕. The full byte sequence is handed to TextDecoder('utf-8') in one call, so UTF-8 continuation bytes combine correctly.
  5. Check the stats line: base, total tokens, bytes decoded, characters produced, tokens skipped as invalid for the base, and U+FFFD replacements (what TextDecoder substitutes for a malformed UTF-8 sequence).
  6. Copy writes the decoded text to the clipboard. Download .txt saves a timestamped file named utf8-decoded-base<N>-<iso>.txt. Clear wipes the input and output.
  7. Press Ctrl+Enter (⌘+Enter on Mac) to force a decode and copy the result in a single shortcut - handy after changing the base.
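
The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the names decodeTokens and parseStrict are hypothetical, and skipping tokens that exceed 255 is an assumption about how out-of-range values are handled.

```javascript
// Hypothetical helper: strict parse of one token in the given base.
// parseInt alone is lenient (parseInt("12", 2) returns 1), so every digit is checked first.
function parseStrict(token, base) {
  const digits = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, base);
  const lower = token.toLowerCase();
  for (const ch of lower) {
    if (!digits.includes(ch)) return null; // digit not valid for this base
  }
  return parseInt(lower, base);
}

// Hypothetical pipeline: tokenize, parse, keep values 0-255, decode once.
function decodeTokens(input, base) {
  if (!Number.isInteger(base) || base < 2 || base > 36) {
    throw new RangeError("base must be an integer from 2 to 36");
  }
  const tokens = input.split(/[\s,]+/).filter((t) => t.length > 0);
  const bytes = [];
  let skipped = 0;
  for (const token of tokens) {
    const value = parseStrict(token, base);
    if (value === null || value > 255) {
      skipped += 1; // assumption: out-of-range tokens are skipped as well
      continue;
    }
    bytes.push(value);
  }
  // One decode call over the whole sequence lets continuation bytes combine.
  const text = new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
  return { text, tokens: tokens.length, bytes: bytes.length, skipped };
}

console.log(decodeTokens("e4 b8 ad", 16).text);
console.log(decodeTokens("f0,9f\n8d 95", 16).text);
```

Decoding the bytes in a single call, rather than one token at a time, is what allows a three- or four-byte sequence to come out as one character.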

Frequently asked questions

Is my data secure when decoding?

Yes. The decode runs entirely in your browser – nothing is uploaded, cached, or tracked. After the page loads you can disconnect the network and the tool keeps working.

Is this decoder free?

Yes, 100% free with no cap on how much you can decode. No sign-up, no premium tier, no watermark.

Does this work offline?

Yes. HTML, CSS, and JavaScript are self-contained. Once the page has loaded, you can turn off Wi-Fi and keep decoding – ideal for air-gapped or locked-down environments.

Which bases are supported?

Every integer base from 2 to 36 inclusive. The most common are 2 (binary), 8 (octal), 10 (decimal), and 16 (hexadecimal). Base 36 uses the digits 0-9 and the letters a-z, so any byte fits in at most two characters (255 is written 73).
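
To see how token width changes with the base, the sketch below writes the largest byte value, 255, in several bases using the standard Number.prototype.toString and parseInt. The helper name byteInBase is illustrative only.

```javascript
// Illustrative helper: write one byte value (0-255) in the given base.
function byteInBase(value, base) {
  return value.toString(base);
}

for (const base of [2, 8, 10, 16, 36]) {
  // Prints the base next to the token for the largest byte, 255.
  console.log(`base ${base}: ${byteInBase(255, base)}`);
}

// Parsing goes the other way: "73" in base 36 is 7 * 36 + 3 = 255.
console.log(parseInt("73", 36));
```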

What does “UTF-8 bytes” mean here?

Each token must be a byte in the range 0-255. The whole byte sequence is then decoded as UTF-8, which means a multi-byte code point (like 中 = e4 b8 ad) needs its three bytes in order – one token per byte.

How is this different from the ASCII version?

The ASCII variant treats each token as a code point (so 128512 base 10 → 😀). This tool treats each token as a byte, so emoji come from their multi-byte UTF-8 sequence instead of a single large number. Pick whichever matches your source data.
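
The two interpretations can be compared side by side. The sketch below uses only standard JavaScript calls (String.fromCodePoint and TextDecoder); the variable names are illustrative.

```javascript
// Code-point interpretation: one token is one Unicode code point.
const fromCodePoint = String.fromCodePoint(parseInt("128512", 10));

// Byte interpretation: four tokens are the UTF-8 bytes of the same character.
const byteTokens = ["f0", "9f", "98", "80"];
const bytes = Uint8Array.from(byteTokens, (t) => parseInt(t, 16));
const fromBytes = new TextDecoder("utf-8").decode(bytes);

console.log(fromCodePoint === fromBytes); // true: both are U+1F600
```

If your source lists one large number per character, it is code-point data; if it lists values no larger than 255, it is most likely byte data.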

What input separators are accepted?

Any mix of spaces, tabs, commas, or newlines. The tokenizer splits on [\s,]+, so CSV pastes, terminal output, and hex-dump copies all work without preprocessing.
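
A minimal sketch of that split is shown below. The function name tokenize is hypothetical, and dropping empty strings is an assumption: String.prototype.split leaves an empty first or last element when the input starts or ends with a separator, so some filtering of this kind is needed.

```javascript
// Hypothetical tokenizer: split on runs of whitespace and commas.
function tokenize(input) {
  return input.split(/[\s,]+/).filter((token) => token !== "");
}

console.log(tokenize("e4, b8\tad\n")); // [ 'e4', 'b8', 'ad' ]
console.log(tokenize("  72,105 ,, 33")); // [ '72', '105', '33' ]
```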

What happens with invalid byte sequences?

Bytes that do not form a valid UTF-8 sequence are replaced with the Unicode replacement character U+FFFD (⸮). The stats line reports how many replacements were inserted, so you can spot a corrupted paste fast.
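
One way to produce such a count is to scan the decoded text for U+FFFD, as in the sketch below. The name decodeWithStats is hypothetical and the tool may count differently. Note that this approach also counts a U+FFFD that was legitimately encoded in the input (bytes ef bf bd).

```javascript
// Hypothetical helper: decode bytes and count U+FFFD characters in the result.
function decodeWithStats(bytes) {
  const text = new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
  let replacements = 0;
  for (const ch of text) {
    if (ch === "\uFFFD") replacements += 1;
  }
  // Spreading a string iterates by code point, so an emoji counts as one character.
  return { text, replacements, characters: [...text].length };
}

// 0xff can never appear in valid UTF-8, so it becomes one replacement character.
const result = decodeWithStats([0x48, 0xff, 0x69]);
console.log(result.replacements); // 1
```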

Can I decode emoji and non-Latin scripts?

Yes. UTF-8 covers the full Unicode range, including emoji, CJK characters, Devanagari, Arabic, and historic scripts. As long as your input preserves the correct byte order, the output renders natively.

How do I encode UTF-8 text back to an arbitrary base?

Use the sibling UTF-8 to Arbitrary Base converter. Both tools share the same separator rules, so the output of one pastes cleanly into the other for a round-trip check.
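
For a quick round-trip check outside the tools, the reverse direction can be approximated with the standard TextEncoder. This is only a sketch: encodeToBase is a hypothetical name, and the space separator and lowercase digits are assumptions rather than the sibling tool's documented output format.

```javascript
// Hypothetical encoder: UTF-8 bytes of the text, each written in the given base.
function encodeToBase(text, base) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(base)).join(" ");
}

console.log(encodeToBase("中", 16)); // e4 b8 ad
console.log(encodeToBase("🍕", 10)); // 240 159 141 149
```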