Convert UTF-8 to Arbitrary Base

In short

Convert the UTF-8 bytes of your text into any base from 2 to 36 (binary, octal, hex, or custom). Works in both directions, with options for prefix, padding, and letter case. Free, client-side, instant, secure.

Runs in your browser
Nothing uploaded
Free, no sign-up

Encode each UTF-8 byte of your text in any base from 2 to 36. Use binary (base 2), octal (base 8), hex (base 16), or any radix in between - useful for protocol debugging, education, and unusual encoding schemes. Swap to decode tokens back to text.

UTF-8 text input

Base (2-36)

Separator

Prefix

Pad to byte width

Uppercase letters

Encoded bytes

Per-character breakdown

Type to begin.

🛡 100% Private: No server uploads, ever
⚡ Instant: Runs in your browser
💧 No Watermarks: Clean output, always
🆓 Free Forever: No accounts, no limits

How to Use Convert UTF-8 to Arbitrary Base

Paste UTF-8 text. The tool runs TextEncoder to get the raw byte sequence.
Pick a base from 2 to 36. Padding ensures every byte uses the same width (8 chars for base 2, 2 for base 16, 2 for base 36, etc.), so reverse decoding has no ambiguity.
Choose case (uppercase letters by default for bases above 10, e.g. A-F in hex) and prefix (none, or common literals: 0b for 2, 0o for 8, 0x for 16).
The per-character grid shows each input character's codepoint, UTF-8 bytes in hex, and the same bytes in your chosen base - useful for verifying multi-byte characters.
Swap direction to decode: paste base tokens, get UTF-8 text back. The decoder uses TextDecoder('utf-8', {fatal: true}) so corrupt sequences throw rather than silently producing replacement characters.

Frequently Asked Questions

What does “arbitrary base” mean here?

Each individual UTF-8 byte (value 0-255) gets written in your chosen number base. Base 16 produces hex bytes (00-FF). Base 2 produces 8-bit binary. Base 36 packs each byte into 2 chars using digits 0-9 and letters A-Z. The tool is NOT changing the underlying bytes – only how they’re written.

Why is the base limited to 2-36?

JavaScript’s Number.prototype.toString(radix) supports 2-36 because that’s the alphabet of 0-9 (10 digits) + A-Z (26 letters). Above 36 you’d need to define your own alphabet (Base58, Base64, Base85 all use distinct character sets – those are separate tools).
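The radix limit is easy to confirm directly. The sketch below uses only Number.prototype.toString; the helper name radixIsSupported is hypothetical and exists just for this example.

```javascript
// toString(radix) emits lowercase letters for digit values 10-35.
console.log((255).toString(36)); // 73
console.log((35).toString(36));  // z

// Hypothetical helper: reports whether toString accepts the given radix.
function radixIsSupported(radix) {
  try {
    (0).toString(radix);
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false; // radix outside 2-36
    throw err;
  }
}

console.log(radixIsSupported(36)); // true
console.log(radixIsSupported(37)); // false
```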

How does padding work?

A byte’s maximum value (255) needs a different width in each base: 8 digits in binary (11111111), 3 in octal (377), 2 in hex (FF), 2 in base 36 (73). Padding fills shorter byte representations with leading zeros so the decoder can split fixed-width tokens unambiguously. With padding off, byte 5 in base 2 is just 101, and an unseparated run such as 101101 could be split several ways (101 101, 10 1101, and so on), so decoding needs explicit separators.
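As a rough sketch of why fixed widths remove the ambiguity, the code below computes the per-byte width for a base and splits an unseparated digit string into tokens. The names byteWidth and splitFixedWidth are hypothetical, and whether the tool itself accepts separator-free input is an assumption here.

```javascript
// Digits needed to write 255 in the given base.
function byteWidth(base) {
  return (255).toString(base).length;
}

// Hypothetical helper: split padded, separator-free digits into per-byte tokens.
function splitFixedWidth(digits, base) {
  const width = byteWidth(base);
  if (digits.length % width !== 0) {
    throw new Error(`length ${digits.length} is not a multiple of ${width}`);
  }
  const tokens = [];
  for (let i = 0; i < digits.length; i += width) {
    tokens.push(digits.slice(i, i + width));
  }
  return tokens;
}

console.log([2, 8, 16, 36].map(byteWidth));            // [ 8, 3, 2, 2 ]
console.log(splitFixedWidth("4869", 16));              // [ '48', '69' ]
console.log(splitFixedWidth("0100100001101001", 2));   // [ '01001000', '01101001' ]
```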

What about multi-byte UTF-8 characters?

Encoded byte-by-byte exactly as UTF-8 produces them. 🌍 (U+1F30D) is 4 UTF-8 bytes F0 9F 8C 8D in hex; in base 2 that’s four 8-bit tokens; in base 36 it’s 6O 4F 3W 3X. The per-character grid shows the multi-byte expansion.
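A per-character breakdown along these lines could be produced as follows. This is a sketch under assumptions: the function name breakdown and the row field names are invented for illustration and are not taken from the tool.

```javascript
// Hypothetical helper: one row per code point, with its UTF-8 bytes in hex and in `base`.
function breakdown(text, base) {
  const encoder = new TextEncoder();
  const width = (255).toString(base).length;
  const rows = [];
  // for...of walks a string by code point, so 🌍 is one iteration, not two UTF-16 units.
  for (const char of text) {
    const bytes = Array.from(encoder.encode(char));
    rows.push({
      char,
      codePoint: "U+" + char.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
      hex: bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" "),
      inBase: bytes.map((b) => b.toString(base).toUpperCase().padStart(width, "0")).join(" "),
    });
  }
  return rows;
}

const [globe] = breakdown("🌍", 36);
console.log(globe.codePoint); // U+1F30D
console.log(globe.hex);       // F0 9F 8C 8D
console.log(globe.inBase);    // the same four bytes written in base 36
```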

Does the reverse decoder validate UTF-8?

Yes – strictly. The tool runs the parsed bytes through TextDecoder('utf-8', {fatal: true}), so invalid sequences (truncated multi-byte chars, overlong encodings, bytes encoding surrogate codepoints) throw an explicit error rather than substituting U+FFFD replacement characters. If decoding fails, the message usually means you have the wrong base or corrupted input.
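The strict behaviour comes from the fatal option of the standard TextDecoder API. The sketch below shows a valid sequence decoding and a truncated one throwing; the wrapper name decodeStrict is hypothetical.

```javascript
// Hypothetical wrapper around the standard TextDecoder in fatal mode.
function decodeStrict(byteValues) {
  const decoder = new TextDecoder("utf-8", { fatal: true });
  return decoder.decode(Uint8Array.from(byteValues));
}

console.log(decodeStrict([0xf0, 0x9f, 0x8c, 0x8d])); // 🌍

try {
  decodeStrict([0xf0, 0x9f, 0x8c]); // truncated 4-byte sequence
} catch (err) {
  // Fatal mode throws a TypeError instead of emitting U+FFFD.
  console.log(err instanceof TypeError); // true
}
```

Without { fatal: true }, the same truncated input would decode to a replacement character and the corruption would go unnoticed.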

Why might decoding produce “byte value exceeds 255”?

Each token must be a single byte (0-255). If your token parses to a larger value in the chosen base – say FFF in hex = 4095, well above 255 – that’s a clue the input was meant for a different base, or wasn’t padded properly, so neighbouring tokens got merged.
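A range check of this kind might look like the sketch below. The function name parseByteToken and the error messages are assumptions for illustration; the explicit digit check is there because parseInt on its own stops silently at the first invalid character.

```javascript
// Hypothetical helper: parse one token in `base` and insist it fits in a byte.
function parseByteToken(token, base) {
  // Digits that are legal in this base, e.g. "0123456789abcdef" for base 16.
  const alphabet = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, base);
  const digits = token.toLowerCase();
  if (digits.length === 0 || [...digits].some((ch) => !alphabet.includes(ch))) {
    throw new SyntaxError(`"${token}" is not a valid base-${base} token`);
  }
  const value = parseInt(digits, base);
  if (value > 255) {
    throw new RangeError(`byte value exceeds 255: "${token}" = ${value}`);
  }
  return value;
}

console.log(parseByteToken("FF", 16)); // 255
try {
  parseByteToken("FFF", 16); // 4095, too large for one byte
} catch (err) {
  console.log(err.message); // byte value exceeds 255: "FFF" = 4095
}
```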

What’s a good use case for non-standard bases like 7 or 23?

Mostly education (showing how base conversion generalises) or constraint games (encoding data in a system that only allows certain characters). Note that base 36 is no more compact than hex for individual bytes: both need up to 2 characters per byte, and base 36 simply draws them from an alphabet with 20 more symbols.

Is base 64 supported?

No – Base64 isn’t a mathematical radix; it’s a specific encoding spec (3 input bytes → 4 output chars from a 64-char alphabet including + / or URL-safe - _). For Base64, use the dedicated converter in this suite.
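To make the difference concrete, the sketch below writes the same three ASCII bytes once per byte in hex and once as Base64. It relies on the standard btoa function, which only accepts Latin-1 input, so plain ASCII is used here; the variable names are illustrative.

```javascript
const sample = "Hi!"; // three ASCII bytes: 0x48 0x69 0x21

// Per-byte radix conversion: one token for each byte.
const perByteHex = Array.from(new TextEncoder().encode(sample), (b) =>
  b.toString(16).toUpperCase().padStart(2, "0")
).join(" ");

// Base64 regroups the 24 bits into four 6-bit symbols instead.
const asBase64 = btoa(sample);

console.log(perByteHex); // 48 69 21
console.log(asBase64);   // SGkh
```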

Is my text uploaded?

No. TextEncoder and TextDecoder run entirely in the browser. About 18 KB of code.

What’s the input cap?

200,000 characters. Output grows as the base shrinks (base 2 needs 8 characters per byte, before separators and prefixes), so this cap protects against tab freezes on huge inputs.

Keep going

Related Tools

All Utf8 tools →

Embed this tool

Add this free tool to your website. Copy and paste the code:

<iframe src="https://alltoolsverse.com/tools/convert-utf8-to-arbitrary-base/?embed=1" width="100%" height="760" loading="lazy" style="max-width:900px;border:1px solid #e2e8f0;border-radius:12px" title="Convert UTF-8 to Arbitrary Base"></iframe>
<p>Free tool: <a href="https://alltoolsverse.com/tools/convert-utf8-to-arbitrary-base/">Convert UTF-8 to Arbitrary Base</a> by All Tools Verse</p>

Related Tools

Convert Arbitrary Base to UTF-8 →

Binary to UTF-8 Decoder →

Base64 to UTF-8 Decoder →

Convert Bytes to UTF-8 →

Code Points to UTF-8 Converter Free →

Convert Data URI to UTF-8 →

Convert Decimal to UTF-8 →

Convert Hexadecimal to UTF-8 →

Convert HTML Entities to UTF-8 →

Convert Octal to UTF-8 →

Convert UTF-16 to UTF-8 →

Convert UTF-32 to UTF-8 →
