
Convert Octal to UTF-8

In short

Decode octal byte sequences to UTF-8 text, encode UTF-8 to octal. C-escape support, multi-byte. Free, offline.

  • Runs in your browser
  • Nothing uploaded
  • Free, no sign-up

Decode octal byte sequences to UTF-8 text - handles multi-byte sequences for accented Latin (303 251 = é), CJK, and emoji (360 237 230 200 = 😀). Accepts bare, C-escape (\110\145), or comma-separated input. Reverse direction encodes UTF-8 back to octal.


How to Use Convert Octal to UTF-8

  1. Pick direction - Octal → UTF-8 (default decode) or UTF-8 → Octal. Swap ⇄ flips and pre-fills the previous output as the new input.
  2. Paste your octal byte values. Accepts bare (110 145 154 154 157), C-escape (\110\145\154\154\157), or comma-separated (110, 145, 154, 154, 157) input, even mixed. The default input, 110 145 154 154 157, decodes to Hello.
  3. Pick invalid-UTF8 mode. Replace (default) substitutes � (U+FFFD) for malformed sequences and is the most forgiving. Strict errors out on any invalid byte. Keep treats each byte as Latin-1 (one char per byte) for raw recovery.
  4. Read the per-byte breakdown table. For each byte: octal (with backslash for clarity), decimal, hex, binary (8-bit), the rendered character, and the Unicode codepoint (U+XXXX). Continuation bytes of multi-byte UTF-8 sequences are dimmed so you can see which bytes form which character.
  5. Multi-byte UTF-8 handling. 303 251 (2 bytes) = é (U+00E9). 340 244 271 (3 bytes) = ह (U+0939, Devanagari letter ha). 360 237 230 200 (4 bytes) = 😀 (U+1F600). Look at the breakdown - the lead byte (C2-F4 in valid UTF-8) starts the character; the continuation bytes (80-BF) finish it.
  6. Reverse direction. Type "Hello" → output 110 145 154 154 157. Or pick C-escape style to get \110\145\154\154\157 for pasting into C/Python source.
  7. Copy or Download. Copy puts the decoded/encoded text on your clipboard. Download saves decoded.txt (decode) or octal.txt (encode).
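
The decode steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual code (which runs in the browser); the function name parse_octal_bytes is hypothetical, and the choice to treat backslashes, commas, and whitespace all as separators is an assumption about how mixed input could be handled.

```python
import re


def parse_octal_bytes(text: str) -> bytes:
    """Parse bare, C-escape, or comma-separated octal tokens into bytes."""
    # Assumption: backslashes, commas, and whitespace are all just separators.
    tokens = [t for t in re.split(r"[\\,\s]+", text) if t]
    values = []
    for tok in tokens:
        # Each token must be 1-3 octal digits.
        if not re.fullmatch(r"[0-7]{1,3}", tok):
            raise ValueError(f"not an octal byte: {tok!r}")
        value = int(tok, 8)
        # 0o377 (decimal 255) is the largest value that fits in one byte.
        if value > 0o377:
            raise ValueError(f"out of byte range: {tok!r}")
        values.append(value)
    return bytes(values)


# Mixed bare, C-escape, and comma-separated input in one string.
print(parse_octal_bytes(r"110 \145, 154\154 157").decode("utf-8"))  # Hello
```

Splitting on separators first keeps the parser indifferent to which of the three styles each token uses, which is what makes mixed input work.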

Frequently Asked Questions

What’s the range of valid octal byte values?

0 through 0377 (decimal 0-255) – one byte. Each octal value in the input represents one UTF-8 byte. For ASCII characters (decimal 0-127), one byte = one character. For multi-byte characters (accented Latin, CJK, emoji), two or more consecutive bytes combine to form one Unicode character; the tool’s TextDecoder handles the assembly automatically.

How are multi-byte UTF-8 characters encoded in octal?

UTF-8 uses a self-synchronizing scheme: the lead byte’s value tells you how many continuation bytes follow. Bytes 0xC2-0xDF (octal 302-337) lead a 2-byte sequence; 0xE0-0xEF (octal 340-357) lead 3-byte; 0xF0-0xF4 (octal 360-364) lead 4-byte. The lead-byte bit patterns alone would also admit 0xC0, 0xC1, and 0xF5-0xF7, but those values never occur in valid UTF-8. Continuation bytes are always in the 0x80-0xBF (octal 200-277) range. The decoder spots them automatically – you just paste the octal bytes in order.
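
As a rough illustration, the lead-byte rule can be written as a bit-pattern check. The function name utf8_sequence_length is hypothetical. Note that this sketch classifies by bit pattern only; a validating decoder additionally rejects 0xC0, 0xC1, and 0xF5-0xF7, which match a lead pattern but never occur in valid UTF-8.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length implied by a lead byte, or 0 if it is not a lead."""
    if lead & 0b1000_0000 == 0b0000_0000:
        return 1  # 0xxxxxxx: ASCII, octal 000-177
    if lead & 0b1110_0000 == 0b1100_0000:
        return 2  # 110xxxxx: octal 300-337
    if lead & 0b1111_0000 == 0b1110_0000:
        return 3  # 1110xxxx: octal 340-357
    if lead & 0b1111_1000 == 0b1111_0000:
        return 4  # 11110xxx: octal 360-367
    return 0      # 10xxxxxx continuation byte (octal 200-277) or 0xF8 and above


print(utf8_sequence_length(0o303))  # 2
print(utf8_sequence_length(0o360))  # 4
```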

What’s the difference between Replace and Strict invalid-UTF8 modes?

Strict mode (using new TextDecoder('utf-8', {fatal: true})) throws an error on any byte sequence that isn’t valid UTF-8 – useful for verifying your input is clean. Replace mode substitutes � (U+FFFD, the Unicode replacement character) for each invalid sequence – useful for graceful recovery from corrupted data. Keep mode bypasses UTF-8 decoding entirely and treats each byte as its Latin-1 character – useful when your data is actually Latin-1, not UTF-8.
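
The three modes have close analogues in Python's codec error handlers, shown below as a sketch. The function decode_bytes and its mode names are hypothetical, and Python's decoder may not emit exactly the same number of replacement characters as a browser's TextDecoder for every malformed input.

```python
def decode_bytes(data: bytes, mode: str = "replace") -> str:
    """Decode bytes using a Strict, Replace, or Keep policy (illustrative)."""
    if mode == "strict":
        # Raises UnicodeDecodeError on the first invalid sequence.
        return data.decode("utf-8", errors="strict")
    if mode == "replace":
        # Invalid sequences become U+FFFD; valid text around them survives.
        return data.decode("utf-8", errors="replace")
    if mode == "keep":
        # Latin-1 maps every byte 0-255 to the code point with the same value.
        return data.decode("latin-1")
    raise ValueError(f"unknown mode: {mode!r}")


# 110 = H, 351 = 0xE9 (a lead byte with no continuation bytes), 151 = i
sample = bytes([0o110, 0o351, 0o151])
print(decode_bytes(sample, "replace"))  # H�i
print(decode_bytes(sample, "keep"))     # Héi
```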

Why does C-escape style use backslashes?

It matches the literal escape sequence syntax in C, C++, Python, and most languages descended from C. In source code you’d write a string like "\303\251" and the compiler/interpreter reads those three-digit octal escapes as raw bytes (in Python 3, use a bytes literal such as b"\303\251"; in a plain str literal each escape becomes a code point rather than a byte). Paste the C-escape output into a string literal and the compiler reproduces the original bytes. The reverse works too, so the two directions round-trip.
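
A short Python check of how octal escapes behave in literals. It assumes Python 3, where bytes and str literals treat the same escape differently.

```python
# In a bytes literal, each octal escape is one raw byte.
raw = b"\303\251"
print(raw == bytes([0o303, 0o251]))  # True
print(raw.decode("utf-8"))           # é

# In a str literal, each escape is a code point (U+00C3, U+00A9), not a byte.
as_str = "\303\251"
print(as_str)                        # Ã©

# Re-encoding as Latin-1 recovers the raw bytes, which then decode as UTF-8.
print(as_str.encode("latin-1").decode("utf-8"))  # é
```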

What are continuation bytes and why are they dimmed?

UTF-8 multi-byte sequences have one “lead” byte (range C2-F4) followed by 1-3 “continuation” bytes (range 80-BF). The lead byte declares “I’m starting a 2/3/4-byte character”. The continuation bytes contribute the rest of the codepoint’s bits. Visually we dim the continuations to make it obvious which bytes form which character – useful for spotting where a multi-byte sequence starts and ends.
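
A per-byte breakdown with a continuation flag might be built along these lines. The function byte_breakdown and the dictionary keys are hypothetical; the flag stands in for whatever the page uses to decide which rows to dim.

```python
def byte_breakdown(data: bytes) -> list:
    """Build one row per byte, flagging UTF-8 continuation bytes."""
    rows = []
    for b in data:
        rows.append({
            "octal": f"\\{b:03o}",   # backslash plus 3-digit octal
            "decimal": b,
            "hex": f"{b:02X}",
            "binary": f"{b:08b}",
            # Continuation bytes have the bit pattern 10xxxxxx (0x80-0xBF).
            "continuation": 0x80 <= b <= 0xBF,
        })
    return rows


for row in byte_breakdown(bytes([0o303, 0o251])):
    print(row["octal"], row["hex"], row["binary"], row["continuation"])
# \303 C3 11000011 False
# \251 A9 10101001 True
```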

Can I decode bytes that aren’t valid UTF-8?

Sort of. In Strict mode, no – invalid bytes throw an error. In Replace mode, invalid sequences become � (U+FFFD), but valid surrounding text still decodes fine. In Keep mode (Latin-1), every byte becomes some character – even invalid UTF-8 bytes get a rendering (often a control character or accented Latin glyph). Pick Keep when your “octal” is actually a non-UTF-8 byte stream you want to inspect byte-by-byte.

How does the encode (UTF-8 → Octal) direction work?

It uses the browser’s TextEncoder API, which always emits UTF-8 bytes. Each byte is then formatted as 3-digit octal (zero-padded to 3 digits, even for ASCII). So “Hello” becomes 110 145 154 154 157 (5 bytes). For non-ASCII: “é” becomes 303 251 (2 bytes); “😀” becomes 360 237 230 200 (4 bytes).
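
The same encode direction can be sketched in Python, with str.encode standing in for TextEncoder. The function name utf8_to_octal and its c_escape parameter are hypothetical.

```python
def utf8_to_octal(text: str, c_escape: bool = False) -> str:
    """Encode text as UTF-8 and format each byte as zero-padded 3-digit octal."""
    data = text.encode("utf-8")
    if c_escape:
        # C-escape style: backslash before every byte, no separators.
        return "".join(f"\\{b:03o}" for b in data)
    # Bare style: space-separated octal values.
    return " ".join(f"{b:03o}" for b in data)


print(utf8_to_octal("Hello"))                 # 110 145 154 154 157
print(utf8_to_octal("é"))                     # 303 251
print(utf8_to_octal("😀"))                    # 360 237 230 200
print(utf8_to_octal("Hello", c_escape=True))  # \110\145\154\154\157
```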

Where does octal-encoded UTF-8 actually appear?

C/C++/Python string literals ("\303\251"), some older Unix protocol dumps, octal-formatted register exports from embedded systems, and certain database export formats. The convention pre-dates hex’s dominance in modern systems but persists in C-family ecosystems for backwards compatibility.

Is my data uploaded?

No. All parsing, byte arithmetic, and UTF-8 decoding/encoding runs in your browser using the built-in TextEncoder/TextDecoder APIs. Open DevTools → Network and confirm zero requests fire after the page loads.

Does it work offline?

Yes. Total bundle is under 22 KB. Once loaded, disconnect and keep converting. Useful for sysadmin work or legacy data recovery on air-gapped boxes.
