Convert Unicode to UTF-16

In short

Convert Unicode text to UTF-16 code units (hex, decimal, or binary) with selectable endianness, optional BOM, and surrogate pair handling. The reverse direction works too. Free, offline, and client-side.

  • Runs in your browser
  • Nothing uploaded
  • Free, no sign-up

Convert Unicode text to UTF-16 code units with endianness (BE / LE), optional BOM, and 3 output formats (hex / decimal / 16-bit binary). Surrogate pairs for non-BMP characters (emoji) are visible in the per-character grid. Reverse direction auto-detects format and combines surrogate pairs.

How to Use Convert Unicode to UTF-16

  1. Paste text. Anything Unicode.
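  2. Pick format. Hex (4 digits per unit), decimal, or binary (16 bits per unit).
  3. Choose endianness. BE writes the high byte first (network order) and is what the Unicode standard assumes when no BOM is present. LE writes the low byte first and is what x86 systems write natively. Byte order only matters for serialized bytes; JavaScript and Java strings are sequences of 16-bit code units with no byte order of their own.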
  4. Toggle BOM. Prepends U+FEFF for explicit byte-order indication.
  5. Read the grid. Surrogate pairs highlighted for non-BMP characters (emoji etc.).
  6. Swap to decode. Auto-detects format. BOM stripped if present.
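
The encoding steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual code: the function name `toUtf16Units` and its options are hypothetical, and showing little-endian output by swapping the two bytes inside each displayed unit is an assumption about presentation.

```javascript
// Hypothetical helper; names and options are illustrative only.
function toUtf16Units(text, { format = "hex", littleEndian = false, bom = false } = {}) {
  const units = [];
  if (bom) units.push(0xfeff); // U+FEFF byte order mark
  for (let i = 0; i < text.length; i++) {
    // JavaScript strings are already sequences of UTF-16 code units.
    units.push(text.charCodeAt(i));
  }
  return units.map((unit) => {
    // Assumption: LE output is displayed by swapping the two bytes of each unit.
    const value = littleEndian ? ((unit & 0xff) << 8) | (unit >> 8) : unit;
    if (format === "decimal") return String(value);
    if (format === "binary") return value.toString(2).padStart(16, "0");
    return value.toString(16).toUpperCase().padStart(4, "0");
  });
}

console.log(toUtf16Units("A\u{1F30D}").join(" ")); // 0041 D83C DF0D
```

With `{ littleEndian: true, bom: true }`, the same sketch returns `FFFE 4100` for the input `A`.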

Frequently Asked Questions

What’s UTF-16?

UTF-16 is the 16-bit Unicode Transformation Format. Each code unit is 16 bits. Characters in the Basic Multilingual Plane (BMP, U+0000 to U+FFFF) use one unit. Supplementary characters (code points above U+FFFF, including most emoji) use a surrogate pair – two units in the ranges U+D800-U+DBFF (high) and U+DC00-U+DFFF (low).

Why BE vs LE?

UTF-16 stores 16-bit values. When serialized to bytes, byte order matters. BE puts high byte first (network order); LE puts low byte first (x86 native).
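
As a rough illustration of the byte order difference, the standard `DataView` API can write one 16-bit unit in either order. The helper name `unitToBytes` is made up for this sketch.

```javascript
// Hypothetical helper: serialize one 16-bit code unit to two bytes.
function unitToBytes(unit, littleEndian) {
  const view = new DataView(new ArrayBuffer(2));
  // The third argument of setUint16 selects little-endian when true.
  view.setUint16(0, unit, littleEndian);
  return [view.getUint8(0), view.getUint8(1)];
}

console.log(unitToBytes(0x0041, false)); // [ 0, 65 ]  -> 00 41 (BE)
console.log(unitToBytes(0x0041, true));  // [ 65, 0 ]  -> 41 00 (LE)
```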

What’s a surrogate pair?

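A surrogate pair is UTF-16’s encoding for code points above U+FFFF. Subtract 0x10000 from the code point, then add the top 10 bits of the result to 0xD800 (high surrogate) and the bottom 10 bits to 0xDC00 (low surrogate). 🌍 (U+1F30D) becomes D83C + DF0D.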
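
The surrogate pair arithmetic is short enough to write out. The sketch below uses hypothetical function names and follows the standard UTF-16 formula; it is not taken from the tool's source.

```javascript
// Split a supplementary code point (U+10000..U+10FFFF) into a surrogate pair.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("code point is not in a supplementary plane");
  }
  const offset = codePoint - 0x10000;      // 20-bit value
  const high = 0xd800 + (offset >> 10);    // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);   // bottom 10 bits
  return [high, low];
}

// Combine a high and a low surrogate back into a code point.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

const [high, low] = toSurrogatePair(0x1f30d);
console.log(high.toString(16), low.toString(16)); // d83c df0d
```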

What’s a BOM?

Byte Order Mark – U+FEFF at start. UTF-16BE serializes as FE FF; UTF-16LE as FF FE. Decoders use this to determine endianness.
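
A decoder can check the first two bytes to pick the byte order. The following is a minimal sketch with a hypothetical function name; what to do when no BOM is found is left to the caller.

```javascript
// Hypothetical helper: inspect the first two bytes for a UTF-16 BOM.
// Returns "BE", "LE", or null when no BOM is present.
function detectUtf16Bom(bytes) {
  if (bytes.length >= 2) {
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return "BE";
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return "LE";
  }
  return null;
}

console.log(detectUtf16Bom([0xfe, 0xff, 0x00, 0x41])); // BE
console.log(detectUtf16Bom([0xff, 0xfe, 0x41, 0x00])); // LE
console.log(detectUtf16Bom([0x00, 0x41]));             // null
```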

How does decode handle lone surrogates?

The decoder simply concatenates the 16-bit units into a string. JavaScript strings may contain lone surrogates (ill-formed UTF-16), so they are preserved rather than replaced or rejected.
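
To see this behaviour, the sketch below joins code units with `String.fromCharCode` and then scans for unpaired surrogates. Both function names are hypothetical, and the check is an illustration rather than something the tool is known to perform.

```javascript
// Join 16-bit code units into a string without any validation.
function unitsToString(units) {
  let out = "";
  for (const unit of units) out += String.fromCharCode(unit);
  return out;
}

// Report whether a string contains a surrogate that is not part of a pair.
function hasLoneSurrogate(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = str.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // well-formed pair, skip the low surrogate
        continue;
      }
      return true; // high surrogate with no low surrogate after it
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) return true; // stray low surrogate
  }
  return false;
}

const lone = unitsToString([0x0041, 0xd83c]);
console.log(lone.length, hasLoneSurrogate(lone)); // 2 true
```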

UTF-16 vs UTF-8?

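UTF-16 is the in-memory string format of JavaScript, Java, .NET, and the Windows API. UTF-8 is the web’s wire format. UTF-8 is smaller for ASCII (1 byte per character versus 2); UTF-16 is smaller for most CJK text (2 bytes per character versus 3).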
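
The size trade-off is easy to measure. This sketch uses the standard `TextEncoder` (which always produces UTF-8) and counts two bytes per UTF-16 code unit, ignoring any BOM. The function name is hypothetical.

```javascript
// Compare encoded sizes in bytes for the same text.
function encodedSizes(text) {
  return {
    utf8: new TextEncoder().encode(text).length,
    utf16: text.length * 2, // text.length counts UTF-16 code units
  };
}

console.log(encodedSizes("hello"));              // { utf8: 5, utf16: 10 }
console.log(encodedSizes("\u65E5\u672C\u8A9E")); // 日本語 -> { utf8: 9, utf16: 6 }
console.log(encodedSizes("\u{1F30D}"));          // 🌍 -> { utf8: 4, utf16: 4 }
```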

How is reverse detected?

The decoder looks at the first token: 16 binary digits means binary, hex characters mean hex, and digits only means decimal.
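
One way such detection could work is sketched below. The function name and the exact rules are assumptions: in particular, this sketch treats a digits-only token such as `0041` as decimal, and the tool may resolve that ambiguity differently.

```javascript
// Hypothetical detector: classify input by its first whitespace-separated token.
function detectFormat(input) {
  const first = input.trim().split(/\s+/)[0];
  if (/^[01]{16}$/.test(first)) return "binary";  // exactly 16 binary digits
  if (/^\d+$/.test(first)) return "decimal";      // digits only (assumption)
  if (/^[0-9a-f]+$/i.test(first)) return "hex";   // contains at least one A-F
  return null;                                    // unrecognized or empty
}

console.log(detectFormat("0000000001000001 1101100000111100")); // binary
console.log(detectFormat("D83C DF0D"));                         // hex
console.log(detectFormat("55356 57101"));                       // decimal
```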

Max code unit?

0xFFFF (65535), the largest value a single UTF-16 code unit can hold. Characters above U+FFFF, like most emoji, are represented as surrogate pairs – two code units working together.

Text uploaded?

No. Everything runs in your browser with JavaScript – nothing is sent to a server, logged, or stored, and the tool keeps working offline once the page has loaded.

Offline?

Yes. The whole tool weighs about 16 KB, so once the page has loaded it runs without any network connection – every conversion happens locally in JavaScript on your device.
