
Convert Unicode to Code Points

In short

Convert Unicode to code points (U+XXXX, HTML/CSS/JS escapes) and back. Per-character breakdown. Free, offline, client-side, instant, secure.

  • Runs in your browser
  • Nothing uploaded
  • Free, no sign-up

Convert any Unicode character to its codepoint in 7 formats: U+XXXX, decimal, HTML entity (&#x...;), CSS escape (\1F30D), JS escape (\u{1F30D}), Python escape (\U0001F30D), or a multi-column table. Reverse direction parses any of these back to text.

  • 100% Private: no server uploads, ever
  • Instant: runs in your browser
  • No Watermarks: clean output, always
  • Free Forever: no accounts, no limits

How to Use Convert Unicode to Code Points

  1. Paste your text. Anything Unicode - ASCII, CJK, emoji, math symbols. Multiple lines are processed as a batch (one input per line → TSV output). Table mode is the exception: it always treats the input as a single conversion.
  2. Pick an output format. U+XXXX (default) - the Unicode standard's notation. Decimal - just the codepoint number. HTML entity - paste-able into HTML source (&#x1F30D; for 🌍). CSS escape - for CSS content properties (\1F30D). JS escape - ES2015 curly-brace form (\u{1F30D}). Python escape - 8-digit form (\U0001F30D). Table - all formats side by side as TSV.
  3. Choose a separator (for non-table modes). Space (default, most readable), comma, newline (one per line), or none (continuous, for hex pasting).
  4. Read the per-character grid. One row per character: glyph, U+XXXX, decimal, HTML hex entity, and Unicode plane (ASCII / BMP / SMP / SIP). Useful for understanding how characters distribute across Unicode's 17 planes.
  5. Swap to decode. ⇄ flips to Code Points → Unicode. The decoder auto-detects the format of each token: U+0041 → hex codepoint, &#65; → HTML entity decimal, &#x41; → HTML entity hex, \u{41} → JS escape, \41 → CSS escape, \U00000041 → Python escape, plain 65 → decimal, and a plain token containing hex letters (such as 1F30D) → hex. Mixed formats in one input work.
  6. Batch mode. Multi-line input (non-table format) → TSV with Input / Code Points / Count columns. Each line converted independently.
  7. Copy or Download. A single conversion saves as unicode-codepoints.txt; batch and table modes save as .tsv files, which open in spreadsheet applications.
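
The forward direction described in the steps above can be sketched in a few lines of JavaScript. This is only an illustration, not the tool's actual source: the names FORMATTERS, encodeLine, and encodeBatch are hypothetical, and the TSV header simply follows the Input / Code Points / Count columns mentioned in step 6.

```javascript
// Hypothetical formatter table; one function per non-table output format.
const FORMATTERS = {
  uplus:   (cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
  decimal: (cp) => String(cp),
  html:    (cp) => "&#x" + cp.toString(16).toUpperCase() + ";",
  css:     (cp) => "\\" + cp.toString(16).toUpperCase(),
  js:      (cp) => "\\u{" + cp.toString(16).toUpperCase() + "}",
  python:  (cp) => "\\U" + cp.toString(16).toUpperCase().padStart(8, "0"),
};

// Convert one string to a separator-joined list of code points.
function encodeLine(text, format = "uplus", separator = " ") {
  const fmt = FORMATTERS[format];
  if (!fmt) throw new Error("Unknown format: " + format);
  // Spreading a string iterates by code point, so 🌍 stays one unit
  // instead of being split into two UTF-16 code units.
  return [...text].map((ch) => fmt(ch.codePointAt(0))).join(separator);
}

// Batch mode: each line is converted independently and emitted as a TSV row.
function encodeBatch(input, format = "uplus") {
  const rows = input.split("\n").map((line) => {
    const count = [...line].length; // number of code points on the line
    return [line, encodeLine(line, format), count].join("\t");
  });
  return ["Input\tCode Points\tCount", ...rows].join("\n");
}

console.log(encodeLine("A🌍"));                // U+0041 U+1F30D
console.log(encodeLine("A🌍", "python", ",")); // \U00000041,\U0001F30D
```

Keeping the formatters in a lookup table means a new output format is one extra entry, and the separator stays independent of the format.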

Frequently Asked Questions

What’s a Unicode code point?

A unique number Unicode assigns to each character. A is U+0041 (decimal 65), é is U+00E9 (decimal 233), 🌍 is U+1F30D (decimal 127,757). Unicode has 1,114,112 code point slots (U+0000 to U+10FFFF) across 17 planes; about 150,000 are currently assigned to characters.
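
The numbers above can be checked directly in a browser console or Node. A minimal sketch, where describeChar is a hypothetical helper name:

```javascript
// Report a character's code point in U+XXXX and decimal form.
function describeChar(ch) {
  const cp = ch.codePointAt(0); // full code point, even outside the BMP
  const hex = cp.toString(16).toUpperCase().padStart(4, "0");
  return `U+${hex} (decimal ${cp})`;
}

console.log(describeChar("A"));      // U+0041 (decimal 65)
console.log(describeChar("\u00E9")); // é: U+00E9 (decimal 233)
console.log(describeChar("🌍"));     // U+1F30D (decimal 127757)

// 17 planes of 65,536 slots each give the size of the code space.
const PLANES = 17;
const SLOTS_PER_PLANE = 0x10000;
const totalSlots = PLANES * SLOTS_PER_PLANE; // 1114112
const maxCodePoint = totalSlots - 1;         // 0x10FFFF
console.log(totalSlots, maxCodePoint.toString(16).toUpperCase());
```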

How do I use code points in HTML?

HTML supports two numeric entity formats: hex &#xHHHH; and decimal &#DDDD;. For 🌍 you can write &#x1F30D; or &#127757;. Both render as 🌍. Useful when your editor doesn’t handle Unicode well, or when you want to embed characters that are hard to type directly. Pick the HTML entity output format and the tool emits the right syntax.
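
As a rough sketch of how both entity forms can be produced, assuming you only want to escape characters outside ASCII (the function names here are hypothetical, not part of the tool):

```javascript
// Hex form: &#xHHHH;
function toHtmlHex(ch) {
  return "&#x" + ch.codePointAt(0).toString(16).toUpperCase() + ";";
}

// Decimal form: &#DDDD;
function toHtmlDecimal(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

// Replace every non-ASCII character with its hex entity, leaving ASCII as-is.
// The u flag makes the regex match whole code points, not UTF-16 halves.
function escapeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/gu, (ch) => toHtmlHex(ch));
}

console.log(toHtmlHex("🌍"));           // &#x1F30D;
console.log(toHtmlDecimal("🌍"));       // &#127757;
console.log(escapeNonAscii("Hello 🌍")); // Hello &#x1F30D;
```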

How do I use code points in CSS?

CSS escapes use backslash + hex digits: content: "\1F30D"; renders as 🌍. Useful in ::before and ::after pseudo-elements where you want to inject specific characters from CSS. The tool’s CSS format emits the right form. The hex length is variable: CSS accepts 1 to 6 hex digits.
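
Because the hex length is variable, a short escape followed directly by a literal hex digit would absorb that digit. One common way around this is to pad the escape to the full 6 digits. A minimal sketch, with hypothetical function names:

```javascript
// Short form: backslash plus as many hex digits as needed (1 to 6).
function cssEscape(cp) {
  return "\\" + cp.toString(16).toUpperCase();
}

// Padded form: always 6 hex digits, so a following "1" or "F" in the
// stylesheet cannot be read as part of the escape.
function cssEscapePadded(cp) {
  return "\\" + cp.toString(16).toUpperCase().padStart(6, "0");
}

// Build a content declaration for a ::before or ::after rule.
function contentDeclaration(text) {
  const escaped = [...text].map((ch) => cssEscape(ch.codePointAt(0))).join("");
  return `content: "${escaped}";`;
}

console.log(contentDeclaration("🌍")); // content: "\1F30D";
console.log(cssEscapePadded(0x41));    // \000041
```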

How do I use code points in JavaScript?

Two forms. ES2015 curly-brace: '\u{1F30D}' – clean syntax for any codepoint up to U+10FFFF. Surrogate pair: '\uD83C\uDF0D' – required for non-BMP characters in pre-ES2015 JavaScript or for source code that needs to work without ES2015. The tool emits the curly-brace form for clarity; if you need surrogate pairs specifically, encode the high/low pair manually.
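
If you do need the surrogate pair, the high and low halves follow from the standard UTF-16 arithmetic. A small sketch (toSurrogateEscape is a hypothetical helper, not something the tool exposes):

```javascript
// Format a 16-bit value as four uppercase hex digits.
function hex4(n) {
  return n.toString(16).toUpperCase().padStart(4, "0");
}

// Produce a pre-ES2015 compatible escape for any code point.
function toSurrogateEscape(cp) {
  if (cp <= 0xFFFF) return "\\u" + hex4(cp); // BMP: a single \uHHHH
  const offset = cp - 0x10000;               // 20-bit value
  const high = 0xD800 + (offset >> 10);      // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);     // bottom 10 bits
  return "\\u" + hex4(high) + "\\u" + hex4(low);
}

console.log(toSurrogateEscape(0x1F30D));        // \uD83C\uDF0D
console.log("\u{1F30D}" === "\uD83C\uDF0D");    // true: same string either way
```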

How do I use code points in Python?

Python string literals support \uHHHH for BMP characters (4-digit hex) and \UHHHHHHHH for any codepoint, including non-BMP (8-digit hex). The tool emits the 8-digit form, which works for any codepoint. So 🌍 becomes '\U0001F30D' in Python source. Python 3 also has the \N{name} form for named characters, which looks the name up in the Unicode character database.
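
A minimal sketch of generating these Python escapes from JavaScript, in keeping with the tool's client-side approach. The compact option is an assumption added here for illustration; per the text above, the tool itself always emits the 8-digit form.

```javascript
// Turn text into Python string-literal escapes.
// compact (hypothetical option): use \uHHHH for BMP characters.
function toPythonEscape(text, { compact = false } = {}) {
  return [...text]
    .map((ch) => {
      const cp = ch.codePointAt(0);
      const hex = cp.toString(16).toUpperCase();
      if (compact && cp <= 0xFFFF) return "\\u" + hex.padStart(4, "0");
      return "\\U" + hex.padStart(8, "0"); // valid for every code point
    })
    .join("");
}

console.log(toPythonEscape("🌍"));                        // \U0001F30D
console.log(toPythonEscape("\u00E9", { compact: true })); // \u00E9
```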

What’s BMP, SMP, SIP, SSP?

Unicode’s 17 planes each hold 65,536 code points, and the most commonly encountered ones have abbreviations. BMP (Basic Multilingual Plane, U+0000-U+FFFF) covers most living languages – Latin, Cyrillic, Greek, CJK, etc. SMP (Supplementary Multilingual Plane, U+10000-U+1FFFF) covers emoji and historic scripts. SIP (Supplementary Ideographic Plane, U+20000-U+2FFFF) covers rare CJK ideographs. SSP (Supplementary Special-purpose Plane, U+E0000-U+EFFFF) covers tag characters and supplementary variation selectors. The grid shows which plane each character belongs to.
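
Since each plane spans 0x10000 code points, the plane number is just the code point shifted right by 16 bits. A sketch of the classification, assuming the grid labels ASCII separately (as the plane column in the how-to suggests) and falls back to a generic "Plane N" label elsewhere; planeLabel is a hypothetical name:

```javascript
// Abbreviations for the planes named in the text above.
const PLANE_NAMES = { 0: "BMP", 1: "SMP", 2: "SIP", 14: "SSP" };

function planeLabel(cp) {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("Not a valid code point: " + cp);
  }
  if (cp < 0x80) return "ASCII"; // sub-range of the BMP, labelled on its own
  const plane = cp >> 16;        // 0 through 16
  return PLANE_NAMES[plane] || "Plane " + plane;
}

console.log(planeLabel(0x41));    // ASCII
console.log(planeLabel(0xE9));    // BMP
console.log(planeLabel(0x1F30D)); // SMP
console.log(planeLabel(0x20000)); // SIP
console.log(planeLabel(0xE0001)); // SSP
```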

What’s the largest codepoint?

U+10FFFF (1,114,111 decimal). That’s the Unicode maximum – codepoints above this are not valid Unicode. The tool rejects them in reverse direction with the error “Codepoint N exceeds U+10FFFF”. The vast majority of currently-assigned codepoints are well below this (mostly in the BMP under U+FFFF).
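
A range check of this kind might look like the sketch below. The function name is hypothetical, and rendering N as a decimal number in the error message is an assumption; the tool may format it differently.

```javascript
const MAX_CODE_POINT = 0x10FFFF;

// Convert a number to its character, rejecting anything outside Unicode.
function codePointToChar(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError(`Invalid codepoint ${n}`);
  }
  if (n > MAX_CODE_POINT) {
    throw new RangeError(`Codepoint ${n} exceeds U+10FFFF`);
  }
  return String.fromCodePoint(n);
}

console.log(codePointToChar(0x1F30D)); // 🌍
try {
  codePointToChar(0x110000);
} catch (err) {
  console.log(err.message); // Codepoint 1114112 exceeds U+10FFFF
}
```

String.fromCodePoint throws its own RangeError for out-of-range input, so the explicit check mainly exists to produce a clearer message.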

How does the reverse direction handle mixed formats?

The input is tokenized by whitespace and commas, then each token is parsed independently with format auto-detection. So U+0041 &#66; \u{43} 68 decodes to ABCD (a mix of U+, HTML entity, JS escape, and decimal). The detected format is reported in stats. HTML entities are special-cased: any input containing &# is treated as one big HTML entity stream.
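
Per-token detection can be sketched with one regular expression per format, tried in order. This is a simplified illustration with hypothetical names (parseToken, decodeMixed): it parses HTML entities token by token and does not reproduce the special-cased entity-stream handling described above.

```javascript
// Each rule: a pattern, the radix of its captured digits, and a label.
const RULES = [
  [/^U\+([0-9A-F]{1,6})$/i, 16, "U+"],
  [/^&#x([0-9A-F]+);$/i, 16, "HTML hex"],
  [/^&#(\d+);$/, 10, "HTML decimal"],
  [/^\\u\{([0-9A-F]{1,6})\}$/i, 16, "JS"],
  [/^\\U([0-9A-Fa-f]{8})$/, 16, "Python"],
  [/^\\([0-9A-F]{1,6})$/i, 16, "CSS"],
  [/^(\d+)$/, 10, "decimal"],     // digits only: read as decimal
  [/^([0-9A-F]+)$/i, 16, "hex"],  // contains hex letters: read as hex
];

function parseToken(token) {
  for (const [pattern, radix, format] of RULES) {
    const m = pattern.exec(token);
    if (!m) continue;
    const cp = parseInt(m[1], radix);
    if (cp > 0x10FFFF) throw new RangeError(`Codepoint ${cp} exceeds U+10FFFF`);
    return { cp, format };
  }
  throw new SyntaxError("Unrecognized token: " + token);
}

function decodeMixed(input) {
  const parsed = input.split(/[\s,]+/).filter(Boolean).map(parseToken);
  return {
    text: String.fromCodePoint(...parsed.map((p) => p.cp)),
    formats: parsed.map((p) => p.format),
  };
}

console.log(decodeMixed("U+0041 &#66; \\u{43} 68").text); // ABCD
```

Rule order matters: the digits-only rule runs before the hex rule, so a bare 65 is read as decimal, while 1F30D falls through to hex.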

Is my text uploaded anywhere?

No. All conversion runs in your browser using codePointAt and String.fromCodePoint. Open DevTools → Network and confirm zero requests fire – even when you Convert or Download. Safe for proprietary text, secrets, or anything you’d rather not send to a third-party converter.

Does it work offline?

Yes. Total bundle is about 18 KB. Load once, disconnect, keep using. Pure JavaScript using standard string APIs – no remote dependencies. Useful for processing data on airgapped systems or in offline environments.
