A Unicode code point is the number assigned to a character, written as U+ followed by hex, so A is U+0041 and a smiley is U+1F600. A Unicode escape is that code point written in a form code can read, such as \u0041 in many languages. This guide explains code points, the common escape formats, and free tools to convert text to code points or escapes and back.
What a code point is
Unicode gives every character a unique number, its code point, independent of how it is stored in bytes. The notation U+ followed by hex is the standard way to write it. Code points run from U+0000 to U+10FFFF, a range of 1,114,112 values with room for every script, symbol, and emoji. The relationship between code points and bytes is covered in our text encoding guide.
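As a minimal sketch of the point that a code point is separate from its stored bytes, the Python snippet below prints one character's code point next to its UTF-8 and UTF-16 bytes. The helper name `describe` is made up for this illustration.

```python
def describe(ch: str) -> dict:
    """Return the code point of one character and its bytes in two encodings."""
    cp = ord(ch)  # ord() returns the code point as an integer
    return {
        "code_point": f"U+{cp:04X}",                # at least four hex digits
        "utf8": ch.encode("utf-8").hex(" "),        # bytes as stored in UTF-8
        "utf16": ch.encode("utf-16-be").hex(" "),   # bytes as stored in UTF-16 (big-endian)
    }

print(describe("A"))           # {'code_point': 'U+0041', 'utf8': '41', 'utf16': '00 41'}
print(describe("\U0001F600"))  # {'code_point': 'U+1F600', 'utf8': 'f0 9f 98 80', 'utf16': 'd8 3d de 00'}
```

The code point stays the same in both cases; only the byte sequences differ between encodings.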
What an escape is
An escape is a code point written so a program reads it as a character rather than literal text. JavaScript and JSON use \u0041 for code points up to U+FFFF. For larger values, JavaScript adds a braced form, \u{1F600}, while JSON has no braced form and writes a surrogate pair instead, \uD83D\uDE00. HTML uses a different style, the numeric entity, covered in our companion guide. Escapes let you put any character into source code or data using only plain ASCII, which avoids encoding problems in files and transports.
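The two escape styles can be sketched in a few lines of Python. The functions `js_escape` and `json_escape` are hypothetical names for this example, not part of any library. The first emits the braced form for code points above U+FFFF, as JavaScript accepts. The second emits a UTF-16 surrogate pair, which is what JSON requires because it has no braced form.

```python
def js_escape(text: str) -> str:
    """Escape every character in JavaScript style: \\uXXXX or \\u{X...}."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")       # four-digit form
        else:
            out.append(f"\\u{{{cp:X}}}")     # braced form for larger values
    return "".join(out)


def json_escape(text: str) -> str:
    """Escape every character in JSON style, using surrogate pairs above U+FFFF."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            v = cp - 0x10000                 # 20-bit offset into the supplementary range
            high = 0xD800 + (v >> 10)        # top 10 bits -> high surrogate
            low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits -> low surrogate
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


print(js_escape("A\U0001F600"))    # \u0041\u{1F600}
print(json_escape("A\U0001F600"))  # \u0041\uD83D\uDE00
```

Both functions escape every character for clarity. A real tool would usually leave printable ASCII untouched.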
Convert text to code points
To list the code points of a string, you read each character and note its U+ number. The Unicode to code points converter does this for a whole string, and the Unicode escape tool produces the \u form ready to paste into code. Both are useful when a character looks right on screen but you need to know exactly what it is.
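The same listing can be approximated in Python. This is only a sketch of the idea, not how the converter itself is implemented, and `to_code_points` is a name chosen for the example.

```python
def to_code_points(text: str) -> list[str]:
    """List the U+ notation for each character in the string."""
    return [f"U+{ord(ch):04X}" for ch in text]

print(to_code_points("Hi \U0001F600"))  # ['U+0048', 'U+0069', 'U+0020', 'U+1F600']

# A string that looks like "ab" on screen but hides a zero-width space
print(to_code_points("a\u200bb"))       # ['U+0061', 'U+200B', 'U+0062']
```

The second call shows the typical debugging case: the hidden U+200B is invisible when rendered but obvious in the list.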
Code points back to text
Going the other way, the code points to Unicode converter turns a list of U+ numbers back into the characters they name. This is how you rebuild a string from a specification, or check that an escape sequence produces the character you expected before pasting it into a project.
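A rough Python equivalent of the reverse direction is shown below. It assumes the input is a list of U+ numbers separated by spaces or commas. The function name `from_code_points` and that input format are assumptions made for this sketch.

```python
def from_code_points(spec: str) -> str:
    """Turn a list such as 'U+0048 U+0069' back into the characters it names."""
    chars = []
    for token in spec.replace(",", " ").split():
        hex_part = token.upper().removeprefix("U+")  # accept 'U+0041', 'u+0041', or '0041'
        chars.append(chr(int(hex_part, 16)))         # chr() raises ValueError above U+10FFFF
    return "".join(chars)

print(from_code_points("U+0048 U+0069 U+0020 U+1F600"))  # Hi 😀
```

Because `chr` rejects values beyond U+10FFFF, a mistyped number that falls outside the Unicode range fails loudly instead of producing a wrong character.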
When you need them
Code points and escapes solve real problems. They let you include a character your keyboard cannot type, embed an emoji safely in JSON, identify an invisible or look-alike character, and write test data that survives any encoding. Anyone debugging a mysterious character or building internationalized software reaches for them regularly, because they pin down exactly which character is in play.
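Two of these uses can be illustrated with Python's standard library: `json.dumps` escapes non-ASCII characters by default, and `unicodedata.name` tells look-alike characters apart. The helper `identify` and the sample data are invented for this example.

```python
import json
import unicodedata

# Embed an emoji in JSON using only ASCII (ensure_ascii is True by default)
payload = json.dumps({"mood": "\U0001F600"})
print(payload)                      # {"mood": "\ud83d\ude00"}
print(json.loads(payload)["mood"])  # the emoji round-trips intact


def identify(text: str) -> list[tuple[str, str]]:
    """Pair each character's code point with its official Unicode name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# Latin A and Cyrillic A look identical in most fonts
print(identify("A\u0410"))
# [('U+0041', 'LATIN CAPITAL LETTER A'), ('U+0410', 'CYRILLIC CAPITAL LETTER A')]
```

The fallback string `"<unnamed>"` covers characters such as control codes, for which `unicodedata.name` would otherwise raise an error.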
Frequently asked questions
What is a Unicode code point?
The unique number Unicode assigns to a character, written as U+ followed by hex, such as U+0041 for A, independent of how it is stored.
What does \u0041 mean?
It is a Unicode escape for the code point U+0041, which is the letter A, written so a program reads it as that character.
How do I write an emoji as an escape?
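In JavaScript, use the braced form for large code points, such as \u{1F600}, since the basic four-digit \u form only covers values up to U+FFFF. JSON has no braced form, so there you write the surrogate pair \uD83D\uDE00 instead.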
Why use escapes instead of the character itself?
To include characters your keyboard cannot type and to avoid encoding problems, since escapes use only plain ASCII.
How do I find a character’s code point?
Run the text through a Unicode to code points converter, which lists the U+ number for each character.