The ASCII Character Set

The American Standard Code for Information Interchange (ASCII) is probably the most ubiquitous character standard ever devised. Technically called the ANSI X3.4-1967 American Standard Code for Information Interchange, this 7-bit encoding contains the most useful letters, numbers and punctuation characters for standard English use.

The following table shows the complete ASCII character set in a single grid. To find the encoding of a particular character, add the hexadecimal value of its column to the hexadecimal value of its row. For example, the letter “A” is in the column that has the value 01 and in the row that has the value 40. Thus, the character code of the letter “A” is 0x01 + 0x40 = 0x41.
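The row-plus-column lookup can be sketched in a few lines of Python. This is only an illustration: the function name `ascii_code` is hypothetical, and it assumes that row values are multiples of 0x10 (0x00 to 0x70) and column values run from 0x00 to 0x0F, as the “A” example suggests.

```python
def ascii_code(row: int, column: int) -> int:
    """Combine a row value and a column value into a 7-bit character code.

    Assumed layout (not stated in full by the text): rows are multiples
    of 0x10 from 0x00 to 0x70, columns are 0x00 to 0x0F.
    """
    if row % 0x10 != 0 or not 0x00 <= row <= 0x70:
        raise ValueError("row must be a multiple of 0x10 between 0x00 and 0x70")
    if not 0x00 <= column <= 0x0F:
        raise ValueError("column must be between 0x00 and 0x0F")
    return row + column


code = ascii_code(0x40, 0x01)
print(hex(code), chr(code))  # 0x41 A
```

Because the row only ever fills the upper three bits and the column the lower four, adding the two values is the same as combining them with a bitwise OR.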

Please note that characters shown as two- or three-letter abbreviations (such as NUL, ESC and DEL) are control characters. The abbreviation SP (character code 0x20) denotes a single space.

The following table shows each character with its encoding in decimal, hexadecimal and 7-bit binary. The official character name (as specified in the standards) is also listed:

Dec	Hex	Bin	Character	Official Name
0	00	0000000	NUL	NULL
1	01	0000001	SOH	START OF HEADING
2	02	0000010	STX	START OF TEXT
3	03	0000011	ETX	END OF TEXT
4	04	0000100	EOT	END OF TRANSMISSION
5	05	0000101	ENQ	ENQUIRY
6	06	0000110	ACK	ACKNOWLEDGE
7	07	0000111	BEL	BELL
8	08	0001000	BS	BACKSPACE
9	09	0001001	HT	HORIZONTAL TABULATION
10	0A	0001010	LF	LINE FEED
11	0B	0001011	VT	VERTICAL TABULATION
12	0C	0001100	FF	FORM FEED
13	0D	0001101	CR	CARRIAGE RETURN
14	0E	0001110	SO	SHIFT OUT
15	0F	0001111	SI	SHIFT IN
16	10	0010000	DLE	DATA LINK ESCAPE
17	11	0010001	DC1	DEVICE CONTROL ONE
18	12	0010010	DC2	DEVICE CONTROL TWO
19	13	0010011	DC3	DEVICE CONTROL THREE
20	14	0010100	DC4	DEVICE CONTROL FOUR
21	15	0010101	NAK	NEGATIVE ACKNOWLEDGE
22	16	0010110	SYN	SYNCHRONOUS IDLE
23	17	0010111	ETB	END OF TRANSMISSION BLOCK
24	18	0011000	CAN	CANCEL
25	19	0011001	EM	END OF MEDIUM
26	1A	0011010	SUB	SUBSTITUTE
27	1B	0011011	ESC	ESCAPE
28	1C	0011100	FS	FILE SEPARATOR
29	1D	0011101	GS	GROUP SEPARATOR
30	1E	0011110	RS	RECORD SEPARATOR
31	1F	0011111	US	UNIT SEPARATOR
32	20	0100000	SP	SPACE
33	21	0100001	!	EXCLAMATION MARK
34	22	0100010	"	QUOTATION MARK
35	23	0100011	#	NUMBER SIGN
36	24	0100100	$	DOLLAR SIGN
37	25	0100101	%	PERCENT SIGN
38	26	0100110	&	AMPERSAND
39	27	0100111	'	APOSTROPHE
40	28	0101000	(	LEFT PARENTHESIS = OPENING PARENTHESIS
41	29	0101001	)	RIGHT PARENTHESIS = CLOSING PARENTHESIS
42	2A	0101010	*	ASTERISK
43	2B	0101011	+	PLUS SIGN
44	2C	0101100	,	COMMA
45	2D	0101101	-	HYPHEN-MINUS
46	2E	0101110	.	FULL STOP = PERIOD
47	2F	0101111	/	SOLIDUS = SLASH
48	30	0110000	0	DIGIT ZERO
49	31	0110001	1	DIGIT ONE
50	32	0110010	2	DIGIT TWO
51	33	0110011	3	DIGIT THREE
52	34	0110100	4	DIGIT FOUR
53	35	0110101	5	DIGIT FIVE
54	36	0110110	6	DIGIT SIX
55	37	0110111	7	DIGIT SEVEN
56	38	0111000	8	DIGIT EIGHT
57	39	0111001	9	DIGIT NINE
58	3A	0111010	:	COLON
59	3B	0111011	;	SEMICOLON
60	3C	0111100	<	LESS-THAN SIGN
61	3D	0111101	=	EQUALS SIGN
62	3E	0111110	>	GREATER-THAN SIGN
63	3F	0111111	?	QUESTION MARK
64	40	1000000	@	COMMERCIAL AT
65	41	1000001	A	LATIN CAPITAL LETTER A
66	42	1000010	B	LATIN CAPITAL LETTER B
67	43	1000011	C	LATIN CAPITAL LETTER C
68	44	1000100	D	LATIN CAPITAL LETTER D
69	45	1000101	E	LATIN CAPITAL LETTER E
70	46	1000110	F	LATIN CAPITAL LETTER F
71	47	1000111	G	LATIN CAPITAL LETTER G
72	48	1001000	H	LATIN CAPITAL LETTER H
73	49	1001001	I	LATIN CAPITAL LETTER I
74	4A	1001010	J	LATIN CAPITAL LETTER J
75	4B	1001011	K	LATIN CAPITAL LETTER K
76	4C	1001100	L	LATIN CAPITAL LETTER L
77	4D	1001101	M	LATIN CAPITAL LETTER M
78	4E	1001110	N	LATIN CAPITAL LETTER N
79	4F	1001111	O	LATIN CAPITAL LETTER O
80	50	1010000	P	LATIN CAPITAL LETTER P
81	51	1010001	Q	LATIN CAPITAL LETTER Q
82	52	1010010	R	LATIN CAPITAL LETTER R
83	53	1010011	S	LATIN CAPITAL LETTER S
84	54	1010100	T	LATIN CAPITAL LETTER T
85	55	1010101	U	LATIN CAPITAL LETTER U
86	56	1010110	V	LATIN CAPITAL LETTER V
87	57	1010111	W	LATIN CAPITAL LETTER W
88	58	1011000	X	LATIN CAPITAL LETTER X
89	59	1011001	Y	LATIN CAPITAL LETTER Y
90	5A	1011010	Z	LATIN CAPITAL LETTER Z
91	5B	1011011	[	LEFT SQUARE BRACKET = OPENING SQUARE BRACKET
92	5C	1011100	\	REVERSE SOLIDUS = BACKSLASH
93	5D	1011101	]	RIGHT SQUARE BRACKET = CLOSING SQUARE BRACKET
94	5E	1011110	^	CIRCUMFLEX ACCENT
95	5F	1011111	_	LOW LINE = SPACING UNDERSCORE
96	60	1100000	`	GRAVE ACCENT
97	61	1100001	a	LATIN SMALL LETTER A
98	62	1100010	b	LATIN SMALL LETTER B
99	63	1100011	c	LATIN SMALL LETTER C
100	64	1100100	d	LATIN SMALL LETTER D
101	65	1100101	e	LATIN SMALL LETTER E
102	66	1100110	f	LATIN SMALL LETTER F
103	67	1100111	g	LATIN SMALL LETTER G
104	68	1101000	h	LATIN SMALL LETTER H
105	69	1101001	i	LATIN SMALL LETTER I
106	6A	1101010	j	LATIN SMALL LETTER J
107	6B	1101011	k	LATIN SMALL LETTER K
108	6C	1101100	l	LATIN SMALL LETTER L
109	6D	1101101	m	LATIN SMALL LETTER M
110	6E	1101110	n	LATIN SMALL LETTER N
111	6F	1101111	o	LATIN SMALL LETTER O
112	70	1110000	p	LATIN SMALL LETTER P
113	71	1110001	q	LATIN SMALL LETTER Q
114	72	1110010	r	LATIN SMALL LETTER R
115	73	1110011	s	LATIN SMALL LETTER S
116	74	1110100	t	LATIN SMALL LETTER T
117	75	1110101	u	LATIN SMALL LETTER U
118	76	1110110	v	LATIN SMALL LETTER V
119	77	1110111	w	LATIN SMALL LETTER W
120	78	1111000	x	LATIN SMALL LETTER X
121	79	1111001	y	LATIN SMALL LETTER Y
122	7A	1111010	z	LATIN SMALL LETTER Z
123	7B	1111011	{	LEFT CURLY BRACKET = OPENING CURLY BRACKET
124	7C	1111100	|	VERTICAL LINE = VERTICAL BAR
125	7D	1111101	}	RIGHT CURLY BRACKET = CLOSING CURLY BRACKET
126	7E	1111110	~	TILDE
127	7F	1111111	DEL	DELETE
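The three numeric columns of the table can be reproduced for any character with Python's built-in formatting. The sketch below is illustrative only; `encoding_columns` is a hypothetical helper name, not part of any standard.

```python
def encoding_columns(ch: str) -> tuple[str, str, str]:
    """Return the decimal, hexadecimal and 7-bit binary forms of one ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not an ASCII character")
    # 02X gives two upper-case hex digits; 07b gives seven binary digits.
    return (str(code), f"{code:02X}", f"{code:07b}")


print(encoding_columns("A"))   # ('65', '41', '1000001')
print(encoding_columns("\n"))  # ('10', '0A', '0001010')
```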

The main problem with the ASCII character encoding is that it is English-centric: it does not contain enough characters for most other languages, a fact that has caused much concern over the years. This problem is remedied in the so-called Universal Character Encoding, Unicode. Unicode is designed to cover every language and character on Earth. Its code space runs from U+0000 to U+10FFFF, which needs 21 bits, and its UTF-32 encoding form stores each character in 32 bits. This ambitious goal has been formalised as an International Standard, ISO/IEC 10646. Please refer to the Unicode Web page for more information on this standard.
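The limitation is easy to see in practice. The following minimal sketch uses Python's standard `ascii` and `utf-8` codecs; the helper name `fits_in_ascii` is made up for this example.

```python
def fits_in_ascii(text: str) -> bool:
    """Report whether every character of text has a 7-bit ASCII code."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(fits_in_ascii("naive"))   # True
print(fits_in_ascii("naïve"))   # False: the letter ï (U+00EF) is outside ASCII
print("naïve".encode("utf-8"))  # b'na\xc3\xafve'
```

In UTF-8 the character that ASCII cannot represent is stored as two bytes, while the ASCII letters around it keep their single-byte codes.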

You can consult the actual standard (available on this CD-ROM as the ECMA-6/ISO-646 7-bit Coded Character Set) if you want to see how real-world standards are written. The ASCII character set is equivalent to the C0 Controls and Basic Latin section of the Unicode Standard.
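This equivalence can be checked with a short sketch, assuming Python's standard `unicodedata` module. The function name `same_in_unicode` is hypothetical; it tests that each 7-bit code maps to the Unicode code point with the same number and to the same single byte in UTF-8.

```python
import unicodedata


def same_in_unicode(code: int) -> bool:
    """True if an ASCII code equals its Unicode code point and its UTF-8 byte."""
    ch = bytes([code]).decode("ascii")
    return ord(ch) == code and ch.encode("utf-8") == bytes([code])


print(all(same_in_unicode(c) for c in range(128)))  # True
print(unicodedata.name("A"))                        # LATIN CAPITAL LETTER A
print(unicodedata.name("/"))                        # SOLIDUS
```

The names reported for printable characters match the official names in the table above. Note that `unicodedata.name` raises `ValueError` for the control characters, which have no formal Unicode character name.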

If you are wondering how the ASCII character set came to be the way it is, and why it is still used almost 40 years after its inception, you might want to read Tom Jennings's excellent history of the ASCII character set. Highly recommended!