Escape Sequences and Control Characters

Translate escape sequences into Unicode character by character. When an ASCII plain-text file is converted to Unicode, there is a chance that it will later be converted back to ASCII. Converting each character of an escape sequence into its own Unicode character, rather than packing the whole sequence into a single 2-byte character, makes it possible to perform the reverse conversion without recognizing and parsing the escape sequences as such. For example, ESC+A should become 0x001B (ESC), 0x0041 (A), rather than 0x411B.
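The character-by-character rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names widen_ascii and narrow_to_ascii are hypothetical, and 16-bit code units are modeled as plain integers rather than as an encoded byte stream.

```python
def widen_ascii(data: bytes) -> list:
    """Give every ASCII byte its own 16-bit code unit with the same value."""
    # Hypothetical helper: iterating over bytes yields integers 0..255,
    # so each byte simply becomes one code unit.
    return list(data)


def narrow_to_ascii(units: list) -> bytes:
    """Reverse conversion: no knowledge of escape sequences is required."""
    for unit in units:
        if unit > 0x7F:
            raise ValueError(f"0x{unit:04X} has no ASCII equivalent")
    return bytes(units)


escape_sequence = b"\x1bA"  # ESC followed by 'A'
units = widen_ascii(escape_sequence)
print([f"0x{u:04X}" for u in units])  # ['0x001B', '0x0041']

# The round trip restores the original bytes without parsing the sequence.
assert narrow_to_ascii(units) == escape_sequence
```

Had the two bytes been packed into the single unit 0x411B, the reverse step would first need to recognize that unit as an escape sequence and split it again.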

The first 32 sixteen-bit characters in Unicode (0x0000 through 0x001F) are intended for the 32 control characters. This approach supports the existing use of control characters for formatting purposes; that is, Unicode applications can treat these control characters in exactly the same way as they treat their ASCII equivalents.
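As a small check of this correspondence, the following Python sketch decodes the 32 ASCII control bytes and compares the resulting code values. The variable names are illustrative only, and Python's str type stands in for a Unicode application here.

```python
# Decode the 32 ASCII control characters into Unicode text.
control_bytes = bytes(range(32))
control_text = control_bytes.decode("ascii")

# Each control character keeps the same numeric value (0x0000 to 0x001F).
code_values = [ord(ch) for ch in control_text]
assert code_values == list(range(32))

# Formatting logic written for ASCII control characters carries over unchanged:
# the tab still separates the two fields after conversion.
record = b"name\tvalue\r\n"
fields = record.decode("ascii").rstrip("\r\n").split("\t")
print(fields)  # ['name', 'value']
```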