1.2 Strings in Unicode

All strings in a resource file are now stored in Unicode format, in which every character is represented by a 16-bit (WORD) value. The first 256 characters are identical to the 256 characters in the Windows ANSI character set, although each is represented by 16 bits rather than 8. Strings are therefore terminated with a UNICODE_NULL (a 16-bit zero) rather than a single-byte NULL. The resource compiler translates all normal ASCII strings into Unicode by calling the MultiByteToWideChar function provided by the Windows API. Escaped characters are stored directly and are assumed to be valid Unicode characters for the resource. If an application later reads these strings as ASCII (for instance, by calling the LoadString API), the loader transparently converts them back from Unicode to ASCII.
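The effect of the widening on the stored bytes can be sketched as follows. This is only an illustration, not the resource compiler's implementation: it uses Python's "utf-16-le" codec in place of MultiByteToWideChar, assumes little-endian storage, and the helper name widen_to_utf16 is hypothetical.

```python
def widen_to_utf16(text: str) -> bytes:
    """Encode text as 16-bit little-endian units plus a 16-bit null terminator.

    Hypothetical helper: stands in for the ASCII-to-Unicode translation
    described above, assuming little-endian storage.
    """
    return text.encode("utf-16-le") + b"\x00\x00"


# The 8-bit form: one byte per character, one-byte terminator.
narrow = "Hi".encode("ascii") + b"\x00"

# The 16-bit form: two bytes per character, two-byte terminator.
wide = widen_to_utf16("Hi")

print(narrow.hex(" "))  # 48 69 00
print(wide.hex(" "))    # 48 00 69 00 00 00
```

Each ASCII character keeps its numeric value but now occupies two bytes, and the terminator grows from one zero byte to two.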

The only exception to this rule is strings in RCDATA statements. These pseudo-strings are not real strings, but merely a convenient notation for a collection of bytes. Users may overlay a structure on the data from an RCDATA statement and expect certain data to be at certain offsets. If a pseudo-string were automatically converted to a Unicode string, the offsets of the fields that follow it in the structure would shift, and those applications would break. Hence, these pseudo-strings must be left as ASCII bytes. To specify a Unicode string inside an RCDATA statement, the user should use an explicit L-quoted string.
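The offset problem can be illustrated with a small sketch. The record layout here (a 4-byte tag followed by a 16-bit value at offset 4) and the names pack_record and read_value are hypothetical, chosen only to show what would go wrong if a pseudo-string were widened.

```python
import struct


def pack_record(tag: bytes, value: int) -> bytes:
    """Build a record: raw tag bytes followed by a little-endian 16-bit value."""
    return tag + struct.pack("<H", value)


def read_value(blob: bytes) -> int:
    """Read the 16-bit value where the application expects it: offset 4."""
    return struct.unpack_from("<H", blob, 4)[0]


# Pseudo-string left as bytes: the value sits at the expected offset.
as_bytes = pack_record(b"ABCD", 0x1234)

# Pseudo-string widened to 16-bit characters: the tag now takes 8 bytes,
# so the value has moved to offset 8.
widened = pack_record("ABCD".encode("utf-16-le"), 0x1234)

print(hex(read_value(as_bytes)))  # 0x1234
print(hex(read_value(widened)))   # 0x43 (the character 'C', not the value)
```

The reader of the widened record silently picks up part of the tag instead of the value, which is the kind of breakage that leaving RCDATA pseudo-strings as bytes avoids.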