INFO: String Constants May Be Interpreted as Trigraphs in C

Last reviewed: September 8, 1997
Article ID: Q67082

The information in this article applies to:
  • Microsoft C for MS-DOS, versions 6.0, 6.0a, 6.0ax
  • Microsoft C for OS/2, versions 6.0, 6.0a
  • Microsoft C/C++ for MS-DOS, version 7.0
  • Microsoft Visual C++ for Windows, versions 1.0, 1.5
  • Microsoft Visual C++ 32-bit Edition, versions 1.0, 2.0, 4.0, 5.0

SUMMARY

To maintain compatibility with other systems, Microsoft C version 6.0 and Microsoft QuickC version 2.5 and later implement the trigraphs mandated by the ANSI C standard. Code written for earlier compilers that do not support trigraphs may need to be changed, because character sequences that were previously harmless are now translated. The sample code below illustrates one such case. The trigraphs are listed on page 424 of the Microsoft C "Advanced Programming Techniques" version 6.0 manual.

MORE INFORMATION

Trigraphs are three-character combinations that represent certain symbols in the C language that are not available in all character sets. For example, some keyboards or character sets do not have the opening and closing brace characters, "{" and "}". These characters are essential to writing a C program; therefore, a programmer who cannot enter them can use the trigraphs "??<" and "??>" in place of the braces.

The compiler translates each three-character trigraph into its single corresponding character early in compilation, before string constants are recognized. As a result, if a sequence of characters in a string constant matches a trigraph pattern, the compiler replaces those three characters with the single character that the trigraph represents.

This situation can arise with functions such as _dos_findfirst(), where a constant string may contain question marks as wildcard characters for a file search. The workaround is to break up the constant with double quotation marks, as shown below. Because the trigraph pattern no longer appears in the source text, no translation takes place, and the compiler then concatenates the two adjacent strings into the intended constant.

Sample Code

   #include <stdio.h>

   int main( void )
   {
      /* '??-' in the following line will be replaced by a '~', */
      /* so the program prints "~Hello" */

      printf( "??-Hello\n" );
      return 0;
   }

To prevent the compiler from misinterpreting the "??-" character sequence as an unintended trigraph, you could replace the printf line above with the following line:

   printf( "??""-Hello\n" );

Notice that the only difference is the double quotation marks used to break up the string into two substrings, thus eliminating the trigraph pattern.

Keywords          : CLngIss
Version           : MS-DOS:6.0,6.00a,6.00ax,7.0; OS/2:6.0,6.00a; WINDOWS:1.0,1.5; WINDOWS NT:1.0,2.0,4.0,5.0
Platform          : MS-DOS NT OS/2 WINDOWS


© 1998 Microsoft Corporation. All rights reserved.