ANSI C translation phases
The order of these translation phases is specified by ANSI C:
Every ``trigraph'' sequence in the source file is replaced.
ANSI C has exactly nine trigraph sequences.
They were invented solely as a concession to deficient character sets
(as far as C is concerned);
each is a three-character sequence that names a character
not in the ISO 646-1983 character set:
??= #    ??' ^    ??) ]
??- ~    ??! |    ??< {
??( [    ??/ \    ??> }
These sequences must be understood by ANSI C compilers,
but they should not be used except (possibly)
to obscure code.
The ANSI C compiler warns you whenever
it replaces a trigraph while in transition or K&R
mode (-Xt or -Xk),
even in comments.
For example, consider the following:
/* comment *??/
/* still comment? */
The ``??/'' becomes a backslash.
This character and the following newline are removed.
The resulting characters are:
/* comment */* still comment? */
The first ``/'' from the second line is the end of the comment.
The next token is the ``*''.
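The replacement step described above can be sketched in C. This is only an illustration of the mapping, not how any particular compiler implements it; the names trigraph_char and replace_trigraphs are hypothetical, and the sketch assumes the whole source is held in one writable, null-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map the third character of a trigraph to the character it names.
   Returns 0 when the character does not complete a trigraph. */
static char trigraph_char(char c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: rewrite buf in place, replacing every trigraph
   with the single character it names. Returns the new length. */
size_t replace_trigraphs(char *buf)
{
    size_t in = 0, out = 0;

    assert(buf != NULL);
    while (buf[in] != '\0') {
        char r = 0;

        /* Only look at the third character after two question marks. */
        if (buf[in] == '?' && buf[in + 1] == '?')
            r = trigraph_char(buf[in + 2]);

        if (r != 0) {
            buf[out++] = r;      /* three characters collapse to one */
            in += 3;
        } else {
            buf[out++] = buf[in++];
        }
    }
    buf[out] = '\0';
    return out;
}
```

Applied to the two-line comment example, this leaves a backslash immediately before the newline, which the next phase then deletes together with the newline.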
Every backslash/new-line character pair is deleted.
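A short sketch of what this deletion allows in practice. The identifiers SQUARE, greeting, square_of and greeting_length are made up for the illustration. Because the pair is removed before tokens are formed, a line can be continued anywhere, even in the middle of a string literal.

```c
#include <string.h>

/* A macro definition continued across two physical lines. */
#define SQUARE(x) \
    ((x) * (x))

/* A string literal split across two lines. The second line starts in
   column one; any leading blanks there would become part of the string. */
static const char greeting[] = "hello, \
world";

int square_of(int n)
{
    return SQUARE(n);
}

size_t greeting_length(void)
{
    return strlen(greeting);
}
```

Here square_of(4) evaluates ((4) * (4)), and greeting holds the twelve characters of ``hello, world'' with no embedded newline.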
The source file is converted into preprocessing tokens
and sequences of white space.
Each comment is effectively replaced by a space character.
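The following sketch shows two consequences of this phase. The names total and add_after_increment are hypothetical. The first declaration relies on the comment acting as a space; the second relies on the ANSI C rule that each preprocessing token is the longest sequence of characters that can form one.

```c
/* The comment between "int" and "total" behaves like a single space,
   so this is an ordinary declaration of an int named total. */
int/* acts as white space */total = 7;

int add_after_increment(int *a, int b)
{
    /* "+++" is split as "++" followed by "+", so this reads
       (*a)++ + b: the old value of *a is added to b. */
    return (*a)+++b;
}
```

With *a equal to 1 and b equal to 2, the function returns 3 and leaves *a equal to 2.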
Every preprocessing directive is handled and all macro invocations
are replaced.
Each #included source file is run through the earlier phases
before its contents replace the directive line.
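As a small illustration of this phase, the sketch below uses an #include, a conditional directive, and object-like and function-like macros. BUFFER_SIZE, BUFFER_KIND, MAX and the three functions are names invented for the example.

```c
#include <limits.h>   /* the header passes through the earlier phases first */

#define BUFFER_SIZE 64
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* The conditional is evaluated now; only one branch survives. */
#if BUFFER_SIZE > 32
#define BUFFER_KIND "large"
#else
#define BUFFER_KIND "small"
#endif

int larger_of(int x, int y)
{
    return MAX(x, y);            /* becomes ((x) > (y) ? (x) : (y)) */
}

const char *buffer_kind(void)
{
    return BUFFER_KIND;          /* becomes "large" */
}

int bits_in_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);   /* CHAR_BIT comes from <limits.h> */
}
```

By the time the compiler proper sees this text, no directive lines or macro names remain; only their replacements do.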
Every escape sequence (in character constants and string literals)
is interpreted.
Adjacent string literals are concatenated.
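The order of these two steps is visible in code. In the sketch below (the names joined, joined_length, first_is_one and second_is_digit_two are hypothetical), the hexadecimal escape is interpreted before the literals are joined, so the digit in the second literal cannot extend it.

```c
#include <string.h>

/* "\x1" is turned into one character before concatenation, so the
   result is the character with value 1 followed by the digit '2',
   not the single character with value 0x12. */
static const char joined[] = "\x1" "2";

size_t joined_length(void)
{
    return strlen(joined);
}

int first_is_one(void)
{
    return joined[0] == 1;
}

int second_is_digit_two(void)
{
    return joined[1] == '2';
}
```

Splitting a literal this way is a common means of ending a hexadecimal escape that would otherwise absorb the characters that follow it.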
Every preprocessing token is converted into a (regular) token;
the compiler proper parses these and generates code.
All external object and function references are resolved,
resulting in the final program.
© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003