The declarations section is used to declare and describe constructs that are needed by the parsing mechanism and the actions associated with rules in the rules section.
Any token that appears in a rule in the rules section must be declared. There are several ways of doing this: the most common way is by using a %token statement. This has the form:
%token name1 name2 ...
Each name that appears after the keyword %token is thereby declared as a token. You can declare several tokens in the same statement, and you can have several such statements. Tokens can also be declared using the %left, %right, and %nonassoc keywords, discussed in ``Precedence''.
Every name that appears in the rules section, but is not defined in the declarations section, is assumed to represent a non-terminal symbol. Every non-terminal symbol must appear on the left side of at least one rule.
The start symbol is the top-level non-terminal symbol in the grammar. By default, the start symbol is the left-hand side of the first grammar rule in the rules section. You can declare a start symbol explicitly in the declarations section, using the %start keyword:
%start somename
C code to be used by the parser can appear in the declarations section, enclosed between the delimiters `%{' and `%}'. Declarations made here have global scope, so they are known to the action statements and can be made known to the lexical analyzer. This section is usually used for variable declarations and #include statements, though other C code can appear here, as shown in the following example:
%{
#include "global.h"
int ival = 0;
%}
Names beginning with yy should be avoided, because internal variables used by the parser begin with these characters.
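One common way to make such a declaration known to the lexical analyzer is an extern declaration in the lexer's source file. The following is a minimal sketch, assuming the ival variable from the example above. The helper name record_int is hypothetical, and the definition of ival is written out by hand only so that the fragment compiles on its own.

```c
/* Stand-in for the definition that the %{ ... %} block places in the
   generated parser source. */
int ival = 0;

/* A separately compiled lexical analyzer would carry only this
   declaration, typically through a shared header such as global.h. */
extern int ival;

/* Hypothetical lexer helper that records a scanned integer. */
void record_int(int n)
{
    ival = n;
}
```

Because ival has global scope, a value stored by the lexical analyzer in this way is visible to any action in the rules section.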
By default, the values that parser actions associate with symbols are integers. yacc can also support values of other types, including structures. You can declare a C union that holds the different kinds of values that symbols can have. The parser maintains a data structure called the value stack, which is declared to be of this union type. To declare the union, use a declaration in the following form:
%union {
    ... body of union ...
};
For example:
%union {
    char *text;
    int ival;
    double dub;
};
In addition to the value stack, the external variables yylval and yyval are declared to have type equal to this union. If yacc is invoked with the -d option, the union declaration is defined under the name YYSTYPE in the file y.tab.h.
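The way a lexical analyzer uses yylval with this union can be sketched in C as follows. This is an illustration under stated assumptions, not generated output: the typedef stands in for the definition that yacc -d would write to y.tab.h, yylval would normally be defined in the generated parser, and both the token code NUMBER and the helper scan_number are hypothetical.

```c
#include <stdlib.h>

/* Stand-in for the YYSTYPE definition that yacc -d places in y.tab.h.
   Written out by hand here only so the sketch compiles on its own. */
typedef union {
    char  *text;
    int    ival;
    double dub;
} YYSTYPE;

/* In a real parser this variable is defined by the generated code. */
YYSTYPE yylval;

/* Hypothetical token code; yacc would normally assign it for a
   name declared with %token. */
#define NUMBER 257

/* Hypothetical lexer helper: converts a digit string, stores the
   result in the ival member of yylval, and returns the token code. */
int scan_number(const char *digits)
{
    yylval.ival = atoi(digits);
    return NUMBER;
}
```

The lexical analyzer chooses the union member that matches the kind of value the token carries; the parser then copies yylval onto the value stack when it shifts the token.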
Once YYSTYPE is defined, the union member names must be associated with the various terminal and non-terminal names. This enables yacc to automatically associate the right type with the pseudo-variables used in actions so that the resulting parser is type-checked. For non-terminal symbols, this association is done using the %type keyword. The following declarations associate symbols with the members of the union in the example above:
%type <text> s1 s2
%type <ival> s3
%type <dub> s4
To associate a terminal symbol (token) with a union member name, the %token keyword is normally used. The following declaration associates the tokens s5 and s6 with the union member ival.
%token <ival> s5 s6
In some cases, these mechanisms are insufficient. For example, there is no default type for the value returned by an action that occurs in the middle of a rule. Similarly, yacc must be told explicitly about the type of left-context values such as $0. In such cases, a type can be imposed by inserting a union member name between angle brackets, `<' and `>', immediately after the first ``$'' in a pseudo-variable. The following example shows this usage.
stat : A
         { $<intval>$ = 3; }
       B
         { fun( $<intval>2, $<other>0 ); }
     ;
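The positions that these pseudo-variables refer to can be pictured with a toy value stack in C. This is a conceptual sketch only and does not reproduce the code that yacc generates: the union with members intval and other, the array vstack, the helper dollar, the body of fun, and the left-context string are all assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical union using the member names from the rule above. */
typedef union {
    int         intval;
    const char *other;
} YYSTYPE;

/* Toy value stack, large enough for this one rule. */
YYSTYPE vstack[8];

/* With base as the slot that holds $1, $n lives at base + n - 1,
   so $0 is the slot just below the rule's first symbol. */
YYSTYPE *dollar(int base, int n)
{
    return &vstack[base + n - 1];
}

/* Hypothetical stand-in for the fun() called in the second action. */
int fun(int n, const char *context)
{
    return n + (int)strlen(context);
}

/* Mimics the two actions attached to the rule for stat. */
int reduce_stat(void)
{
    int base = 1;                       /* slot of $1, the value of A */

    /* Left context: a value pushed before stat began to be parsed. */
    dollar(base, 0)->other = "ctx";

    /* First action: $<intval>$ = 3. The action counts as position 2,
       so its value is what the later action sees as $2. */
    dollar(base, 2)->intval = 3;

    /* B would occupy position 3. The second action then reads the
       action value as $<intval>2 and the left context as $<other>0. */
    return fun(dollar(base, 2)->intval, dollar(base, 0)->other);
}
```

The explicit member names are needed in both places because neither the embedded action nor the left-context slot corresponds to a symbol with a declared type.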
The keywords %left, %right, and %nonassoc can replace %token in the preceding examples. However, these keywords are used principally to deal with operator precedence and associativity. Both topics depend on material covered later, so they are deferred until ``Precedence''.