DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 
lex

Disambiguating rules

lex follows two rules to resolve ambiguities that may arise from the lex specification. These rules are:

A situation of the type addressed by the first rule could arise if a specification had patterns to match keywords and identifiers:
   %{
   #include "y.tab.h"
   %}
   %%
   START                   { return(STARTTOK); }
   BREAK                   { return(BREAKTOK); }
   END                     { return(ENDTOK); }
   [a-zA-Z][a-zA-Z0-9]*    { return(yytext); }
The string "START" could be matched by both the first or the fourth rule: because START is a reserved word, you want only the action associated with the first rule to be executed. By placing the rule for START and the other reserved words before the rule for identifiers, you ensure that reserved words are recognized as such.

The second kind of ambiguity could arise, for example, if the input text was coded in a programming language that had operators that were similar. Part of a lex specification for the C language might look like this:

   "+"             {return(PLUS); }
   "++"            {return(INC); }
The lexical analyzer should recognize the increment operator "++", not the addition operator "+", when it reads the following statement:
   i++;

Next topic: Context sensitivity
Previous topic: Advanced lex usage

© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003