Bison

Structure of a yacc file:

 definitions
 %%
 yacc rules
 %%
 C code

definitions are:

 %token or %left

rules are:

 seq_stmts: /* empty */ | stmt seq_stmts { fprintf(stderr, "asdf"); } ;
 stmt: assignment ';' | somethingelse ';' ;
 assignment: V '=' expression ;
 somethingelse: /* a comment marks the epsilon (empty word) alternative */ ;

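parameters: with -d, yacc/bison also writes a header file (y.tab.h) containing the token definitions, for example: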

 #define IF 4711
 #define ELSE 4712
 .
 .

which can be included in the definition part of the lex file:

 %{
 #include "y.tab.h"
 %}

Steps are:
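1. bison -y -d example.y

creates y.tab.c and y.tab.h. The -y option selects the yacc file names; without it bison writes example.tab.c and example.tab.h.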

2. flex example.lex

example.lex can include the header file 'y.tab.h' generated in step 1. flex then generates lex.yy.c.

3. gcc y.tab.c -lfl

You have to include the file 'lex.yy.c' in your yacc/bison file, which is why only y.tab.c is passed to gcc. Then you can call yyparse() in main(). yyparse() itself calls yylex() again and again to get the next token.

token definition and start symbol

 %token token1 token2 ... tokenN
 %start lname

If you do not define a start symbol, the nonterminal on the left side of the first rule is the start symbol by default.

priority rules and associativity

An example of a left associative operator is /:

 36/6/2
 (36/6)/2 = 3
 36/(6/2) = 12

To tell yacc/bison to use the first bracketing scheme, use:

 %left '/'
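
The two bracketings can be checked in plain C. This is only a sketch: the function names fold_left_div and fold_right_div are made up for illustration, and all divisors are assumed to be non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Left-associative grouping: ((v[0] / v[1]) / v[2]) ...
   This is the shape that %left '/' selects. */
int fold_left_div(const int *v, size_t n)
{
    assert(n > 0);
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc = acc / v[i];
    return acc;
}

/* Right-associative grouping: v[0] / (v[1] / (v[2] ...))
   This is the shape that %right '/' would select. */
int fold_right_div(const int *v, size_t n)
{
    assert(n > 0);
    int acc = v[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        acc = v[i] / acc;
    return acc;
}
```

For the operands 36, 6, 2 the left fold gives 3 and the right fold gives 12, matching the two lines above.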

To bring priority (precedence) into the rules, list the operators from lowest to highest priority:

 %left '+' '-'
 %left '*' '/' '%'
 %left '^'

This tells yacc/bison that *, / and % have a higher priority than + and - and the same priority among themselves. ^ has the highest priority because it is declared last.

An example of a right-associative operator is '=':

 a = b = c
 a = (b = c)

In C, c is first assigned to the variable b, and the value of that assignment is then assigned to a. This is achieved by

 %right '='

If you do not want any associativity for an operator, use

 %nonassoc '='

With this you are not allowed to write a = b = c or similar expressions.

But what about minus used as a sign (unary minus)? We want:

 -2^4 = 16

Without an additional rule the parser would build -(2^4), which is -16. So we must add an additional priority level and rule:

 %left '+' '-'
 %left '*' '/' '%'
 %left '^'
 %left SIGN

 expr: expr '*' expr 
     | expr '+' expr 
     | expr '^' expr 
     | '-' expr %prec SIGN ;

%prec tells yacc to use the priority of SIGN for this rule. SIGN is declared last, so it is the highest priority here.
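
The effect of the SIGN level can be tried out with a small hand-written evaluator. This is a sketch only, not what yacc/bison generates: it uses precedence climbing instead of an LR table, all names (eval, parse_expr, PREC_SIGN, ...) are made up for illustration, and the input is assumed to be well formed, with non-negative exponents and non-zero divisors.

```c
#include <assert.h>
#include <ctype.h>

/* Priority levels, lowest first, mirroring the %left lines above. */
enum { PREC_ADD = 1, PREC_MUL = 2, PREC_POW = 3, PREC_SIGN = 4 };

static int binary_prec(char op)
{
    switch (op) {
    case '+': case '-':           return PREC_ADD;
    case '*': case '/': case '%': return PREC_MUL;
    case '^':                     return PREC_POW;
    default:                      return 0;   /* not a binary operator */
    }
}

static long ipow(long base, long exp)
{
    long r = 1;
    while (exp-- > 0)
        r *= base;
    return r;
}

static long apply(char op, long a, long b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return a / b;
    case '%': return a % b;
    default:  return ipow(a, b);   /* '^' */
    }
}

/* sign_prec is the priority given to unary minus (the %prec level). */
static long parse_expr(const char **s, int min_prec, int sign_prec)
{
    long lhs;

    while (isspace((unsigned char)**s)) (*s)++;
    if (**s == '-') {
        (*s)++;
        /* the operand may only contain operators that bind tighter than the sign */
        lhs = -parse_expr(s, sign_prec + 1, sign_prec);
    } else {
        assert(isdigit((unsigned char)**s));
        lhs = 0;
        while (isdigit((unsigned char)**s))
            lhs = lhs * 10 + (*(*s)++ - '0');
    }

    for (;;) {
        while (isspace((unsigned char)**s)) (*s)++;
        int prec = binary_prec(**s);
        if (prec == 0 || prec < min_prec)
            break;
        char op = *(*s)++;
        /* prec + 1 makes every binary operator left-associative */
        long rhs = parse_expr(s, prec + 1, sign_prec);
        lhs = apply(op, lhs, rhs);
    }
    return lhs;
}

long eval(const char *text, int sign_prec)
{
    return parse_expr(&text, 1, sign_prec);
}
```

eval("-2^4", PREC_SIGN) groups as (-2)^4, while eval("-2^4", PREC_ADD) models the grammar without %prec, where the sign only has the priority of '-', and groups as -(2^4).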

end of input stream

The parser recognizes the end of the input stream when the next token has the value 0 or a negative value. The parser returns 0 only if the sequence of tokens delivered by the lexer, not counting the end token, matches the start rule in syntax analysis. Otherwise it prints 'syntax error' and returns 1.
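
A lexer that follows this convention can be sketched in a few lines of C. The names set_input, next_token and count_tokens are invented for this example; a real lexer would be called yylex(). Each character is simply treated as its own token.

```c
#include <assert.h>
#include <stddef.h>

static const char *input = "";   /* hypothetical input buffer */

void set_input(const char *text)
{
    assert(text != NULL);
    input = text;
}

/* Same contract as yylex(): a positive token code, or 0 at end of input. */
int next_token(void)
{
    if (*input == '\0')
        return 0;                     /* end token: tells the parser to stop */
    return (unsigned char)*input++;   /* a character is its own token code */
}

/* Consumes tokens the way a parser would: until the end token arrives. */
int count_tokens(const char *text)
{
    int n = 0;
    set_input(text);
    while (next_token() > 0)
        n++;
    return n;
}
```

Once the buffer is exhausted, next_token() keeps returning 0, so the caller never reads past the end.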

yyparse() without a lexer file from lex

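 #include <stdio.h>

 /* every character is its own token; EOF is negative and so ends the input */
 int yylex(void) { return getchar(); }
 void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }

 int main(int argc, char *argv[])
 {
    yyparse();
    return 0;
 }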

yylval

 %union {
   float reel;
 }
 %token <reel> NUMBER

Here yylval no longer has the default type int. It is a union, and a NUMBER token carries its value in the float member yylval.reel.
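
On the C side this looks roughly as follows. It is a hand-written sketch, not generated output: the token code 258 for NUMBER is an assumed value (the real one comes from the generated header), and scan_string is a made-up helper that lets the lexer read from a string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Roughly what the %union declaration turns into. */
typedef union {
    float reel;
} YYSTYPE;

YYSTYPE yylval;

enum { NUMBER = 258 };          /* assumed token code, normally from y.tab.h */

static const char *cursor = ""; /* hypothetical input position */

void scan_string(const char *text)
{
    assert(text != NULL);
    cursor = text;
}

int yylex(void)
{
    while (*cursor == ' ')
        cursor++;
    if (*cursor == '\0')
        return 0;                        /* end of input */
    if (isdigit((unsigned char)*cursor)) {
        char *end;
        yylval.reel = strtof(cursor, &end);   /* value travels in yylval.reel */
        cursor = end;
        return NUMBER;                   /* token code says which member is valid */
    }
    return (unsigned char)*cursor++;     /* operators are their own token */
}
```

The return value of yylex() only says which kind of token was found. The number itself is passed to the parser through yylval.reel.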

 typedef union {
   float reel;
   char  str[30];
 } YYSTYPE;
 %token <reel> REEL
 extern YYSTYPE yylval;

 %type <reel> expr

 expr: expr '*' expr { $$ = $1 * $3; } ;
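
What the action of the expr '*' expr rule does can be pictured with a toy value stack. This is only a model of the idea, not the code bison generates, and SemValue and reduce_mul are invented names. $1, $2 and $3 are the values of the three symbols on the right-hand side, and $$ is the value that replaces them.

```c
#include <assert.h>

typedef union {
    float reel;
} SemValue;

/* Reduce "expr '*' expr": the three right-hand-side values are on top of the stack. */
void reduce_mul(SemValue *stack, int *top)
{
    assert(*top >= 3);
    SemValue *rhs = &stack[*top - 3];   /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    SemValue result;                    /* plays the role of $$ */

    result.reel = rhs[0].reel * rhs[2].reel;
    *top -= 3;                          /* pop the right-hand side */
    stack[(*top)++] = result;           /* push the value of the new expr */
}
```

The value of '*' itself ($2) is never used, because the operator carries no semantic value.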