MultiSource/Applications/treecc/doc/treecc.texi - third_party/github.com/llvm/llvm-test-suite - Git at Google

 \input texinfo	@c -*-texinfo-*-
 @c %** start of header
 @setfilename treecc.info
 @settitle Tree Compiler-Compiler
 @setchapternewpage off
 @c %** end of header

 @dircategory DotGNU
 @direntry
 * TreeCC: (treecc).             Generate code for compilers to build
                                 abstract syntax trees.
 @end direntry

 @ifinfo
 The treecc program converts descriptions of abstract syntax
 trees into source code that can be used to support compiler
 development.

 @noindent
 Copyright @copyright{} 2001 Southern Storm Software, Pty Ltd
 @*Copyright @copyright{} 2003 Free Software Foundation, Inc.
 @end ifinfo

 @titlepage
 @sp 10
 @center @titlefont{Tree Compiler-Compiler}

 @vskip 50pt

 @center{Rhys Weatherley}

 @vskip 50pt
 @center{Copyright @copyright{} 2001, 2002 Southern Storm Software, Pty Ltd}
 @center{Copyright @copyright{} 2003 Free Software Foundation, Inc.}
 @end titlepage

 @c -----------------------------------------------------------------------

 @node Top, Overview, , (dir)
 @menu
 * Overview::                Treecc in brief
 * Expression Example::      A simple example of using treecc
 * Invoking Treecc::         Invoking treecc from the command-line
 * Syntax::                  Syntax of input files
 * Line Tracking::           Tracking line numbers in source files
 * Output APIs::             API's that are available in the generated output
 * Full Expression Example:: Full code for the expression example
 * EBNF Syntax::             Full EBNF syntax for treecc input files
 * Index::                   Index of concepts and facilities
 @end menu

 @c -----------------------------------------------------------------------

 @node Overview, Expression Example, Top, Top
 @chapter Overview
 @cindex Overview

 @section Introduction

 Traditional compiler construction tools such as lex and yacc focus on
 the lexical analysis and parsing phases of compilation.  But they
 provide very little to support semantic analysis and code generation.

 Yacc allows grammar rules to be tagged with semantic actions and values,
 but it doesn't provide any routines that assist in the process of tree
 building, semantic analysis, or code generation.  Because those processes
 are language-specific, yacc leaves the details to the programmer.

 Support for semantic analysis was also a lot simpler in the languages
 that were prevalent when lex and yacc were devised.  C and Pascal
 require declare before use, which allows the semantic information
 about a statement to be determined within the parser at the point of
 use.@footnote{K&R C did allow functions that weren't declared to be called,
 but only if they returned an "int".  This allowed the compiler to
 guess the declaration if it wasn't available, and to proceed as
 though all symbols were declared before use.}  If extensive optimization
 is not required, then code generation can also be performed within
 the grammar, leading to a simple one-pass compiler structure.

 Modern languages allow deferred declaration of methods, fields, and
 types.  For example, Java allows a method to refer to a field that
 is declared further down the .java source file.  A field can be
 declared with a type whose class definition has not yet been parsed.

 Hence, most of the semantic analysis that used to be performed inline
 within a yacc grammar must now be performed after the entire program
 has been parsed.  Tree building and walking is now more important
 than it was in older declare before use languages.

 @section Tree walking: the need for something better

 Building parse tree data structures and walking them is not terribly
 difficult, but it is extremely time-consuming and error-prone.  A
 modern programming language may have hundreds of node types, divided
 into categories for statements, expressions, types, declarations, etc.
 When a new programming language is being devised, new node types may
 be added quite frequently.  This has ramifications in trying to manage
 the code's complexity.@footnote{Implementing an existing programming
 language has the same problems as a new language.  Most programming
 languages are too large to be implemented all at once, and so the problem
 must be tackled in stages.  These stages are very similar to those the
 programmer goes through implementing a new language.}

 For example, consider nodes that correspond to programming language
 types in a C-like language.  There will be node types for integer
 types, floating-point types, pointers, structures, functions, etc.
 There will be semantic analysis routines for testing types for
 equality, comparing types for coercions and casts, evaluating the
 size of a type for memory layout purposes, determining if the type
 falls into a general category such as "integer" or "pointer", etc.

 Let's say we wanted to add a new "128-bit integer" type to this
 language.  Adding a new node type is fairly straight-forward.
 But we also need to track down every place in the code where the
 compiler walks a type or deals with integers and add an appropriate
 case for the new type.  This is very error-prone.  Such code is
 likely to be split over many files, and good coding practices only
 help to a certain extent.

 This problem gets worse when new kinds of expressions and statements
 are added to the language.  The change not only affects semantic
 analysis, but also optimization and code generation.  Some compilers
 use multiple passes over the tree to perform optimization, with
 different algorithms used in each pass.  Code generation may use a
 number of different strategies, depending upon how an expression or
 statement is used.  If even one of these places is missed when the
 new node type is added, then there is the potential for a very nasty
 bug that may go unnoticed for months or years.

 Object-oriented languages such as C++ can help a bit in constructing
 robust tree structures.  The base class can declare abstract methods
 for any semantic analysis, optimization, or code generation routine
 that needs to be implemented for all members of the node category.
 But another code maintainence problem arises.  What happens when
 we want to add a new optimization pass in the future?  We must go
 into hundreds of classes and implement the methods.

 To avoid changing hundreds of classes, texts on Design Patterns
 suggest using a Visitor pattern.  Then the new optimization pass
 can be encapsulated in a visitor.  This would work, except for
 the following drawback of visitor patterns, as described in Gamma,
 et al:

 @quotation
 @emph{The Visitor pattern makes it hard to add new subclasses of
 Element.  Each new ConcreteElement gives rise to a new abstract
 operation on Visitor and a corresponding implementation in
 every ConcreteVisitor class.}

 @emph{... The Visitor class hierarchy can be difficult to maintain
 when new ConcreteElement classes are added frequently.  In such
 cases, it's probably easier just to define operations on the
 classes that make up the structure.}
 @end quotation

 That is, if we add a new node type in the future, we have a large
 maintainence problem on our hands.  The solution is to scatter the
 implementation through-out every class, which is the situation we
 were trying to avoid by using the Visitor pattern.

 Because compiler construction deals with a large set of rapidly
 changing node types and operations, neither of the usual approaches
 work very well.

 The ideal programming language for designing compilers needs to have
 some way to detect when the programmer forgets to implement an operation
 for a new node type, and to ensure that a new operation covers all
 existing node types adequately.  Existing OO languages do not perform
 this kind of global error checking.  What few checking procedures they
 have change the maintainence problem into a different problem of
 similar complexity.

 @section Aspect-oriented programming

 A new field in language design has emerged in recent years called
 "Aspect-Oriented Programming" (AOP).  A good review of the field
 can be found in the October 2001 issue of the @emph{Communications of
 the ACM}, and on the AspectJ Web site, @url{http://www.aspectj.org/}.

 The following excerpt from the introduction to the AOP section in the
 CACM issue describes the essential aspects of AOP, and the difference
 between OOP and AOP:

 @quotation
 @emph{AOP is based on the idea that computer systems are better programmed
 by separately specifying the various concerns (properties or areas
 of interest) of a system and some description of their relationships,
 and then relying on mechanisms in the underlying AOP environment to
 weave or compose them together into a coherent program. ...
 While the tendancy in OOP's is to find commonality among classes
 and push it up the inheritance tree, AOP attempts to realize
 scattered concerns as first-class elements, and eject them
 horizontally from the object structure.}
 @end quotation

 Aspect-orientation gives us some hope of solving our compiler
 complexity problems.  We can view each operation on node types
 (semantic analysis, optimization, code generation, etc) as an
 "aspect" of the compiler's construction.  The AOP language weaves
 these aspects with the node types to create the final compiler.

 @section The treecc approach

 We don't really want to implement a new programming language
 just for compiler construction.  Especially since the new language's
 implementation would have all of the problems described above and would
 therefore also be difficult to debug and maintain.

 The approach that we take with "treecc" is similar to that used by
 "yacc".  A simple rule-based language is devised that is used to describe
 the intended behaviour declaratively.  Embedded code is used to provide
 the specific implementation details.  A translator then converts the input
 into source code that can be compiled in the usual fashion.

 The translator is responsible for generating the tree building and
 walking code, and for checking that all relevant operations have been
 implemented on the node types.  Functions are provided that make
 it easier to build and walk the tree data structures from within
 a "yacc" grammar and other parts of the compiler.

 @c -----------------------------------------------------------------------

 @node Expression Example, Invoking Treecc, Overview, Top
 @chapter A simple example for expressions
 @cindex Expression example

 Consider the following yacc grammar for a simple expression language:

 @example
 %token INT FLOAT

 %%

 expr: INT
     | FLOAT
     | '(' expr ')'
     | expr '+' expr
     | expr '-' expr
     | expr '*' expr
     | expr '/' expr
     | '-' expr
     ;
 @end example

 (We will ignore the problems of precedence and associativity and
 assume that the reader is familiar with how to resolve such issues
 in yacc grammars).

 There are 7 types of nodes for this grammar: @samp{intnum}, @samp{floatnum},
 @samp{plus}, @samp{minus}, @samp{multiply}, @samp{divide}, and @samp{negate}.
 They are defined in treecc as follows:

 @example
 %node expression %abstract %typedef

 %node binary expression %abstract =
 @{
     expression *expr1;
     expression *expr2;
 @}

 %node unary expression %abstract =
 @{
     expression *expr;
 @}

 %node intnum expression =
 @{
     int num;
 @}

 %node floatnum expression =
 @{
     float num;
 @}

 %node plus binary
 %node minus binary
 %node multiply binary
 %node divide binary
 %node negate unary
 @end example

 We have introduced three extra node types that refer
 to any expression, binary expressions, and unary expressions.  These
 can be seen as superclasses in an OO-style framework.  We have
 declared these node types as @samp{abstract} because the yacc grammar
 will not be permitted to create instances of these classes directly.

 The @samp{binary}, @samp{unary}, @samp{intnum}, and @samp{floatnum}
 node types have field definitions associated with them.  These have
 a similar syntax to C @code{struct} declarations.

 The yacc grammar is augmented as follows to build the parse tree:

 @example
 %union @{
     expression *node;
     int         inum;
     float       fnum;
 @}

 %token INT FLOAT

 %type <node> expr
 %type <inum> INT
 %type <fnum> FLOAT

 %%

 expr: INT               @{ $$ = intnum_create($1); @}
     | FLOAT             @{ $$ = floatnum_create($1); @}
     | '(' expr ')'      @{ $$ = $2; @}
     | expr '+' expr     @{ $$ = plus_create($1, $3); @}
     | expr '-' expr     @{ $$ = minus_create($1, $3); @}
     | expr '*' expr     @{ $$ = multiply_create($1, $3); @}
     | expr '/' expr     @{ $$ = divide_create($1, $3); @}
     | '-' expr          @{ $$ = negate_create($2); @}
     ;
 @end example

 The treecc translator generates the @samp{*_create} functions so that
 the rest of the compiler can build the necessary data structures
 on demand.  The parameters to the @samp{*_create} functions
 are identical in type and order to the members of the structure for
 that node type.

 Because @samp{expression}, @samp{binary}, and @samp{unary} are abstract,
 there will be no @samp{*_create} functions associated with them.  This will
 help the programmer catch certain kinds of errors.

 The type that is returned from a @samp{*_create} function is the first
 superclass of the node that has a @samp{%typedef} keyword associated with it;
 @samp{expression *} in this case.

 @section Storing extra information

 Normally we will want to store extra information with a node beyond
 that which is extracted by the yacc grammar.  In our expression
 example, we probably want to store type information in the nodes
 so that we can determine if the whole expression is integer or
 floating point during semantic analysis.  We can add type information
 to the @samp{expression} node type as follows:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 @end example

 The @samp{%nocreate} flag indicates that the field should not be passed
 to the @samp{*_create} functions as a parameter.  i.e. it provides semantic
 information that isn't present in the grammar.  When nodes are created,
 any fields that are declared as @samp{%nocreate} will be undefined in value.
 A default value can be specified as follows:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type = @{int_type@};
 @}
 @end example

 Default values must be enclosed in @samp{@{} and @samp{@}} because they are
 pieces of code in the underlying source language (C, C++, etc), instead
 of tokens in the treecc syntax.  Any legitimate expression in the
 underlying source language may be used.

 We also need to arrange for @samp{type_code} to be declared.  One way to
 do this is by adding a @samp{%decls} section to the front of the treecc
 input file:

 @example
 %decls %@{

 typedef enum
 @{
     int_type,
     float_type

 @} type_code;

 %@}
 @end example

 We could have introduced the definition by placing a @samp{#include}
 directive into the @samp{%decls} section instead, or by defining a
 treecc enumerated type:

 @example
 %enum type_code =
 @{
     int_type,
     float_type
 @}
 @end example

 Now that we have these definitions, type-inferencing can be implemented
 as follows:

 @example
 %operation void infer_type(expression *e)

 infer_type(binary)
 @{
     infer_type(e->expr1);
     infer_type(e->expr2);

     if(e->expr1->type == float_type || e->expr2->type == float_type)
     @{
         e->type = float_type;
     @}
     else
     @{
         e->type = int_type;
     @}
 @}

 infer_type(unary)
 @{
     infer_type(e->expr);
     e->type = e->expr->type;
 @}

 infer_type(intnum)
 @{
     e->type = int_type;
 @}
 @end example

 This example demonstrates using the abstract node types @samp{binary} and
 @samp{unary} to define operations on all subclasses.  The treecc translator
 will generate code for a full C function called @samp{infer_type} that
 incorporates all of the cases.

 But hang on a second!  What happened to @samp{floatnum}?  Where did it
 go?  It turns out that treecc will catch this.  It will report
 an error to the effect that @samp{node type `floatnum' is not handled in
 operation `infer_type'}.  Here is its definition:

 @example
 infer_type(floatnum)
 @{
     e->type = float_type;
 @}
 @end example

 As we can see, treecc has just caught a bug in the language
 implementation and reported it to us as soon as we introduced it.

 Let's now extend the language with a @samp{power} operator:

 @example
 yacc:

 expr: expr '^' expr     @{ $$ = create_power($1, $3); @}
     ;

 treecc:

 %node power binary
 @end example

 That's all there is to it!  When treecc re-translates the input
 file, it will modify the definition of @samp{infer_type} to include the
 extra case for @samp{power} nodes.  Because @samp{power} is a subclass of
 @samp{binary}, treecc already knows how to perform type inferencing for the
 new node and it doesn't warn us about a missing declaration.

 What if we wanted to restrict the second argument of @samp{power} to be
 an integer value?  We can add the following case to @samp{infer_type}:

 @example
 infer_type(power)
 @{
     infer_type(e->expr1);
     infer_type(e->expr2);

     if(e->expr2->type != int_type)
     @{
         error("second argument to `^' is not an integer");
     @}

     e->type = e->expr1->type;
 @}
 @end example

 The translator now notices that there is a more specific implementation
 of @samp{infer_type} for @samp{power}, and won't use the @samp{binary}
 case for it.

 The most important thing to realise here is that the translator always
 checks that there are sufficient declarations for @samp{infer_type} to cover
 all relevant node types.  If it detects a lack, it will immediately
 raise an error to the user.  This allows tree coverage problems to
 be found a lot sooner than with the traditional approach.

 @xref{Full Expression Example}, for a complete listing of the above
 example files.

 @c -----------------------------------------------------------------------

 @node Invoking Treecc, Syntax, Expression Example, Top
 @chapter Invoking treecc from the command-line
 @cindex Invoking treecc
 @cindex Command-line options

 The general form of treecc's command-line syntax is as follows:

 @example
 treecc [OPTIONS] INPUT ...
 @end example

 Treecc accepts the following command-line options:

 @table @code
 @item -o FILE
 @itemx --output FILE
 Set the name of the output file to @samp{FILE}.  If this option is not
 supplied, then the name of the first input file will be used, with its
 extension changed to @samp{.c}.  If the input is standard input,
 the default output file is @samp{yy_tree.c}.

 This option may be overridden using the @samp{%output} keyword in
 the input files.

 @item -h FILE
 @itemx --header FILE
 Set the name of the header output file to @samp{FILE}.  This is only
 used for the C and C++ output languages.  If this option is not supplied,
 then the name of the output file will be used, with its extension
 changed to @samp{.h}.  If the input is standard input, the default header
 output file is @samp{yy_tree.h}.

 This option may be overriden using the @samp{%header} keyword in the
 input files.  If this option is used with a language that does not require
 headers, it will be ignored.

 @item -d DIR
 @itemx --output-dir DIR
 Set the name of the Java output directory to @samp{DIR}.  This is only
 used for the Java language.  If this option is not supplied, then the
 directory corresponding to the first input file is used.  If the input
 is standard input, the default is the current directory.

 This option may be overriden using the @samp{%outdir} keyword in the
 input files.  If this option is used with a language other than Java,
 it will be ignored.

 @item -e EXT
 @itemx --extension EXT
 Change the default output file extension to @samp{ext}, instead of
 @samp{.c}.  The value @samp{ext} can have a leading dot, but this is
 not required.

 @item -f
 @itemx --force-create
 Treecc normally attempts to optimise the creation of output files
 so that they are only modified if a non-trivial change has
 occurred in the input.  This can reduce the number of source
 code recompiles when treecc is used in combination with make.

 This option forces the output files to be created, even if they
 are the same as existing files with the same name.

 The declaration @samp{%option force} can be used in the input files
 to achieve the same effect as this option.

 @item -O OPT
 @itemx --option OPT
 Set a treecc option value.  This is a command-line version of
 the @samp{%option} keyword in the input files.

 @item -n
 @itemx --no-output
 Suppress the generation of output files.  Treecc parses the
 input files, checks for errors, and then stops.

 @item --help
 Print a usage message for the treecc program.

 @item -v
 @itemx --version
 Print the version of the treecc program.

 @item --
 Marks the end of the command-line options, and the beginning of
 the input filenames.  You may need to use this if your filename
 begins with @samp{-}.  e.g. @samp{treecc -- -input.tc}.  This is
 not needed if the input is standard input: @samp{treecc -}
 is perfectly valid.
 @end table

 @c -----------------------------------------------------------------------

 @node Syntax, Nodes, Invoking Treecc, Top
 @chapter Syntax of input files
 @cindex Syntax

 Treecc input files consist of zero or more declarations that define
 nodes, operations, options, etc.  The following sections describe each
 of these elements.

 @menu
 * Nodes::          Node declarations
 * Types::          Types used in fields and parameters
 * Enumerations::   Enumerated type declarations
 * Operations::     Operation declarations
 * Options::        Options that modify treecc's behaviour
 * Literal Code::   Literal code declarations
 * Changing Files:: Changing input and output files
 @end menu

 @c -----------------------------------------------------------------------

 @node Nodes, Types, Syntax, Syntax
 @section Node declarations
 @cindex Nodes
 @cindex %node keyword
 @cindex Fields

 Node types are defined using the @samp{node} keyword in input files.
 The general form of the declaration is:

 @example
 %node NAME [ PNAME ] [ FLAGS ] [ = FIELDS ]
 @end example

 @table @samp
 @item NAME
 An identifier that is used to refer to the node type elsewhere
 in the treecc definition.  It is also the name of the type that will be
 visible to the programmer in literal code blocks.

 @item PNAME
 An identifier that refers to the parent node type that @samp{NAME} inherits
 from.  If @samp{PNAME} is not supplied, then @samp{NAME} is a top-level
 declaration.  It is legal to supply a @samp{PNAME} that has not yet
 been defined in the input.

 @item FLAGS
 Any combination of @samp{%abstract} and @samp{%typedef}:

 @table @samp
 @item %abstract
 @cindex %abstract keyword
 The node type cannot be constructed by the programmer.  In addition,
 the programmer does not need to define operation cases for this node
 type if all subtypes have cases associated with them.

 @item %typedef
 @cindex %typedef keyword
 The node type is used as the common return type for node creation
 functions.  Top-level declarations must have a @samp{%typedef} keyword.
 @end table
 @end table

 The @samp{FIELDS} part of a node declaration defines the fields that
 make up the node type.  Each field has the following general form:

 @example
 [ %nocreate ] TYPE FNAME [ = VALUE ] ';'
 @end example

 @table @samp
 @item %nocreate
 @cindex %nocreate keyword
 The field is not used in the node's constructor.  When the node is
 constructed, the value of this field will be undefined unless
 @samp{VALUE} is specified.

 @item TYPE
 The type that is associated with the field.  Types can be declared
 using a subset of the C declaration syntax, augmented with some C++
 and Java features.  @xref{Types}, for more information.

 @item FNAME
 The name to associate with the field.  Treecc verifies that the field
 does not currently exist in this node type, or in any of its ancestor
 node types.

 @item VALUE
 The default value to assign to the field in the node's constructor.
 This can only be used on fields that are declared with @samp{%nocreate}.
 The value must be enclosed in braces.  For example @samp{@{NULL@}} would
 be used to initialize a field with @samp{NULL}.

 The braces are required because the default value is expressed in
 the underlying source language, and can use any of the usual constant
 declaration features present in that language.
 @end table

 When the output language is C, treecc creates a struct-based type
 called @samp{NAME} that contains the fields for @samp{NAME} and
 all of its ancestor classes.  The type also contains some house-keeping
 fields that are used internally by the generated code.  The following
 is an example:

 @example
 typedef struct binary__ binary;
 struct binary__ @{
     const struct binary_vtable__ *vtable__;
     int kind__;
     char *filename__;
     long linenum__;
     type_code type;
     expression * expr1;
     expression * expr2;
 @};
 @end example

 The programmer should avoid using any identifier that
 ends with @samp{__}, because it may clash with house-keeping
 identifiers that are generated by treecc.

 When the output language is C++, Java, or C#, treecc creates a class
 called @samp{NAME}, that inherits from the class @samp{PNAME}.
 The field definitions for @samp{NAME} are converted into public members
 in the output.

 @c -----------------------------------------------------------------------

 @node Types, Enumerations, Nodes, Syntax
 @section Types used in fields and parameters
 @cindex Types

 Types that are used in field and parameter declarations have a
 syntax which is subset of features found in C, C++, and Java:

 @example
 TypeAndName ::= Type [ IDENTIFIER ]

 Type ::= TypeName
        | Type '*'
        | Type '&'
        | Type '[' ']'

 TypeName ::= IDENTIFIER @{ IDENTIFIER @}
 @end example

 Types are usually followed by an identifier that names the field or
 parameter.  The name is required for fields and is optional for parameters.
 For example @samp{int} is usually equivalent to @samp{int x} in parameter
 declarations.

 The following are some examples of using types:

 @example
 int
 int x
 const char *str
 expression *expr
 Element[][] array
 Item&
 unsigned int y
 const Element
 @end example

 The grammar used by treecc is slightly ambiguous.  The last example above
 declares a parameter called @samp{Element}, that has type @samp{const}.
 The programmer probably intended to declare an anonymous parameter with type
 @samp{const Element} instead.

 This ambiguity is unavoidable given that treecc is not fully
 aware of the underlying language's type system.  When treecc
 sees a type that ends in a sequence of identifiers, it will
 always interpret the last identifier as the field or parameter
 name.  Thus, the programmer must write the following instead:

 @example
 const Element e
 @end example

 Treecc cannot declare types using the full power of C's type system.
 The most common forms of declarations are supported, and the rest
 can usually be obtained by defining a @samp{typedef} within a
 literal code block.  @xref{Literal Code}, for more information
 on literal code blocks.

 It is the responsibility of the programmer to use type constructs
 that are supported by the underlying programming language.  Types such
 as @samp{const char *} will give an error when the output is compiled
 with a Java compiler, for example.

 @c -----------------------------------------------------------------------

 @node Enumerations, Operations, Types, Syntax
 @section Enumerated type declarations
 @cindex Enumerations
 @cindex enum declaration
 @cindex %enum keyword

 Enumerated types are a special kind of node type that can be used
 by the programmer for simple values that don't require a full abstract
 syntax tree node.  The following is an example of defining a list
 of the primitive machine types used in a Java virtual machine:

 @example
 %enum JavaType =
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF
 @}
 @end example

 Enumerations are useful when writing code generators and type
 inferencing routines.  The general form is:

 @example
 %enum NAME = @{ VALUES @}
 @end example

 @table @samp
 @item NAME
 An identifier to be used to name the enumerated type.  The name must
 not have been previously used as a node type, an enumerated type, or
 an enumerated value.

 @item VALUES
 A comma-separated list of identifiers that name the values within
 the enumeration.  Each of the names must be unique, and must not have
 been used previously as a node type, an enumerated type, or an
 enumerated value.
 @end table

 Logically, each enumerated value is a special node type that inherits from
 a parent node type corresponding to the enumerated type @samp{NAME}.

 When the output language is C or C++, treecc generates an enumerated
 typedef for @samp{NAME} that contains the enumerated values in the
 same order as was used in the input file.  The typedef name can be
 used elsewhere in the code as the type of the enumeration.

 When the output language is Java, treecc generates a class called
 @samp{NAME} that contains the enumerated values as integer constants.
 Elsewhere in the code, the type @samp{int} must be used to declare
 variables of the enumerated type.  Enumerated values are referred
 to as @samp{NAME.VALUE}.  If the enumerated type is used as a trigger
 parameter, then @samp{NAME} must be used instead of @samp{int}:
 treecc will convert the type when the Java code is output.

 When the output language is C#, treecc generates an enumerated value
 type called @samp{NAME} that contains the enumerated values as
 members.  The C# type @samp{NAME} can be used elsewhere in the code
 as the type of the enumeration.  Enumerated values are referred to
 as @samp{NAME.VALUE}.

 @c -----------------------------------------------------------------------

 @node Operations, Options, Enumerations, Syntax
 @section Operation declarations
 @cindex Operations
 @cindex operation declarations
 @cindex operation cases
 @cindex %operation keyword
 @cindex trigger parameters

 Operations are declared in two parts: the declaration, and the
 cases.  The declaration part defines the prototype for the
 operation and the cases define how to handle specific kinds of
 nodes for the operation.

 Operations are defined over one or more trigger parameters.  Each
 trigger parameter specifies a node type or an enumerated type that
 is selected upon to determine what course of action to take.  The
 following are some examples of operation declarations:

 @example
 %operation void infer_type(expression *e)
 %operation type_code common_type([type_code t1], [type_code t2])
 @end example

 Trigger parameters are specified by enclosing them in square
 brackets.  If none of the parameters are enclosed in square
 brackets, then treecc assumes that the first parameter is the
 trigger.

 The general form of an operation declaration is as follows:

 @example
 %operation @{ %virtual | %inline | %split @} RTYPE [CLASS::]NAME(PARAMS)
 @end example

 @table @samp
 @item %virtual
 @cindex %virtual keyword
 Specifies that the operation is associated with a node type as
 a virtual method.  There must be only one trigger parameter,
 and it must be the first parameter.

 Non-virtual operations are written to the output source files
 as global functions.

 @item %inline
 @cindex %inline keyword
 Optimise the generation of the operation code so that all cases
 are inline within the code for the function itself.  This can
 only be used with non-virtual operations, and may improve
 code efficiency if there are lots of operation cases with a
 small amount of code in each.

 @item %split
 @cindex %split keyword
 Split the generation of the multi-trigger operation code across
 multiple functions, to reduce the size of each individual function.
 It is sometimes necessary to split large @code{%inline} operations
 to avoid compiler limits on function size.

 @item RTYPE
 The type of the return value for the operation.  This should be
 @samp{void} if the operation does not have a return value.

 @item CLASS
 The name of the class to place the operation's definition within.
 This can only be used with non-virtual operations, and is
 intended for languages such as Java and C# that cannot declare
 methods outside of classes.  The class name will be ignored if
 the output language is C.

 If a class name is required, but the programmer did not supply it,
 then @samp{NAME} will be used as the default.  The exception to
 this is the C# language: @samp{CLASS} must always be supplied and
 it must be different from @samp{NAME}.  This is due to a "feature"
 in some C# compilers that forbid a method with the same name as
 its enclosing class.

 @item NAME
 The name of the operation.

 @item PARAMS
 The parameters to the operation.  Trigger parameters may be
 enclosed in square brackets.  Trigger parameters must be
 either node types or enumerated types.
 @end table

 Once an operation has been declared, the programmer can specify
 its cases anywhere in the input files.  It is not necessary that
 the cases appear after the operation, or that they be contiguous
 within the input files.  This permits the programmer to place
 operation cases where they are logically required for maintainence
 reasons.

 There must be sufficient operation cases defined to cover every
 possible combination of node types and enumerated values that
 inherit from the specified trigger types.  An operation case
 has the following general form:

 @example
 NAME(TRIGGERS) [, NAME(TRIGGERS2) ...]
 @{
     CODE
 @}
 @end example

 @table @samp
 @item NAME
 The name of the operation for which this case applies.

 @item TRIGGERS
 A comma-separated list of node types or enumerated values that
 define the specific case that is handled by the following code.

 @item CODE
 Source code in the output source language that implements the
 operation case.
 @end table

 Multiple trigger combinations can be associated with a single
 block of code, by listing them all, separated by commas.  For
 example:

 @example
 common_type(int_type, int_type)
 @{
     return int_type;
 @}

 common_type(int_type, float_type),
 common_type(float_type, int_type),
 common_type(float_type, float_type)
 @{
     return float_type;
 @}
 @end example

 @c -----------------------------------------------------------------------

 @node Options, Literal Code, Operations, Syntax
 @section Options that modify treecc's behaviour
 @cindex Options
 @cindex option declaration
 @cindex %option keyword

 "(*)" is used below to indicate an option that is enabled by default.

 @table @samp
 @item %option track_lines
 @cindex track_lines option
 Enable the generation of code that can track the current filename and
 line number when nodes are created.  @xref{Line Tracking}, for more
 information. (*)

 @item %option no_track_lines
 @cindex no_track_lines option
 Disable the generation of code that performs line number tracking.

 @item %option singletons
 @cindex singletons option
 Optimise the creation of singleton node types.  These are
 node types without any fields.  Treecc can optimise the code
 so that only one instance of a singleton node type exists in
 the system.  This can speed up the creation of nodes for
 constants within compilers. (*)

 Singleton optimisations will have no effect if @samp{track_lines}
 is enabled, because line tracking uses special hidden fields in
 every node.

 @item %option no_singletons
 @cindex no_singletons option
 Disable the optimisation of singleton node types.

 @item %option reentrant
 @cindex reentrant option
 Enable the generation of reentrant code that does not rely
 upon any global variables.  Separate copies of the compiler
 state can be used safely in separate threads.  However, the
 same copy of the compiler state cannot be used safely in two or
 more threads.

 @item %option no_reentrant
 @cindex no_reentrant option
 Disable the generation of reentrant code.  The interface to
 node management functions is simpler, but cannot be used
 in a threaded environment. (*)

 @item %option force
 @cindex force option
 Force output source files to be written, even if they are
 unchanged.  This option can also be set using the @samp{-f}
 command-line option.

 @item %option no_force
 @cindex no_force option
 Don't force output source files to be written if they are the
 same as before. (*)

 This option can help smooth integration of treecc with make.
 Only those output files that have changed will be modified.
 This reduces the number of files that the underlying source
 language compiler must process after treecc is executed.

 @item %option virtual_factory
 @cindex virtual_factory option
 Use virtual methods in the node type factories, so that the
 programmer can subclass the factory and provide new
 implementations of node creation functions.  This option is
 ignored for C, which does not use factories.

 @item %option no_virtual_factory
 @cindex no_virtual_factory option
 Don't use virtual methods in the node type factories. (*)

 @item %option abstract_factory
 @cindex abstract_factory option
 Use abstract virtual methods in the node type factories.
 The programmer is responsible for subclassing the factory
 to provide node creation functionality.

 @item %option no_abstract_factory
 @cindex no_abstract_factory option
 Don't use abstract virtual methods in the node type factories. (*)

 @item %option kind_in_node
 @cindex kind_in_node option
 Put the kind field in the node, for more efficient access at runtime. (*)

 @item %option kind_in_vtable
 @cindex kind_in_vtable option
 Put the kind field in the vtable, and not the node.  This saves some
 memory, at the cost of slower access to the kind value at runtime.
 This option only applies when the language is C.  The kind field is
 always placed in the node in other languages, because it isn't possible
 to modify the vtable.

 @item %option prefix = PREFIX
 @cindex prefix option
 Specify the prefix to be used in output files in place of "yy".

 @item %option state_type = NAME
 @cindex state_type option
 Specify the name of the state type.  The state type is generated
 by treecc to perform centralised memory management and reentrancy
 support.  The default value is @samp{YYNODESTATE}.  If the output language
 uses factories, then this will also be the name of the factory
 base class.

 @item %option namespace = NAME
 @cindex namespace option
 Specify the namespace to write definitions to in the output
 source files.  This option is ignored when the output language
 is C.

 @item %option package = NAME
 @cindex package option
 Same as @samp{%option namespace = NAME}.  Provided because @samp{package}
 is more natural for Java programmers.

 @item %option base = NUM
 @cindex base option
 Specify the numeric base to use for allocating numeric values to
 node types.  By default, node type allocation begins at 1.

 @item %option lang = LANGUAGE
 @cindex lang option
 Specify the output language.  Must be one of @code{"C"}, @code{"C++"},
 @code{"Java"}, or @code{"C#"}.  The default is @code{"C"}.

 @item %option block_size = NUM
 @cindex block_size option
 Specify the size of the memory blocks to use in C and C++ node allocators.

 @item %option strip_filenames
 @cindex strip_filenames option
 Strip filenames down to their base name in @code{#line} directives.
 i.e. strip off the directory component.  This can be helpful in
 combination with the @code{%include %readonly} command when
 treecc input files may processed from different directories,
 causing common output files to change unexpectedly.

 @item %option no_strip_filenames
 @cindex no_strip_filenames option
 Don't strip filenames in @code{#line} directives. (*)

 @item %option internal_access
 @cindex internal_access option
 Use @code{internal} as the access mode for classes in C#, rather than
 @code{public}.

 @item %option public_access
 @cindex public_access option
 Use @code{public} as the access mode for classes in C#, rather than
 @code{internal}. (*)

 @item %option print_lines
 @cindex print_lines option
 Print @code{#line} markers in languages that use them. (*)

 @item %option no_print_lines
 @cindex no_print_lines option
 Do not print @code{#line} markers, even in languages that normally
 use them.

 @item %option allocator
 @cindex allocator option
 Use treecc's standard node allocator for C and C++.  This option has
 no effect for other output languages. (*)

 @item %option no_allocator
 @cindex no_allocator option
 Do not use treecc's standard node allocator for C and C++.  This can be
 useful when the programmer wants to redirect node allocation to their
 own routines.

 @item %option gc_allocator
 @cindex gc_allocator option
 Use libgc as a garbage-collecting node allocator for C and C++.  This
 option has no effect for other output languages.

 @item %option no_gc_allocator
 @cindex no_gc_allocator option
 Do not use libgc as a garbage-collecting node allocator for C and C++. (*)

 @item %option base_type
 @cindex base_type option
 Specify the base type for the root node of the treecc node heirarchy.
 The default is no base type.

 @end table

 @c -----------------------------------------------------------------------

 @node Literal Code, Changing Files, Options, Syntax
 @section Literal code declarations
 @cindex Literal code

 Sometimes it is necessary to embed literal code within output @samp{.h}
 and source files.  Usually this is to @samp{#include} definitions
 from other files, or to define functions that cannot be easily expressed
 as operations.

 A literal code block is specified by enclosing it in @samp{%@{} and
 @samp{%@}}.  The block can also be prefixed with the following flags:

 @table @samp
 @item %decls
 @cindex %decls keyword
 Write the literal code to the currently active declaration header file,
 instead of the source file.

 @item %both
 @cindex %both keyword
 Write the literal code to both the currently active declaration header file
 and the currently active source file.

 @item %end
 @cindex %end keyword
 Write the literal code to the end of the file, instead of the beginning.
 @end table

 Another form of literal code block is one which begins with @samp{%%} and
 extends to the end of the current input file.  This form implicitly has
 the @samp{%end} flag.

 @c -----------------------------------------------------------------------

 @node Changing Files, Line Tracking, Literal Code, Syntax
 @section Changing input and output files
 @cindex Changing files

 Most treecc compiler definitions will be too large to be manageable
 in a single input file.  They also will be too large to write to a
 single output file, because that may overload the source language
 compiler.

 Multiple input files can be specified on the command-line, or
 they can be explicitly included by other input files with
 the following declarations:

 @table @samp
 @item %include [ %readonly ] FILENAME
 @cindex include declaration
 @cindex %include keyword
 @cindex %readonly keyword
 Include the contents of the specified file at the current point
 within the current input file.  @samp{FILENAME} is interpreted
 relative to the name of the current input file.

 If the @samp{%readonly} keyword is supplied, then any output
 files that are generated by the included file must be read-only.
 That is, no changes are expected by performing the inclusion.

 The @samp{%readonly} keyword is useful for building compilers
 in layers.  The programmer may group a large number of useful
 node types and operations together that are independent of the
 particulars of a given language.  The programmer then defines
 language-specific compilers that "inherit" the common definitions.

 Read-only inclusions ensure that any extensions that are added
 by the language-specific parts do not "leak" into the common code.
 @end table

 Output files can be changed using the follow declarations:

 @table @samp
 @item %header FILENAME
 @cindex header declaration
 @cindex %header keyword
 Change the currently active declaration header file to @samp{FILENAME},
 which is interpreted relative to the current input file.  This option
 has no effect for languages without header files (Java and C#).

 Any node types and operations that are defined after a @samp{%header}
 declaration will be declared in @samp{FILENAME}.

 @item %output FILENAME
 @cindex output declaration
 @cindex %output keyword
 Change the currently active source file to @samp{FILENAME},
 which is interpreted relative to the current input file.  This option
 has no effect for languages that require a single class per file (Java).

 Any node types and operations that are defined after a @samp{%header}
 declaration will have their implementations placed in @samp{FILENAME}.

 @item %outdir DIRNAME
 @cindex outdir declaration
 @cindex %outdir keyword
 Change the output source directory to @samp{DIRNAME}.  This is only
 used for Java, which requires that a single file be used for each class.
 All classes are written to the specified directory.  By default,
 @samp{DIRNAME} is the current directory where treecc was invoked.
 @end table

 When treecc generates the output source code, it must insert several
 common house-keeping functions and classes into the code.  By default,
 these are written to the first header and source files.  This can
 be changed with the @samp{%common} declaration:

 @table @samp
 @item %common
 @cindex common declaration
 @cindex %common keyword
 Output the common house-keeping code to the currently active
 declaration header file and the currently active source file.
 This is typically used as follows:

 @example
 %header "common.h"
 %output "common.c"
 %common
 @end example
 @end table

 @c -----------------------------------------------------------------------

 @node Line Tracking, Output APIs, Changing Files, Top
 @chapter Tracking line numbers in source files
 @cindex Line tracking

 When compilers emit error messages to the programmer, it is generally
 a good idea to indicate which file and which line gave rise to the
 error.  Syntax errors can be emitted fairly easily because the parser
 usually has access to the current line number.  However, semantic
 errors are harder to report because the parser may no longer be
 active when the error is detected.

 Treecc can generate code that automatically keeps track of what line
 in the source file was active when a node is created.  Every node
 has two extra private fields that specify the name of the file and the
 line number.  Semantic analysis routines can query this information
 when reporting errors.

 Because treecc is not aware of how to obtain this information, the
 programmer must supply some additional functions.  @xref{Output APIs},
 for more information.

 @xref{Output APIs}, for more information.

 @c -----------------------------------------------------------------------

 @node Output APIs, C Language, Line Tracking, Top
 @chapter API's available in the generated output
 @cindex Output APIs

 The source code that is generated by treecc exports a number of
 application programmer interfaces (API's) to the programmer.  These
 can be used elsewhere in the compiler implementation to manipulate
 abstract syntax trees.  The following sections describe the API's
 for each of the output languages.

 @menu
 * C Language::     C Language API's
 * C++ Language::   C++ Language API's
 * Java Language::  Java Language API's
 * C# Language::    C# Language API's
 * Ruby Language::  Ruby Language API's
 @end menu

 @c -----------------------------------------------------------------------

 @node C Language, C++ Language, Output APIs, Output APIs
 @section C Language APIs
 @cindex C APIs

 In the C output language, each node type is converted into a @samp{typedef}
 that contains the node's fields, and the fields of its ancestor node
 types.  The following example demonstrates how treecc node declarations
 are converted into C source code:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 %node binary expression %abstract =
 @{
     expression *expr1;
     expression *expr2;
 @}
 %node plus binary
 @end example

 becomes:

 @example
 typedef struct expression__ expression;
 typedef struct binary__ binary;
 typedef struct plus__ plus;

 struct expression__ @{
     const struct expression_vtable__ *vtable__;
     int kind__;
     char *filename__;
     long linenum__;
     type_code type;
 @};
 struct binary__ @{
     const struct binary_vtable__ *vtable__;
     int kind__;
     char *filename__;
     long linenum__;
     type_code type;
     expression * expr1;
     expression * expr2;
 @};
 struct plus__ @{
     const struct plus_vtable__ *vtable__;
     int kind__;
     char *filename__;
     long linenum__;
     type_code type;
     expression * expr1;
     expression * expr2;
 @};
 @end example

 Programmers should avoid using any identifiers that end in
 @samp{__}.  Such identifiers are reserved for internal use by treecc
 and its support routines.

 For each non-abstract node type called @samp{NAME}, treecc generates a
 function called @samp{NAME_create} that creates nodes of that type.
 The general form of the function's prototype is as follows:

 @example
 TYPE *NAME_create([YYNODESTATE *state,] PARAMS)
 @end example

 @table @samp
 @item TYPE
 The return node type, which is the nearest ancestor that has the
 @samp{%typedef} flag.

 @item NAME
 The name of the node type that is being created.

 @item state
 The system state, if reentrant code is being generated.

 @item PARAMS
 The create parameters, consisting of every field that does not
 have the @samp{%nocreate} flag.  The parameters appear in the
 same order as the fields in the node types, from the top-most
 ancestor down to the node type itself.  For example:

 @example
 expression *plus_create(expression * expr1, expression * expr2);
 @end example
 @end table

 Enumerated types are converted into a C @samp{typedef} with the
 same name and values:

 @example
 %enum JavaType =
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF
 @}
 @end example

 becomes:

 @example
 typedef enum
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF

 @} JavaType;
 @end example

 Virtual operations are converted into C macros that invoke the
 correct vtable entry on a node type:

 @example
 %operation %virtual void infer_type(expression *e)
 @end example

 becomes:

 @example
 #define infer_type(this__) \
     ((*(((struct expression_vtable__ *) \
         ((this__)->vtable__))->infer_type_v__)) \
             ((expression *)(this__)))
 @end example

 Calls to @samp{infer_type} can then be made with @samp{infer_type(node)}.

 Non-virtual operations are converted into C functions:

 @example
 %operation void infer_type(expression *e)
 @end example

 becomes:

 @example
 extern void infer_type(expression *e);
 @end example

 Because virtual and non-virtual operations use a similar call syntax,
 it is very easy to convert a virtual operation into a non-virtual
 operation when the output language is C.  This isn't possible with
 the other output languages.

 Other house-keeping tasks are performed by the following functions
 and macros.  Some of these must be supplied by the programmer.
 The @samp{state} parameter is required only if a reentrant compiler is
 being built.

 @table @code
 @item int yykind(ANY *node)
 @cindex yykind macro
 Gets the numeric kind value associated with a particular node.
 The kind value for node type @samp{NAME} is called @samp{NAME_kind}.

 @item const char *yykindname(ANY *node)
 @cindex yykindname macro
 Gets the name of the node kind associated with a particular node.
 This may be helpful for debugging and logging code.

 @item int yyisa(ANY *node, type)
 @cindex yyisa macro
 Determines if @samp{node} is an instance of the node type @samp{type}.

 @item char *yygetfilename(ANY *node)
 @cindex yygetfilename macro
 Gets the filename corresponding to where @samp{node} was created
 during parsing.  This macro is only generated if @samp{%option track_lines}
 was specified.

 @item long yygetlinenum(ANY *node)
 @cindex yygetlinenum macro
 Gets the line number corresponding to where @samp{node} was created
 during parsing.  This macro is only generated if @samp{%option track_lines}
 was specified.

 @item void yysetfilename(ANY *node, char *value)
 @cindex yysetfilename macro
 Sets the filename associated with @samp{node} to @samp{value}.  The
 string is not copied, so @samp{value} must persist for the lifetime
 of the node.  This macro will rarely be required, unless a node
 corresponds to a different line than the current parse line.  This
 macro is only generated if @samp{%option track_lines} was specified.

 @item void yysetlinenum(ANY *node, long value)
 @cindex yysetlinenum macro
 Sets the line number associated with @samp{node} to @samp{value}.
 This macro will rarely be required, unless a node corresponds to a
 different line than the current parse line.  This macro is only
 generated if @samp{%option track_lines} was specified.

 @item char *yycurrfilename([YYNODESTATE *state])
 @cindex yycurrfilename function
 Get the name of the current input file from the parser.  The pointer
 that is returned from this function is stored as-is: the string is
 not copied.  Therefore, the value must persist for at least as long
 as the node will persist.  This function must be supplied by the programmer
 if @samp{%option track_lines} was specified.

 @item long yycurrlinenum([YYNODESTATE *state])
 @cindex yycurrlinenum function
 Get the number of the current input line from the parser.  This
 function must be supplied by the programmer if @samp{%option track_lines}
 was specified.

 @item void yynodeinit([YYNODESTATE *state])
 @cindex yynodeinit function
 Initializes the node memory manager.  If the system is reentrant, then
 the node memory manager is @samp{state}.  Otherwise a global node
 memory manager is used.

 @item void *yynodealloc([YYNODESTATE *state,] unsigned int size)
 @cindex yynodealloc function
 Allocates a block of memory of @samp{size} bytes in size from the
 node memory manager.  This function is called automatically from
 the node-specific @samp{*_create} functions.  The programmer will
 not normally need to call this function.

 This function will return @code{NULL} if the system is out of
 memory, or if @samp{size} is too large to be allocated within
 the node memory manager.  If the system is out of memory, then
 @samp{yynodealloc} will call @samp{yynodefailed} prior to
 returning @code{NULL}.

 @item int yynodepush([YYNODESTATE *state])
 @cindex yynodepush function
 Pushes the current node memory manager position.  The next time
 @code{yynodepop} is called, the node memory manager will reset to
 the pushed position.  This function returns zero if the system
 is out of memory.

 @item void yynodepop([YYNODESTATE *state])
 @cindex yynodepop function
 Pops the current node memory manager position.  This function has
 no effect if @code{yynodepush} was not called previously.

 The @code{yynodepush} and @code{yynodepop} functions can be used
 to perform a simple kind of garbage collection on nodes.  When
 the parser enters a scope, it pushes the node memory manager
 position.  After all definitions in the scope have been dealt
 with, the parser pops the node memory manager to reclaim all
 of the memory used.

 @item void yynodeclear([YYNODESTATE *state])
 @cindex yynodeclear function
 Clears the entire node memory manager and returns it to the
 state it had after calling @code{yynodeinit}.  This is typically
 used upon program shutdown to free all remaining node memory.

 @item void yynodefailed([YYNODESTATE *state])
 @cindex yynodefailed function
 Called when @code{yynodealloc} or @code{yynodepush} detects that
 the system is out of memory.  This function must be supplied by
 the programmer.  The programmer may choose to exit to program
 when the system is out of memory; in which case @code{yynodealloc}
 will never return @code{NULL}.
 @end table

 @c -----------------------------------------------------------------------

 @node C++ Language, Java Language, C Language, Output APIs
 @section C++ Language APIs
 @cindex C++ APIs

 In the C++ output language, each node type is converted into a @samp{class}
 that contains the node's fields, virtual operations, and other house-keeping
 definitions.  The following example demonstrates how treecc node declarations
 are converted into C++ source code:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 %node binary expression %abstract =
 @{
     expression *expr1;
     expression *expr2;
 @}
 %node plus binary
 @end example

 becomes:

 @example
 class expression
 @{
 protected:

     int kind__;
     char *filename__;
     long linenum__;

 public:

     int getKind() const @{ return kind__; @}
     const char *getFilename() const @{ return filename__; @}
     int getLinenum() const @{ return linenum__; @}
     void setFilename(char *filename) @{ filename__ = filename; @}
     void setLinenum(long linenum) @{ linenum__ = linenum; @}

     void *operator new(size_t);
     void operator delete(void *, size_t);

 protected:

     expression();

 public:

     type_code type;

     virtual int isA(int kind) const;
     virtual const char *getKindName() const;

 protected:

     virtual ~expression();

 @};

 class binary : public expression
 @{
 protected:

     binary(expression * expr1, expression * expr2);

 public:

     expression * expr1;
     expression * expr2;

     virtual int isA(int kind) const;
     virtual const char *getKindName() const;

 protected:

     virtual ~binary();

 @};

 class plus : public binary
 @{
 public:

     plus(expression * expr1, expression * expr2);

 public:

     virtual int isA(int kind) const;
     virtual const char *getKindName() const;

 protected:

     virtual ~plus();

 @};
 @end example

 The following standard methods are available on every node type:

 @table @code
 @item int getKind()
 @cindex getKind method (C++)
 Gets the numeric kind value associated with a particular node.
 The kind value for node type @samp{NAME} is called @samp{NAME_kind}.

 @item virtual const char *getKindName()
 @cindex getKindName method (C++)
 Gets the name of the node kind associated with a particular node.
 This may be helpful for debugging and logging code.

 @item virtual int isA(int kind)
 @cindex isA method (C++)
 Determines if the node is a member of the node type that corresponds
 to the numeric kind value @samp{kind}.

 @item const char *getFilename()
 @cindex getFilename method (C++)
 Gets the filename corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item long getLinenum()
 @cindex getLinenum method (C++)
 Gets the line number corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item void setFilename(char *value)
 @cindex setFilename method (C++)
 Sets the filename associated with the node to @samp{value}.  The
 string is not copied, so @samp{value} must persist for the lifetime
 of the node.  This method will rarely be required, unless a node
 corresponds to a different line than the current parse line.  This
 method is only generated if @samp{%option track_lines} was specified.

 @item void setLinenum(long value)
 @cindex setLinenum method (C++)
 Sets the line number associated with the node to @samp{value}.
 This method will rarely be required, unless a node corresponds to a
 different line than the current parse line.  This method is only
 generated if @samp{%option track_lines} was specified.
 @end table

 If the generated code is non-reentrant, then the constructor for the
 class can be used to construct nodes of the specified node type.  The
 constructor parameters are the same as the fields within the node type's
 definition, except for @samp{%nocreate} fields.

 If the generated code is reentrant, then nodes cannot be constructed
 using the C++ @samp{new} operator.  The @samp{*Create} methods
 on the @samp{YYNODESTATE} factory class must be used instead.

 The @samp{YYNODESTATE} class contains a number of house-keeping methods
 that are used to manage nodes:

 @table @code
 @item static YYNODESTATE *getState()
 @cindex getState method (C++)
 Gets the global @samp{YYNODESTATE} instance that is being used by
 non-reentrant code.  If an instance has not yet been created,
 this method will create one.

 When using non-reentrant code, the programmer will normally subclass
 @samp{YYNODESTATE}, override some of the methods below, and then
 construct an instance of the subclass.  This constructed instance
 will then be returned by future calls to @samp{getState}.

 @item void *alloc(size_t size)
 @cindex alloc method (C++)
 Allocates a block of memory of @samp{size} bytes in size from the
 node memory manager.  This function is called automatically from
 the node-specific constructors and @samp{*Create} methods.  The programmer
 will not normally need to call this function.

 This function will return @code{NULL} if the system is out of
 memory, or if @samp{size} is too large to be allocated within
 the node memory manager.  If the system is out of memory, then
 @samp{alloc} will call @samp{failed} prior to returning @code{NULL}.

 @item int push()
 @cindex push method (C++)
 Pushes the current node memory manager position.  The next time
 @code{pop} is called, the node memory manager will reset to
 the pushed position.  This function returns zero if the system
 is out of memory.

 @item void pop()
 @cindex pop method (C++)
 Pops the current node memory manager position.  This function has
 no effect if @code{push} was not called previously.

 The @code{push} and @code{pop} methods can be used
 to perform a simple kind of garbage collection on nodes.  When
 the parser enters a scope, it pushes the node memory manager
 position.  After all definitions in the scope have been dealt
 with, the parser pops the node memory manager to reclaim all
 of the memory used.

 @item void clear()
 @cindex clear method (C++)
 Clears the entire node memory manager and returns it to the
 state it had after construction.

 @item virtual void failed()
 @cindex failed method (C++)
 Called when @code{alloc} or @code{push} detects that
 the system is out of memory.  This method is typically
 overridden by the programmer in subclasses.  The programmer may
 choose to exit to program when the system is out of memory; in
 which case @code{alloc} will never return @code{NULL}.

 @item virtual char *currFilename()
 @cindex currFilename method (C++)
 Get the name of the current input file from the parser.  The pointer
 that is returned from this function is stored as-is: the string is
 not copied.  Therefore, the value must persist for at least as long
 as the node will persist.  This method is usually overrriden by
 the programmer in subclasses if @samp{%option track_lines} was specified.

 @item virtual long currLinenum()
 @cindex currLinenum method (C++)
 Get the number of the current input line from the parser.  This
 method is usually overridden by the programmer in subclasses
 if @samp{%option track_lines} was specified.
 @end table

 The programmer will typically subclass @samp{YYNODESTATE} to provide
 additional functionality, and then create an instance of this class
 to act as the node memory manager and node creation factory.

 @c -----------------------------------------------------------------------

 @node Java Language, C# Language, C++ Language, Output APIs
 @section Java Language APIs
 @cindex Java APIs

 In the Java output language, each node type is converted into a @samp{class}
 that contains the node's fields, virtual operations, and other house-keeping
 definitions.  The following example demonstrates how treecc node declarations
 are converted into Java source code:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 %node binary expression %abstract =
 @{
     expression expr1;
     expression expr2;
 @}
 %node plus binary
 @end example

 becomes:

 @example
 public class expression
 @{
     protected int kind__;
     protected String filename__;
     protected long linenum__;

     public int getKind() @{ return kind__; @}
     public String getFilename() @{ return filename__; @}
     public long getLinenum() const @{ return linenum__; @}
     public void setFilename(String filename) @{ filename__ = filename; @}
     public void setLinenum(long linenum) @{ linenum__ = linenum; @}

     public static final int KIND = 1;

     public type_code type;

     protected expression()
     @{
         this.kind__ = KIND;
         this.filename__ = YYNODESTATE.getState().currFilename();
         this.linenum__ = YYNODESTATE.getState().currLinenum();
     @}

     public int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return 0;
     @}

     public String getKindName()
     @{
         return "expression";
     @}
 @}

 public class binary extends expression
 @{
     public static final int KIND = 2;

     public expression expr1;
     public expression expr2;

     protected binary(expression expr1, expression expr2)
     @{
         super();
         this.kind__ = KIND;
         this.expr1 = expr1;
         this.expr2 = expr2;
     @}

     public int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return super.isA(kind);
     @}

     public String getKindName()
     @{
         return "binary";
     @}
 @}

 public class plus extends binary
 @{
     public static final int KIND = 3;

     public plus(expression expr1, expression expr2)
     @{
         super(expr1, expr2);
         this.kind__ = KIND;
     @}

     public int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return super.isA(kind);
     @}

     public String getKindName()
     @{
         return "plus";
     @}
 @}
 @end example

 The following standard members are available on every node type:

 @table @code
 @item int KIND
 @cindex KIND field (Java)
 The kind value for the node type corresponding to this class.

 @item int getKind()
 @cindex getKind method (Java)
 Gets the numeric kind value associated with a particular node.
 The kind value for node type @samp{NAME} is called @samp{NAME.KIND}.

 @item String getKindName()
 @cindex getKindName method (Java)
 Gets the name of the node kind associated with a particular node.
 This may be helpful for debugging and logging code.

 @item int isA(int kind)
 @cindex isA method (Java)
 Determines if the node is a member of the node type that corresponds
 to the numeric kind value @samp{kind}.

 @item String getFilename()
 @cindex getFilename method (Java)
 Gets the filename corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item long getLinenum()
 @cindex getLinenum method (Java)
 Gets the line number corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item void setFilename(String value)
 @cindex setFilename method (Java)
 Sets the filename associated with the node to @samp{value}.
 This method will rarely be required, unless a node corresponds to
 a different line than the current parse line.  This method is only
 generated if @samp{%option track_lines} was specified.

 @item void setLinenum(long value)
 @cindex setLinenum method (Java)
 Sets the line number associated with the node to @samp{value}.
 This method will rarely be required, unless a node corresponds to a
 different line than the current parse line.  This method is only
 generated if @samp{%option track_lines} was specified.
 @end table

 If the generated code is non-reentrant, then the constructor for the
 class can be used to construct nodes of the specified node type.  The
 constructor parameters are the same as the fields within the node type's
 definition, except for @samp{%nocreate} fields.

 If the generated code is reentrant, then nodes cannot be constructed
 using the Java @samp{new} operator.  The @samp{*Create} methods
 on the @samp{YYNODESTATE} factory class must be used instead.

 Enumerated types are converted into a Java @samp{class}:

 @example
 %enum JavaType =
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF
 @}
 @end example

 becomes:

 @example
 public class JavaType
 @{
     public static final int JT_BYTE = 0;
     public static final int JT_SHORT = 1;
     public static final int JT_CHAR = 2;
     public static final int JT_INT = 3;
     public static final int JT_LONG = 4;
     public static final int JT_FLOAT = 5;
     public static final int JT_DOUBLE = 6;
     public static final int JT_OBJECT_REF = 7;
 @}
 @end example

 References to enumerated types in fields and operation parameters
 are replaced with the type @samp{int}.

 Virtual operations are converted into public methods on the Java
 node classes.

 Non-virtual operations are converted into a static method within
 a class named for the operation.  For example,

 @example
 %operation void InferType::infer_type(expression e)
 @end example

 becomes:

 @example
 public class InferType
 @{
     public static void infer_type(expression e)
     @{
         ...
     @}
 @}
 @end example

 If the class name (@samp{InferType} in the above example) is omitted,
 then the name of the operation is used as both the class name and the
 the method name.

 The @samp{YYNODESTATE} class contains a number of house-keeping methods
 that are used to manage nodes:

 @table @code
 @item static YYNODESTATE getState()
 @cindex getState method (Java)
 Gets the global @samp{YYNODESTATE} instance that is being used by
 non-reentrant code.  If an instance has not yet been created,
 this method will create one.

 When using non-reentrant code, the programmer will normally subclass
 @samp{YYNODESTATE}, override some of the methods below, and then
 construct an instance of the subclass.  This constructed instance
 will then be returned by future calls to @samp{getState}.

 This method will not be present if a reentrant system is being
 generated.

 @item String currFilename()
 @cindex currFilename method (Java)
 Get the name of the current input file from the parser.  This method
 is usually overrriden by the programmer in subclasses if
 @samp{%option track_lines} was specified.

 @item long currLinenum()
 @cindex currLinenum method (Java)
 Get the number of the current input line from the parser.  This
 method is usually overridden by the programmer in subclasses
 if @samp{%option track_lines} was specified.
 @end table

 The programmer will typically subclass @samp{YYNODESTATE} to provide
 additional functionality, and then create an instance of this class
 to act as the global state and node creation factory.

 @c -----------------------------------------------------------------------

 @node C# Language, Ruby Language, Java Language, Output APIs
 @section C# Language APIs
 @cindex C# APIs

 In the C# output language, each node type is converted into a @samp{class}
 that contains the node's fields, virtual operations, and other house-keeping
 definitions.  The following example demonstrates how treecc node declarations
 are converted into C# source code:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 %node binary expression %abstract =
 @{
     expression expr1;
     expression expr2;
 @}
 %node plus binary
 @end example

 becomes:

 @example
 public class expression
 @{
     protected int kind__;
     protected String filename__;
     protected long linenum__;

     public int getKind() @{ return kind__; @}
     public String getFilename() @{ return filename__; @}
     public long getLinenum() const @{ return linenum__; @}
     public void setFilename(String filename) @{ filename__ = filename; @}
     public void setLinenum(long linenum) @{ linenum__ = linenum; @}

     public const int KIND = 1;

     public type_code type;

     protected expression()
     @{
         this.kind__ = KIND;
         this.filename__ = YYNODESTATE.getState().currFilename();
         this.linenum__ = YYNODESTATE.getState().currLinenum();
     @}

     public virtual int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return 0;
     @}

     public virtual String getKindName()
     @{
         return "expression";
     @}
 @}

 public class binary : expression
 @{
     public const int KIND = 2;

     public expression expr1;
     public expression expr2;

     protected binary(expression expr1, expression expr2)
         : expression()
     @{
         this.kind__ = KIND;
         this.expr1 = expr1;
         this.expr2 = expr2;
     @}

     public override int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return base.isA(kind);
     @}

     public override String getKindName()
     @{
         return "binary";
     @}
 @}

 public class plus : binary
 @{
     public const int KIND = 5;

     public plus(expression expr1, expression expr2)
         : binary(expr1, expr2)
     @{
         this.kind__ = KIND;
     @}

     public override int isA(int kind)
     @{
         if(kind == KIND)
             return 1;
         else
             return base.isA(kind);
     @}

     public override String getKindName()
     @{
         return "plus";
     @}
 @}
 @end example

 The following standard members are available on every node type:

 @table @code
 @item const int KIND
 @cindex KIND field (C#)
 The kind value for the node type corresponding to this class.

 @item int getKind()
 @cindex getKind method (C#)
 Gets the numeric kind value associated with a particular node.
 The kind value for node type @samp{NAME} is called @samp{NAME.KIND}.

 @item virtual String getKindName()
 @cindex getKindName method (C#)
 Gets the name of the node kind associated with a particular node.
 This may be helpful for debugging and logging code.

 @item virtual int isA(int kind)
 @cindex isA method (C#)
 Determines if the node is a member of the node type that corresponds
 to the numeric kind value @samp{kind}.

 @item String getFilename()
 @cindex getFilename method (C#)
 Gets the filename corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item long getLinenum()
 @cindex getLinenum method (C#)
 Gets the line number corresponding to where the node was created
 during parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @item void setFilename(String value)
 @cindex setFilename method (C#)
 Sets the filename associated with the node to @samp{value}.
 This method will rarely be required, unless a node corresponds to
 a different line than the current parse line.  This method is only
 generated if @samp{%option track_lines} was specified.

 @item void setLinenum(long value)
 @cindex setLinenum method (C#)
 Sets the line number associated with the node to @samp{value}.
 This method will rarely be required, unless a node corresponds to a
 different line than the current parse line.  This method is only
 generated if @samp{%option track_lines} was specified.
 @end table

 If the generated code is non-reentrant, then the constructor for the
 class can be used to construct nodes of the specified node type.  The
 constructor parameters are the same as the fields within the node type's
 definition, except for @samp{%nocreate} fields.

 If the generated code is reentrant, then nodes cannot be constructed
 using the C# @samp{new} operator.  The @samp{*Create} methods
 on the @samp{YYNODESTATE} factory class must be used instead.

 Enumerated types are converted into a C# @samp{enum} definition:

 @example
 %enum JavaType =
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF
 @}
 @end example

 becomes:

 @example
 public enum JavaType
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF,
 @}
 @end example

 Virtual operations are converted into public virtual methods on the C#
 node classes.

 Non-virtual operations are converted into a static method within
 a class named for the operation.  For example,

 @example
 %operation void InferType::infer_type(expression e)
 @end example

 becomes:

 @example
 public class InferType
 @{
     public static void infer_type(expression e)
     @{
         ...
     @}
 @}
 @end example

 If the class name (@samp{InferType} in the above example) is omitted,
 then the name of the operation is used as both the class name and the
 the method name.

 The @samp{YYNODESTATE} class contains a number of house-keeping methods
 that are used to manage nodes:

 @table @code
 @item static YYNODESTATE getState()
 @cindex getState method (C#)
 Gets the global @samp{YYNODESTATE} instance that is being used by
 non-reentrant code.  If an instance has not yet been created,
 this method will create one.

 When using non-reentrant code, the programmer will normally subclass
 @samp{YYNODESTATE}, override some of the methods below, and then
 construct an instance of the subclass.  This constructed instance
 will then be returned by future calls to @samp{getState}.

 This method will not be present if a reentrant system is being
 generated.

 @item virtual String currFilename()
 @cindex currFilename method (C#)
 Get the name of the current input file from the parser.  This method
 is usually overrriden by the programmer in subclasses if
 @samp{%option track_lines} was specified.

 @item virtual long currLinenum()
 @cindex currLinenum method (C#)
 Get the number of the current input line from the parser.  This
 method is usually overridden by the programmer in subclasses
 if @samp{%option track_lines} was specified.
 @end table

 The programmer will typically subclass @samp{YYNODESTATE} to provide
 additional functionality, and then create an instance of this class
 to act as the global state and node creation factory.

 @c -----------------------------------------------------------------------

 @node Ruby Language, Full Expression Example, C# Language, Output APIs
 @section Ruby Language APIs
 @cindex Ruby APIs

 In the Ruby output language, each node type is converted into a
 @samp{class} that contains the node's fields, operations, and other
 house-keeping definitions.  The following example demonstrates how
 treecc node declarations are converted into Ruby source code:

 @example
 %node expression %abstract %typedef =
 @{
     %nocreate type_code type;
 @}
 %node binary expression %abstract =
 @{
     expression expr1;
     expression expr2;
 @}
 %node plus binary
 @end example

 becomes:

 @example
 class YYNODESTATE

   @@@@state = nil

   def YYNODESTATE.state
     return @@@@state unless @@@@state.nil?
     @@@@state = YYNODESTATE.new()
     return @@@@state
   end

   def intialize
      @@@@state = self
    end

   def currFilename
      return nil
   end

   def currLinenum
      return 0
   end
 end

 class Expression
   protected
   attr_reader :kind
   public

   attr_accessor :Linenum, :Filename

   attr_accessor :type

   KIND = 1

   def initialize()
     @@kind = KIND
     @@Filename = YYNODESTATE.state.currFilename()
     @@Linenum = YYNODESTATE.state.currLinenum()
   end

   def isA(kind)
     if(@@kind == KIND) then
       return true
     else
       return 0
     end
   end

   def KindName
     return "Expression"
   end

 end

 class Binary < Expression

   attr_accessor :expr1
   attr_accessor :expr2

   KIND = 2

   def initialize(expr1, expr2)
   super(expr1, expr2)
     @@kind = KIND
     self.expr1 = expr1
     self.expr2 = expr2
   end

   def isA(kind)
     if(@@kind == Kind) then
       return true
     else
       return super(kind)
     end
   end

   def KindName
     return "Binary"
   end

 end

 class Plus < Binary

   KIND = 3

   def initialize(expr1, expr2)
   super(expr1, expr2expr1, expr2)
     @@kind = KIND
   end

   def isA(kind)
     if(@@kind == KIND) then
       return true
     else
       return super(kind)
     end
   end

   def KindName
     return "Plus"
   end

 end
 @end example

 The following standard members are available on every node type:

 @table @code
 @item KIND
 @cindex KIND field (Ruby)
 The kind value for the node type corresponding to this class.
 The kind value for node type @samp{NAME} is called @samp{NAME::KIND}.

 @item KindName
 @cindex KindName field (Ruby)
 The name of the node kind associated with a particular node.  This may
 be helpful for debugging and logging code.

 @item isA(int kind)
 @cindex isA method (Ruby)
 Determines if the node is a member of the node type that corresponds
 to the numeric kind value @samp{kind}.

 @item Filename
 @cindex Filename field (Ruby)
 The filename corresponding to where the node was created during parsing.
 This method is only generated if @samp{%option track_lines} was
 specified.

 @item Linenum()
 @cindex Linenum field (Ruby)
 The line number corresponding to where the node was created during
 parsing.  This method is only generated if @samp{%option track_lines}
 was specified.

 @end table

 @c Don't know if this is true for ruby
 @ignore
 If the generated code is non-reentrant, then the constructor for the
 class can be used to construct nodes of the specified node type.  The
 constructor parameters are the same as the fields within the node type's
 definition, except for @samp{%nocreate} fields.

 If the generated code is reentrant, then nodes cannot be constructed
 using the C# @samp{new} operator.  The @samp{*Create} methods
 on the @samp{YYNODESTATE} factory class must be used instead.
 @end ignore

 Enumerated types are converted into a Ruby @samp{class} definition:

 @example
 %enum JavaType =
 @{
     JT_BYTE,
     JT_SHORT,
     JT_CHAR,
     JT_INT,
     JT_LONG,
     JT_FLOAT,
     JT_DOUBLE,
     JT_OBJECT_REF
 @}
 @end example

 becomes:

 @example

 class JavaType
   JT_BYTE = 0
   JT_SHORT = 1
   JT_CHAR = 2
   JT_INT = 3
   JT_LONG = 4
   JT_FLOAT = 5
   JT_DOUBLE = 6
   JT_OBJECT_REF = 7
 end

 @end example

 @c
 Virtual operations are converted into public methods on the Ruby
 node classes.

 Non-virtual operations are converted into a class method within
 a class named for the operation.  For example,

 @example
 %operation void InferType::infer_type(expression e)
 @end example

 becomes:

 @example
 class InferType
     def InferType.infer_type(expression e)
         ...
     end
 end
 @end example

 If the class name (@samp{InferType} in the above example) is omitted,
 then the name of the operation is used as both the class name and the
 the method name. You then get a method with a name starting with an
 uppercase letter. However, Ruby methods start with lowercase methods.
 So never forget the class name.

 The @samp{YYNODESTATE} class contains a number of house-keeping methods
 that are used to manage nodes:

 @table @code
 @item YYNODESTATE::state()
 @cindex state field (Ruby)
 Gets the global @samp{YYNODESTATE} instance that is being used by
 non-reentrant code.  If an instance has not yet been created,
 this method will create one.

 @c Don't know if the following is correct
 @ignore

 When using non-reentrant code, the programmer will normally subclass
 @samp{YYNODESTATE}, override some of the methods below, and then
 construct an instance of the subclass.  This constructed instance
 will then be returned by future calls to @samp{getState}.

 This method will not be present if a reentrant system is being
 generated.

 @end ignore

 @item currFilename
 @cindex currFilename field (Ruby)
 The name of the current input file from the parser.  This fields
 accessor method is usually overrriden by the programmer in subclasses if
 @samp{%option track_lines} was specified.

 @item currLinenum
 @cindex currLinenum field (Ruby)
 The number of the current input line from the parser.  This fields
 accessor method is usually overridden by the programmer in subclasses if
 @samp{%option track_lines} was specified.
 @end table

 The programmer will typically subclass @samp{YYNODESTATE} to provide
 additional functionality, and then create an instance of this class
 to act as the global state and node creation factory.

 @c -----------------------------------------------------------------------

 @node Full Expression Example, EBNF Syntax, Ruby Language, Top
 @appendix Full expression example code
 @cindex Full expression example

 The full treecc input file for the expression example is as follows:

 @example
 %enum type_code =
 @{
     int_type,
     float_type
 @}

 %node expression %abstract %typedef =
 @{
     %nocreate type_code type = @{int_type@};
 @}

 %node binary expression %abstract =
 @{
     expression *expr1;
     expression *expr2;
 @}

 %node unary expression %abstract =
 @{
     expression *expr;
 @}

 %node intnum expression =
 @{
     int num;
 @}

 %node floatnum expression =
 @{
     float num;
 @}

 %node plus binary
 %node minus binary
 %node multiply binary
 %node divide binary
 %node power binary
 %node negate unary

 %operation void infer_type(expression *e)

 infer_type(binary)
 @{
     infer_type(e->expr1);
     infer_type(e->expr2);

     if(e->expr1->type == float_type || e->expr2->type == float_type)
     @{
         e->type = float_type;
     @}
     else
     @{
         e->type = int_type;
     @}
 @}

 infer_type(unary)
 @{
     infer_type(e->expr);
     e->type = e->expr->type;
 @}

 infer_type(intnum)
 @{
     e->type = int_type;
 @}

 infer_type(floatnum)
 @{
     e->type = float_type;
 @}

 infer_type(power)
 @{
     infer_type(e->expr1);
     infer_type(e->expr2);

     if(e->expr2->type != int_type)
     @{
         error("second argument to `^' is not an integer");
     @}

     e->type = e->expr1->type;
 @}
 @end example

 The full yacc grammar is as follows:

 @example
 %union @{
     expression *node;
     int         inum;
     float       fnum;
 @}

 %token INT FLOAT

 %type <node> expr
 %type <inum> INT
 %type <fnum> FLOAT

 %%

 expr: INT               @{ $$ = intnum_create($1); @}
     | FLOAT             @{ $$ = floatnum_create($1); @}
     | '(' expr ')'      @{ $$ = $2; @}
     | expr '+' expr     @{ $$ = plus_create($1, $3); @}
     | expr '-' expr     @{ $$ = minus_create($1, $3); @}
     | expr '*' expr     @{ $$ = multiply_create($1, $3); @}
     | expr '/' expr     @{ $$ = divide_create($1, $3); @}
     | expr '^' expr     @{ $$ = power_create($1, $3); @}
     | '-' expr          @{ $$ = negate_create($2); @}
     ;
 @end example

 @c -----------------------------------------------------------------------

 @node EBNF Syntax, Index, Full Expression Example, Top
 @appendix EBNF syntax for treecc input files
 @cindex EBNF syntax

 The EBNF syntax for treecc input files uses the following
 lexical tokens:

 @example
 IDENTIFIER ::= <A-Za-z_> @{ <A-Za-z0-9_> @}

 STRING ::= '"' <anything that does not include '"'> '"'
          | "'" <anything that does not include "'"> "'"

 LITERAL_DEFNS ::= "%@{" <anything except "%@}"> "%@}"

 LITERAL_END ::= "%%" <any character sequence until EOF>

 LITERAL_CODE ::= '@{' <anything with matched '@{' and '@}'> '@}'
 @end example

 In addition, anything that begins with "%" in the following syntax
 is a lexical keyword.

 The EBNF syntax is as follows:

 @example
 File ::= @{ Declaration @}

 Declaration ::= Node
               | Operation
               | OperationCase
               | Option
               | Enum
               | Literal
               | Header
               | Output
               | Common
               | Include

 Node ::= %node IDENTIFIER [ IDENTIFIER ] @{ NodeFlag @} [ '=' Fields ]

 NodeFlag ::= %abstract | %typedef

 Fields ::= '@{' @{ Field @} '@}'

 Field ::= [ %nocreate ] TypeAndName [ '=' LITERAL_CODE ] ';'

 TypeAndName ::= Type [ IDENTIFIER ]

 Type ::= TypeName
        | Type '*'
        | Type '&'
        | Type '[' ']'

 TypeName ::= IDENTIFIER @{ IDENTIFIER @}

 Operation ::= %operation @{ OperFlag @} Type
                    [ ClassName ] IDENTIFIER '(' [ Params ] ')'
                    [ '=' LITERAL_CODE ] [ ';' ]

 OperFlag ::= %virtual | %inline | %split

 ClassName ::= IDENTIFIER "::"

 Params ::= Param @{ ',' Param @}

 Param ::= TypeAndName | '[' TypeAndName ']'

 OperationCase ::= OperationHead @{ ',' OperationHead @} LITERAL_CODE

 OperationHead ::= IDENTIFIER '(' [ TypeList ] ')'

 TypeList ::= IDENTIFIER @{ ',' IDENTIFIER @}

 Option ::= %option IDENTIFIER [ '=' Value ]

 Value ::= IDENTIFIER | STRING

 Enum ::= %enum IDENTIFIER '=' '@{' EnumBody [ ',' ] '@}'

 EnumBody ::= IDENTIFIER @{ ',' IDENTIFIER @}

 Literal ::= @{ LiteralFlag @} (LITERAL_DEFNS | LITERAL_END)

 LiteralFlag ::= %both | %decls | %end

 Header ::= %header STRING

 Output ::= %output STRING

 Common ::= %common

 Include ::= %include [ %readonly ] STRING

 @end example

 @c -----------------------------------------------------------------------

 @page

 @node Index, , EBNF Syntax, Top
 @unnumbered Index

 @printindex cp

 @contents
 @bye