The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=head1 NAME

Parse::Eyapp::treematchingtut - Tree Matching and Tree substitution: an introduction

=head1 TREE MATCHING AND TREE SUBSTITUTION

Most of the examples in this section can be found in the directory
C<examples/MatchingTrees> that comes with the distribution of L<Parse::Eyapp>.

=head2 Matching Trees

Both the transformation objects in C<Parse::Eyapp::YATW>
and the nodes in C<Parse::Eyapp::Node> have a method 
named C<m> for matching. 
For a C<Parse::Eyapp::YATW> object, the method -when called
in a list context- returns a list of 
C<Parse::Eyapp::Node::Match> nodes. 

                    @R = $t->m($yatw1, $yatw2, $yatw3, ...)

A C<Parse::Eyapp::Node::Match> 
object describes 
the nodes of the actual tree that have matched.
The nodes in the returned list are organized in a hierarchy.
They appear in the list 
sorted according to a depth-first visit of the actual tree C<$t>.
In a scalar context C<m> returns the first element of
the list.

Let us denote by C<$t> the actual tree being searched
and C<$r> one of the C<Parse::Eyapp::Node::Match>
nodes in the resulting forest C<@R>.
Then we have the following methods: 

=over 

=item *
The method C<$r-E<gt>node> return the node C<$t> of the actual 
tree that matched

=item *
The method C<$r-E<gt>father> returns the father of C<$r>
in the matching forest.
The father of C<$r> is defined by this property:
C<$r-E<gt>father-E<gt>node> is the nearest ancestor of
C<$r-E<gt>node> that matched with the treeregexp pattern.
That is, there is no ancestor that matched between
C<$r-E<gt>node> and C<$r-E<gt>father-E<gt>node>.
Otherwise C<$r-E<gt>father> is C<undef>

=item *

The method C<$r-E<gt>coord> returns the coordinates of C<$r-E<gt>node> 
relative to C<$t>.
For example, the coordinate C<".1.3.2"> 
denotes the node C<$t-E<gt>child(1)-E<gt>child(3)-E<gt>child(2)>, where C<$t>
is the root of the search.

=item *

The method C<$r-E<gt>depth> returns the depth of C<$r-E<gt>node> 
in C<$t>.

=item * 

When C<m> was called as a C<Parse::Eyapp::Node> method, i. e. 
with potentially more than one C<YATW> treeregexp, the method C<$r-E<gt>names>
returns the array of names of the transformations that matched with
C<$r-E<gt>node>.

=back

=head3 Use of C<m> as a L<Parse::Eyapp::Node> Method

The example in C<examples/MatchingTrees/m2.pl> shows the use of C<m> as
a C<Parse::Eyapp::Node> method.

  examples/MatchingTrees$ cat -n m2.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use Rule6;
     4  use Parse::Eyapp::Treeregexp;
     5
     6  Parse::Eyapp::Treeregexp->new( STRING => q{
     7    fold: /TIMES|PLUS|DIV|MINUS/(NUM, NUM)
     8    zxw: TIMES(NUM($x), .) and { $x->{attr} == 0 }
     9    wxz: TIMES(., NUM($x)) and { $x->{attr} == 0 }
    10  })->generate();
    11
    12  # Syntax analysis
    13  my $parser = new Rule6();
    14  my $input = "0*0*0";
    15  my $t = $parser->Run(\$input);
    16  print "Tree:",$t->str,"\n";
    17
    18  # Search
    19  my $m = $t->m(our ($fold, $zxw, $wxz));
    20  print "Match Node:\n",$m->str,"\n";


When executed with input C<0*0*0> the program generates this output:

  examples/MatchingTrees$ m2.pl
  Tree:TIMES(TIMES(NUM(TERMINAL),NUM(TERMINAL)),NUM(TERMINAL))
  Match Node:
  Match[[TIMES:0:wxz]](Match[[TIMES:1:fold,zxw,wxz]])

The representation of C<Match> nodes by C<str> deserves a comment.
C<Match> nodes have their own C<info> method. It returns a string
containing the concatenation of the class of C<$r-E<gt>node> 
(i.e. the actual node that matched), the depth
(C<$r-E<gt>depth>) and the names of the transformations
that matched (as provided by the method C<$r-E<gt>names>) 

=head3 Use of C<m> as a L<Parse::Eyapp::YATW> Method

A second example can be found 
inside the file C<examples/typechecking/Simple-Types-XXX.tar.gz>.
It illustrates a use of C<m> as 
a C<Parse::Eyapp:YATW> method.
It solves a problem of scope analysis in a C compiler:
matching each C<RETURN> statement with the function
that surrounds it. The parsing was already done, the 
AST was built and left in C<$t>. The treeregexp used 
(see C<lib/Simple/Trans.trg>) is:

  retscope: /FUNCTION|RETURN/

and the code that solves the problem (see
subroutine C<compile> in file C<lib/Simple/Types.eyp>
is:

 # Associate each "return exp" with its "function"
 my @returns = $retscope->m($t); 
 for (@returns) {
   my $node = $_->node;
   if (ref($node) eq 'RETURN') {
     my $function = $_->father->node; 
     $node->{function}  = $function;  
     $node->{t} = $function->{t};
   }
 }

The first line gets a list of C<Parse::Eyapp::Node::Match> nodes 
describing  the actual nodes that matched C</FUNCTION|RETURN/>. 
If the node described by C<$_> is a C<'RETURN'> node,
the expresion C< $_-E<gt>father-E<gt>node> must necessarily point
to the function node that encloses it. 

=head2  The C<SEVERITY> option of C<Parse::Eyapp::Treeregexp::new>

The C<SEVERITY> option of C<Parse::Eyapp::Treeregexp::new> controls the
way matching succeeds regarding the number of children.
To illustrate its use let us consider the following example.
The grammar used C<Rule6.yp> is similar
to the example in the section L<Parse::Eyapp::Node/SYNOPSIS>.

  examples/MatchingTrees$ cat -n numchildren.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use Rule6;
     4  use Parse::Eyapp::Treeregexp;
     5
     6  sub TERMINAL::info { $_[0]{attr} }
     7
     8  my $severity = shift || 0;
     9  my $input = shift || '0*2';
    10
    11  my $parser = new Rule6();
    12  my $t = $parser->Run(\$input);
    13
    14  my $transform = Parse::Eyapp::Treeregexp->new(
    15    STRING => q{
    16      zero_times_whatever: TIMES(NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
    17    },
    18    SEVERITY => $severity,
    19    FIRSTLINE => 14,
    20  )->generate;
    21
    22  $t->s(our @all);
    23
    24  print $t->str,"\n";

The program gets the severity level from the command line (line 9).
The specification of the term C<TIMES(NUM($x))> inside the
transformation C<zero_times_whatever> does not
clearly state that C<TIMES> must have two children.
There are several interpretations of the treregexp depending
on the level fixed for C<SEVERITY>:

=over

=item *
0: C<TIMES> must have at least one child. Don't care if it has more.

=item *
1: C<TIMES> must have exactly one child.

=item *
2: C<TIMES> must have exactly one child. When visit a 
C<TIMES> node with a different number of children issue a warning.

=item *
3: C<TIMES> must have exactly one child.  When visit a
C<TIMES> node with a different number of children issue an
error. 

=back

Observe the change in behavior according to the level of C<SEVERITY>:

  pl@nereida:~/LEyapp/examples/MatchingTrees$ numchildren.pl 0 '0*2'
  NUM(TERMINAL[0])
  pl@nereida:~/LEyapp/examples/MatchingTrees$ numchildren.pl 1 '0*2'
  TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
  pl@nereida:~/LEyapp/examples/MatchingTrees$ numchildren.pl 2 '0*2'
  Warning! found node TIMES with 2 children.
  Expected 1 children (see line 15 of ./numchildren.pl)"
  TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
  pl@nereida:~/LEyapp/examples/MatchingTrees$ numchildren.pl 3 '0*2'
  Error! found node TIMES with 2 children.
  Expected 1 children (see line 15 of ./numchildren.pl)"
   at (eval 3) line 29


=head2 Tree Substitution: The C<s> methods

Both C<Parse::Eyapp:Node> and C<Parse::Eyapp::YATW> objects (i.e.
nodes and tree transformations) are provided with a C<s> method.

In the case of a C<Parse::Eyapp::YATW> object the method C<s>
applies the tree transformation using a single bottom-up traversing:
the transformation is recursively applied to the children and 
then to the current node.

For C<Parse::Eyapp:Node> nodes the set of transformations is applied
to each node until no transformation matches any more.
The example in the section L<Parse::Eyapp::Node/SYNOPSIS> illustrates the use:

  1  # Let us transform the tree. Define the tree-regular expressions ..
  2  my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
  3    { #  Example of support code
  4      my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
  5    }
  6    constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM($x), NUM($y))
  7      => {
  8        my $op = $Op{ref($_[0])};
  9        $x->{attr} = eval  "$x->{attr} $op $y->{attr}";
 10        $_[0] = $NUM[0];
 11      }
 12    uminus: UMINUS(NUM($x)) => { $x->{attr} = -$x->{attr}; $_[0] = $NUM }
 13    zero_times_whatever: TIMES(NUM($x), .) and { $x->{attr} == 0 } => { $_[0] = $NUM }
 14    whatever_times_zero: TIMES(., NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
 15    },
 16    OUTPUTFILE=> 'main.pm'
 17  );
 18  $p->generate(); # Create the tranformations
 19 
 20  $t->s($uminus); # Transform UMINUS nodes
 21  $t->s(@all);    # constant folding and mult. by zero

The call at line 20 can be substituted by C<$uminus-E<gt>s($t)>
without changes.


=head1 SEE ALSO

=over

=item * The project home is at L<http://code.google.com/p/parse-eyapp/>.
Use a subversion client to anonymously check out the latest project source code:

   svn checkout http://parse-eyapp.googlecode.com/svn/trunk/ parse-eyapp-read-only

=item * The tutorial I<Parsing Strings and Trees with> C<Parse::Eyapp>
(An Introduction to Compiler Construction in seven pages) in
L<http://nereida.deioc.ull.es/~pl/eyapsimple/> 

=item * 
L<Parse::Eyapp>, 
L<Parse::Eyapp::eyapplanguageref>, 
L<Parse::Eyapp::debuggingtut>,
L<Parse::Eyapp::defaultactionsintro>,
L<Parse::Eyapp::translationschemestut>,
L<Parse::Eyapp::Driver>,
L<Parse::Eyapp::Node>,
L<Parse::Eyapp::YATW>,
L<Parse::Eyapp::Treeregexp>,
L<Parse::Eyapp::Scope>,
L<Parse::Eyapp::Base>,
L<Parse::Eyapp::datagenerationtut>


=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/languageintro.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/debuggingtut.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/eyapplanguageref.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/Treeregexp.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/Node.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/YATW.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/Eyapp.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/Base.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/translationschemestut.pdf> 

=item * The pdf file in L<http://nereida.deioc.ull.es/~pl/perlexamples/treematchingtut.pdf> 

=item *
perldoc L<eyapp>, 

=item *
perldoc L<treereg>,

=item *
perldoc L<vgg>,

=item * The Syntax Highlight file for vim at L<http://www.vim.org/scripts/script.php?script_id=2453>
and L<http://nereida.deioc.ull.es/~vim/>

=item * I<Analisis Lexico y Sintactico>, (Notes for a course in compiler 
construction) by  Casiano Rodriguez-Leon. 
Available at  L<http://nereida.deioc.ull.es/~pl/perlexamples/>
Is the more complete and reliable source for Parse::Eyapp. However is in Spanish.

=item *
L<Parse::Yapp>,

=item *
Man pages of yacc(1) and
bison(1),
L<http://www.delorie.com/gnu/docs/bison/bison.html>

=item *
L<Language::AttributeGrammar>

=item *
L<Parse::RecDescent>.

=item *
L<HOP::Parser>

=item *
L<HOP::Lexer>

=item * ocamlyacc tutorial at 
L<http://plus.kaist.ac.kr/~shoh/ocaml/ocamllex-ocamlyacc/ocamlyacc-tutorial/ocamlyacc-tutorial.html>

=back

=head1 REFERENCES

=over

=item *
The classic Dragon's book I<Compilers: Principles, Techniques, and Tools> 
by Alfred V. Aho, Ravi Sethi and
Jeffrey D. Ullman (Addison-Wesley 1986)

=item *
I<CS2121: The Implementation and Power of Programming Languages>
(See L<http://www.cs.man.ac.uk/~pjj>, L<http://www.cs.man.ac.uk/~pjj/complang/g2lr.html> 
and L<http://www.cs.man.ac.uk/~pjj/cs2121/ho/ho.html>) by 
Pete Jinks

=back


=head1 CONTRIBUTORS

=over 2

=item * Hal Finkel L<http://www.halssoftware.com/> 

=item * G. Williams L<http://kasei.us/>

=item * Thomas L. Shinnick L<http://search.cpan.org/~tshinnic/>

=item * Frank Leray

=back

=head1 AUTHOR
 
Casiano Rodriguez-Leon (casiano@ull.es)

=head1 ACKNOWLEDGMENTS

This work has been supported by CEE (FEDER) and the Spanish Ministry of
I<Educacion y Ciencia> through I<Plan Nacional I+D+I> number TIN2005-08818-C04-04
(ULL::OPLINK project L<http://www.oplink.ull.es/>). 
Support from Gobierno de Canarias was through GC02210601
(I<Grupos Consolidados>).
The University of La Laguna has also supported my work in many ways
and for many years.

A large percentage of  code is verbatim taken from L<Parse::Yapp> 1.05.
The author of L<Parse::Yapp> is Francois Desarmenien.
 
I wish to thank Francois Desarmenien for his L<Parse::Yapp> module, 
to my students at La Laguna and to the Perl Community. Thanks to 
the people who have contributed to improve the module (see L<Parse::Eyapp/CONTRIBUTORS>).
Thanks to Larry Wall for giving us Perl.
Special thanks to Juana.

=head1 LICENCE AND COPYRIGHT
 
Copyright (c) 2006-2008 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.

Parse::Yapp copyright is of Francois Desarmenien, all rights reserved. 1998-2001
 
These modules are free software; you can redistribute it and/or
modify it under the same terms as Perl itself. See L<perlartistic>.
 
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.