NAME

Data::Match - Complex data structure pattern matching

SYNOPSIS

  use Data::Match qw(:all);
  my ($match, $results) = match($structure, $pattern);

  use Data::Match;
  my $obj = new Data::Match;
  my ($match, $results) = $obj->execute($structure, $pattern);

DESCRIPTION

Data::Match provides extensible complex Perl data structure searching and matching.

EXPORT

None are exported by default. :func exports match and matches, :pat exports all the pattern element generators below, :all exports :func and :pat.

PATTERNS

A data pattern is a complex data structure that possibly matches another complex data structure. For example:

  matches([ 1, 2 ], [ 1, 2 ]); # TRUE

  matches([ 1, 2, 3 ], [ 1, ANY, 3 ]); # TRUE

  matches([ 1, 2, 3 ], [ 1, ANY, 2 ]); # FALSE: 3 != 2

ANY matches anything, including an undefined value.

  my $results = matches([ 1, 2, 1 ], [ BIND('x'), ANY, BIND('x') ]); # TRUE

BIND($name) matches anything and remembers each match and its position with every BIND($name) in $result-{'BIND'}{$name}>. If BIND($name) is not the same as the first value bound to BIND($name) it does not match. For example:

  my $results = matches([ 1, 2, 3 ], [ BIND('x'), 2, BIND('x') ]); # FALSE: 3 != 1

COLLECT($name) is similar to BIND but does not compare first bound values.

REST matches all remaining elements of an array or hash.

  matches([ 1, 2, 3 ], [ 1, REST() ]); # TRUE
  matches({ 'a'=>1, 'b'=>1 }, { 'b'=>1, REST() => REST() }); # TRUE

FIND searches at all depths for matching sub-patterns.

  matches([ 1, [ 1, 2 ], 3], FIND(COLLECT('x', [ 1, REST() ])); # is true.

See the test script t/t1.t in the package distribution for more pattern examples.

MATCH COLLECTIONS

When a BIND or COLLECT matches a datum, an entry is collected in $result->{BIND} and $result->{COLLECT}, respectively. (This might change in the future)

Each entry for the binding name is a hash containing 'v', 'p' and 'ps' lists.

'v'

is a list of the value at each match.

'p'

is a list of match paths describing where the corresponding match was found based on the root of the search at each match. See match_path_*. 'p' is not collected if $matchobj-gt{'no_collect_path'}.

'ps'

is a list of code strings (match_path_str) that describes where the match was for each match. 'ps' is collected only if $matchobj-gt{'collect_path_str'}.

SUB-PATTERNS

All patterns can have sub-patterns. Most patterns match the AND-ed results of their sub-patterns and their own behavior, first trying the sub-patterns before attempting to match the intrinsic behavior. However, OR and ANY match any sub-patterns;

For example:

  match([ ['a', 1 ], ['b', 2], ['a', 3] ], EACH(COLLECT('x', ['a', ANY() ]))) # TRUE

The above pattern means:

    For EACH element in the root structure (an array):

      COLLECT each element, into collection named 'x', that is,

        An ARRAY of length 2 that starts with 'a'.

On the other hand.

  match( [ ['a', 1 ], ['b', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) 
  # IS FALSE

Because the second root element (an array) does not start with 'a'. But,

  match( [ ['a', 1 ], ['a', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) 
  # IS TRUE

The pattern below flattens the nested array into atoms:

  match(
    [ 1, 'x', 
      [ 2, 'x', 
        [ 3, 'x'], 
        [ 4, 
           [ 5, 
             [ 'x' ] 
           ],
          6
        ] 
      ] 
    ], 
    FIND(COLLECT('x', EXPR(q{! ref}))), 
    { 'no_collect_path' => 1 }
  )->{'COLLECT'}{'x'}{'v'};

no_collect_path causes COLLECT and BIND to not collect any paths.

MATCH SLICES

Match slices are objects that contain slices of matched portions of a data structure. This is useful for inflicting change into substructures matched by patterns like REST.

For example:

  do {
    my $a = [ 1, 2, 3, 4 ];
    my $p = [ 1, ANY, REST(BIND('s')) ];
    my $r = matches($a, $p);
    ok($r);                                           # TRUE
    ok(Compare($r->{'BIND'}{'s'}{'v'}[0], [ 3, 4 ])); # TRUE
    $r->{'BIND'}{'s'}{'v'}[0][0] = 'x';               # Change match slice
    matches($a, [ 1, 2, 'x', 4 ]);                    # TRUE
  }

Hash match slices are generated for each key-value pair for a hash matched by EACH and ALL. Each of these match slices can be matched as a hash with a single key-value pair.

Match slices are useful for search and replace missions.

VISITATION ADAPTERS

By default Data::Match is blind to Perl object interfaces. To instruct Data::Match to not traverse object implementation containers and honor object interfaces you must provide a visitation adapter. A visitation adapter tells Data::Match how to traverse through an object interface and how to keep track of how it got through.

For example:

  package Foo;
  sub new
  {
    my ($cls, %opts) = @_;
    bless \%opts, $cls;
  }
  sub x { shift->{x}; }
  sub parent { shift->{parent}; }
  sub children { shift->{children}; }
  sub add_child { 
    my $self = shift; 
    for my $c ( @_ ) { 
      $c->{parent} = $self;
    }
    push(@{$self->{children}}, @_);
  }


  my $foos = [ map(new Foo('x' => $_), 1 .. 10) ];
  for my $f ( @$foos ) { $f->add_child($foos->[rand($#$foo)); }

  my $pat = FIND(COLLECT('Foo', ISA('Foo', { 'parent' => $foos->[0], REST() => REST() })));
  $match->match($foos, $pat);

The problem with the above example is: FIND will not honor the interface of class Foo by default and will eventually find a Foo where $_>parent eq $foos->[0] through all the parent and child links in the objects' implementation container. To force Data::Match to honor an interface (or a subset of an interface) during FIND traversal we create a 'find' adapter sub that will do the right thing.

  my $opts = {
    'find' => {
       'Foo' => sub {
         my ($self, $visitor, $match) = @_;

         # Always do 'x'.
         $visitor->($self->x, 'METHOD', 'x');

         # Optional children traversal.
         if ( $match->{'Foo_find_children'} ) {
           $visitor->($self->children, 'METHOD', 'children');
         }

         # Optional parent traversal.
         if ( $match->{'Foo_find_parent'} ) {
           $visitor->($self->parent, 'METHOD', 'parent');
         }
       }
     }
  }
  my $match = new Data::Match($opts, 'Foo_find_children' => 1);
  $match = $match->execute($foos, $pat);

See t/t4.t for more examples of visitation adapters.

DESIGN

Data::Match employs a mostly-functional external interface since this module was inspired by a Lisp tutorial ("The Little Lisper", maybe) I read too many years ago; besides, pattern matching is largely recursively functional. The optional control hashes and traverse adapter interfaces are better represented by an object interface so I implemented a functional veneer over the core object interface.

Internally, objects are used to represent the pattern primitives because most of the pattern primitives have common behavior. There are a few design patterns that are particularly applicable in Data::Match: Visitor and Adapter. Adapter is used to provide the extensibility for the traversal of blessed structures such that Data::Match can honor the external interfaces of a class and not blindly violate encapsulation. Visitor is the basis for some of the FIND pattern implementation. The Data::Match::Slice classes that provide the match slices are probably a Veneer on the array and hash types through the tie meta-behaviors.

CAVEATS

  • Does not have regexp-like operators like '?', '*', '+'.

  • Should probably have more interfaces with Data::DRef and Data::Walker.

  • The visitor adapters do not use UNIVERSAL::isa to search for the adapter; it uses ref. This will be fixed in a future release.

  • Since hash keys do not retain blessedness (what was Larry thinking?) it is difficult to have patterns match keys without resorting to some bizarre regexp instead of using isa.

  • match_path_set and match_path_ref do not work through 'METHOD' path boundaries. This will be fixed in a future release.

  • BIND and COLLECT need scoping operators for deeply collected patterns.

STATUS

If you find this to be useful please contact the author. This is alpha software; all APIs, semantics and behaviors are subject to change.

INTERFACE

This section describes the external interface of this module.

%match_opts

Default options for match.

execute

Matches a structure against a pattern. In a list context, returns both the match success and results; in a scalar context returns the results hash if match succeeded or undef.

  use Data::Match;
  my $obj = new Data::Match();
  my $matched = $obj->execute($thing, $pattern);

match

   use Data::Match qw(match);
   match($thing, $pattern, @opts)

is equivalent to:

   use Data::Match;
   Data::Match->new(@opts)->execute($thing, $pattern);

matches

Same as match in scalar context.

match_path_str

Returns a perl expression that will generate code to point to the element of the path.

  $matchobj->match_path_str($path, $str);

$str defaults to '$_'.

match_path_DRef_path

Returns a string suitable for Data::DRef.

  $matchobj->match_path_DRef_path($path, $str, $sep);

$str is used as a prefix for the Data::DRef path. $str defaults to ''; $sep defaults to $Data::DRef::Separator or '.';

match_path_get

Returns the value pointing to the location for the match path in the root.

  $matchobj->match_path_get($path, $root);

$root defaults to $matchobj-gt{'root'};

Example:

  my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
  my $x = $results->match_path_get($thing, $results->{'BIND'}{'x'}{'p'}[0]);

The above example returns the first array that begins with 'x'.

match_path_set

Returns the value pointing to the location for the match path in the root.

  $matchobj->match_path_set($path, $value, $root);

$root defaults to $matchobj-gt{'root'};

Example:

  my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
  $results->match_path_set($thing, $results->{'BIND'}{'x'}{'p'}[0], 'y');

The above example replaces the first array found that starts with 'x' with 'y';

match_path_ref

Returns a scalar ref pointing to the location for the match path in the root.

  $matchobj->match_path_ref($path, $root);

$root defaults to $matchobj-gt{'root'};

Example:

  my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
  my $ref = $results->match_path_ref($thing, $results->{'BIND'}{'x'}{'p'}[0]);
  $$ref = 'y';

The above example replaces the first array that starts with 'x' with 'y';

VERSION

Version 0.05, $Revision: 1.12 $.

AUTHOR

Kurt A. Stephens <ks.perl@kurtstephens.com>

COPYRIGHT

Copyright (c) 2001, 2002 Kurt A. Stephens and ION, INC.

SEE ALSO

perl, Array::PatternMatcher, Data::Compare, Data::Dumper, Data::DRef, Data::Walker.

1 POD Error

The following errors were encountered while parsing the POD:

Around line 79:

You forgot a '=back' before '=head1'

You forgot a '=back' before '=head1'