
Data::Walk::Extracted - An extracted dataref walker

This is a contrived example! For more functional (complex/useful) examples see the roles in this package.
package Data::Walk::MyRole;
use Moose::Role;
requires '_process_the_data';
use MooseX::Types::Moose qw(
Str
ArrayRef
HashRef
);
my $mangle_keys = {
Hello_ref => 'primary_ref',
World_ref => 'secondary_ref',
};
#########1 Public Method 3#########4#########5#########6#########7#########8
sub mangle_data{
my ( $self, $passed_ref ) = @_;
@$passed_ref{ 'before_method', 'after_method' } =
( '_mangle_data_before_method', '_mangle_data_after_method' );
### Start recursive parsing
$passed_ref = $self->_process_the_data( $passed_ref, $mangle_keys );
### End recursive parsing with: $passed_ref
return $passed_ref->{Hello_ref};
}
#########1 Private Methods 3#########4#########5#########6#########7#########8
### If you are at the string level merge the two references
sub _mangle_data_before_method{
my ( $self, $passed_ref ) = @_;
if(
is_Str( $passed_ref->{primary_ref} ) and
is_Str( $passed_ref->{secondary_ref} ) ){
$passed_ref->{primary_ref} .= " " . $passed_ref->{secondary_ref};
}
return $passed_ref;
}
### Strip the reference layers on the way out
sub _mangle_data_after_method{
my ( $self, $passed_ref ) = @_;
if( is_ArrayRef( $passed_ref->{primary_ref} ) ){
$passed_ref->{primary_ref} = $passed_ref->{primary_ref}->[0];
}elsif( is_HashRef( $passed_ref->{primary_ref} ) ){
$passed_ref->{primary_ref} = $passed_ref->{primary_ref}->{level};
}
return $passed_ref;
}
package main;
use Modern::Perl;
use MooseX::ShortCut::BuildInstance qw(
build_instance
);
my $AT_ST = build_instance(
package => 'Greeting',
superclasses => [ 'Data::Walk::Extracted' ],
roles => [ 'Data::Walk::MyRole' ],
);
print $AT_ST->mangle_data( {
Hello_ref =>{ level =>[ { level =>[ 'Hello' ] } ] },
World_ref =>{ level =>[ { level =>[ 'World' ] } ] },
} ) . "\n";
#################################################################################
# Output of SYNOPSIS
# 01:Hello World
#################################################################################

This module takes a data reference (or two) and recursively travels through it (them). Where the two references diverge the walker follows the primary data reference. At the beginning and end of each "node" the code will attempt to call a method using data from the current location of the node.
This is an implementation of the concept of extracted data walking from Higher-Order Perl Chapter 1 by Mark Jason Dominus. The book is well worth the money! With that said I diverged from MJD purity in two ways. First, this is object oriented code not functional code. Second, like the MJD equivalent, the code does nothing on its own, but unlike the MJD equivalent it looks for methods provided in a role or class extension at the appropriate places for action. The MJD equivalent expects to use a passed CodeRef at the action points. There is clearly some overhead associated with both of these differences. I made those choices consciously and if that upsets you do not hassle MJD!
With the recursive part of data walking extracted the various functionalities desired when walking the data can be modularized without copying this code. The Moose framework also allows diverse and targeted data parsing without dragging along a kitchen sink API for every implementation of this Class.
All action taken during the data walking must be initiated by implementation of action methods that do not exist in this class. They can be added with a traditionally incorporated Role (Moose::Role), by extending the class, or joined to the class later. See MooseX::ShortCut::BuildInstance or Moose::Util for more class building information. See the "Recursive Parsing Flow" to understand the details of how the methods are used.
First build either or both of the before and after action methods. Then create the 'action' method for the role. This would preferably be named something descriptive like 'mangle_data'. Remember if more than one role is added to Data::Walk::Extracted then all methods should be named with consideration for other (future?) method names. The 'mangle_data' method should gather any action methods and data references into a $passed_ref and then pass this reference, and possibly a $conversion_ref, to _process_the_data with the call;
$passed_ref = $self->_process_the_data( $passed_ref, $conversion_ref );
See the "Recursive Parsing Flow" for the details of this action.
Finally, write some tests for your role!

The class next checks for an available 'before_method' using the test;
exists $passed_ref->{before_method};
If the test passes then the next sequence is run.
$method = $passed_ref->{before_method};
$passed_ref = $self->$method( $passed_ref );
If the $passed_ref is modified by the 'before_method' then the recursive parser will parse the new ref and not the old one.
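The check and call above can be sketched in plain Perl without Moose. This is only an illustration of the pattern, not the code used inside the class. The package Demo::Walker and the methods _demo_before_method and run_before are hypothetical names made up for the sketch.

```perl
use strict;
use warnings;

package Demo::Walker;    # hypothetical class, for illustration only

sub new { return bless {}, shift }

# Hypothetical action method: marks the ref so the call is visible
sub _demo_before_method {
    my ( $self, $passed_ref ) = @_;
    $passed_ref->{seen_by_before} = 1;
    return $passed_ref;
}

# Sketch of the test-then-dispatch step described above
sub run_before {
    my ( $self, $passed_ref ) = @_;
    if ( exists $passed_ref->{before_method} ) {
        my $method = $passed_ref->{before_method};
        # The method name is held as a string and resolved at call time
        $passed_ref = $self->$method($passed_ref);
    }
    return $passed_ref;
}

package main;

my $walker  = Demo::Walker->new;
my $with    = $walker->run_before( { before_method => '_demo_before_method' } );
my $without = $walker->run_before( { primary_ref => 'untouched' } );
```

Since the returned value replaces $passed_ref, whatever the action method hands back is what gets parsed next.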
If the next node type is not skipped then a list is generated for all paths within that lower node. For example a 'HASH' node would generate a list of hash keys for that node. SCALARs are handled as a list with a single element and UNDEFs are an empty list. If the list should be sorted then the list is sorted. ARRAYS are hard sorted. This means that the actual items in the (primary) passed data ref are permanently sorted.
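A rough sketch of the list building described above, in plain Perl. The function node_list and its $sorter argument are hypothetical and the real class may build its lists differently. The sketch assumes the ARRAY list holds the array elements themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the list of paths below one node.
# $sorter may be undef (no sort), a true plain value (default sort),
# or a CODE ref that is used as the sort function.
sub node_list {
    my ( $node, $sorter ) = @_;
    my $type = ref $node;
    my @list =
        $type eq 'HASH'  ? keys %$node
      : $type eq 'ARRAY' ? @$node
      : defined $node    ? ($node)    # scalar: one element
      :                    ();        # undef: empty list
    if ( ref($sorter) eq 'CODE' ) {
        @list = sort $sorter @list;
    }
    elsif ($sorter) {
        @list = sort @list;
    }
    # The "hard" sort: the array in the data ref itself is reordered
    @$node = @list if $sorter and $type eq 'ARRAY';
    return @list;
}

my %hash          = ( b => 1, a => 2, c => 3 );
my @sorted_keys   = node_list( \%hash, 1 );
my @reversed_keys = node_list( \%hash, sub { $b cmp $a } );
my @array         = ( 3, 1, 2 );
node_list( \@array, 1 );    # @array itself is now reordered
my @scalar_list = node_list('only');
my @undef_list  = node_list(undef);
```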
For each element a new $passed_ref is generated containing the data below that element. The down level secondary_ref is only constructed if it has a matching type/element to the primary ref. Matching for hashrefs is done by key matching only. Matching for arrayrefs is done by position exists testing only. No position content compare is done! Scalars are matched on content. The list of items generated for this element is as follows;
The current node list position is then documented using an internally managed key of the $passed_ref labeled 'branch_ref'. The array reference stored in branch_ref can be thought of as the stack trace that documents the node elements directly between the current position and the initial (or zeroth) level of the parsed primary data_ref. Past completed branches and future pending branches are not maintained. Each element of the branch_ref contains four positions used to describe the node and selections used to traverse that node level. The values in each sub position are;
[
ref_type, #The node reference type
the list item value or '' for ARRAYs,
#key name for hashes, scalar value for scalars
element sequence position (from 0),
#For hashes this is only relevant if sort_HASH is called
level of the node (from 0),
#The zeroth level is the passed data ref
]
The down level ref is then passed as a new data set to be parsed and it starts at "Assess and implement the before_method".
When the values are returned from the recursion call the last branch_ref element is popped off and the returned data ref is used to replace the sub elements of the primary_ref and secondary_ref associated with that list element in the current level of the $passed_ref. If there are still pending items in the node element list then the program returns to "Iterate through each element" else it moves to "Assess and implement the after_method".
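The push on the way down and pop on the way back up can be sketched with a small stand-alone walker. The function walk and the array @visited are hypothetical. The sketch only tracks the four branch_ref positions described above and leaves out the secondary_ref, skipping, and the action methods.

```perl
use strict;
use warnings;

my @visited;    # one entry per scalar reached, showing the path taken

# Hypothetical walker that maintains a branch_ref style stack
sub walk {
    my ( $node, $branch_ref ) = @_;
    my $type  = ref $node;
    my $level = scalar @$branch_ref;    # the zeroth level is the passed ref
    if ( $type eq 'HASH' ) {
        my $position = 0;
        for my $key ( sort keys %$node ) {
            push @$branch_ref, [ 'HASH', $key, $position++, $level ];
            walk( $node->{$key}, $branch_ref );
            pop @$branch_ref;    # completed branches are not kept
        }
    }
    elsif ( $type eq 'ARRAY' ) {
        for my $position ( 0 .. $#$node ) {
            push @$branch_ref, [ 'ARRAY', '', $position, $level ];
            walk( $node->[$position], $branch_ref );
            pop @$branch_ref;
        }
    }
    else {
        # Base state: record type, item, and position for each level above
        my $path = join '/', map { "$_->[0]:$_->[1]:$_->[2]" } @$branch_ref;
        push @visited, "$path => $node";
    }
    return;
}

my @branch;
walk( { Hello_ref => { level => ['Hello'] } }, \@branch );
print "$_\n" for @visited;
```

When the outermost call returns the stack is empty again, which matches the description that only the elements between the current position and the zeroth level are held.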
The class checks for an available 'after_method' using the test;
exists $passed_ref->{after_method};
If the test passes then the following sequence is run.
$method = $passed_ref->{after_method};
$passed_ref = $self->$method( $passed_ref );
If the $passed_ref is modified by the 'after_method' then the recursive parser will parse the new ref and not the old one.
The updated $passed_ref is passed back up to the next level.

Data passed to ->new when creating an instance. For modification of these attributes see "Public Methods". The ->new function will either accept fat comma lists or a complete hash ref that has the possible attributes as the top keys. Additionally some attributes that meet certain criteria can be passed to _process_the_data and will be adjusted for just the run of that method call. These are called one shot attributes. Nested calls to _process_the_data will be tracked and the attribute will remain in force until the parser returns to the calling 'one shot' level. Previous attribute values are restored after the 'one shot' attribute value expires.
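The save and restore behaviour of a one shot attribute can be pictured with the following sketch. It is an assumption about the general pattern only. The package Demo::OneShot and its process method are hypothetical stand-ins and are not how _process_the_data is written.

```perl
use strict;
use warnings;

package Demo::OneShot;    # hypothetical class, not part of this module

sub new { return bless { sorted_nodes => {} }, shift }

# Hypothetical entry point: applies a one shot 'sorted_nodes' value for
# the length of the call and then restores the previous value.
sub process {
    my ( $self, $passed_ref, $action ) = @_;
    my $had_one_shot = exists $passed_ref->{sorted_nodes};
    my $previous     = $self->{sorted_nodes};
    $self->{sorted_nodes} = delete $passed_ref->{sorted_nodes}
      if $had_one_shot;
    my $result = $action->($self);    # nested work sees the one shot value
    $self->{sorted_nodes} = $previous if $had_one_shot;
    return $result;
}

package main;

my $walker = Demo::OneShot->new;
my $during = $walker->process(
    { sorted_nodes => { HASH => 1 } },
    sub { return join ',', sort keys %{ $_[0]{sorted_nodes} } },
);
my $after = join ',', sort keys %{ $walker->{sorted_nodes} };
```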
The keys are only used if they match a node type identified by the function _extracted_ref_type. The value for the key can be anything, but if it is a CODEREF it will be treated as a sort function in perl. In general it is sorting a list of strings not the data structure itself. The sort will be applied as follows.
@node_list = sort $coderef @node_list
For the type 'ARRAY' the node is sorted (permanently) as well as the list. This means that if the array contains a list of references it will effectively sort in memory pointer order. Additionally the 'secondary_ref' node is not sorted, so prior alignment may break. In general ARRAY sorts are not recommended.
sorted_nodes =>{
ARRAY => 1,#Will sort the primary_ref only
HASH => sub{ $b cmp $a }, #reverse sort the keys
}
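A short plain Perl illustration of the two points above: a CODEREF used as the sort function, and why a default sort on an array of references ends up in memory pointer order. The variable names are made up for the example.

```perl
use strict;
use warnings;

# A CODE ref held in a scalar can be handed straight to sort
my $coderef   = sub { $b cmp $a };    # reverse string order
my @names     = qw( beta alpha gamma );
my @node_list = sort $coderef @names;

# A default sort on references compares their string forms, which look
# like "HASH(0x...)", so the order follows the memory addresses
my @refs       = ( { name => 'z' }, { name => 'a' } );
my @by_address = sort @refs;

# Sorting on content needs an explicit comparison
my @by_name = sort { $a->{name} cmp $b->{name} } @refs;
```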
The keys are only used if they match a node type identified by the function _extracted_ref_type. The value for the key can be anything.
[
[ 'HASH', 'KeyWord', 'ANY', 'ANY'],
# Skip the node below the value of any hash key eq 'KeyWord'
[ 'ARRAY', 'ANY', '3', '4'],
# Skip the nodes below arrays at position three on level four
]
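One way to read the example above is that each rule lines up with the four positions of a branch_ref element (type, item, position, level) with 'ANY' acting as a wildcard. The sketch below works under that assumption. The functions rule_matches and should_skip are hypothetical and the class may test its skip rules differently.

```perl
use strict;
use warnings;

# Hypothetical matcher: true when every rule field is 'ANY' or equals
# the same field of the branch element
sub rule_matches {
    my ( $rule, $element ) = @_;
    for my $i ( 0 .. 3 ) {
        next if $rule->[$i] eq 'ANY';
        return 0 if $rule->[$i] ne $element->[$i];
    }
    return 1;
}

# Returns the number of rules that match (0 means do not skip)
sub should_skip {
    my ( $rules, $element ) = @_;
    return scalar grep { rule_matches( $_, $element ) } @$rules;
}

my $skip_rules = [
    [ 'HASH',  'KeyWord', 'ANY', 'ANY' ],
    [ 'ARRAY', 'ANY',     '3',   '4' ],
];
my $skip_hash  = should_skip( $skip_rules, [ 'HASH',  'KeyWord', 0, 2 ] );
my $skip_array = should_skip( $skip_rules, [ 'ARRAY', '',        3, 4 ] );
my $keep_array = should_skip( $skip_rules, [ 'ARRAY', '',        3, 5 ] );
```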

$passed_ref ={
print_ref =>{
First_key => [
'first_value',
'second_value'
],
},
match_ref =>{
First_key => 'second_value',
},
before_method => '_print_before_method',
after_method => '_print_after_method',
sorted_nodes =>{ ARRAY => 1 },#One shot attribute setter
}
$conversion_ref ={
primary_ref => 'print_ref',# generic_name => role_name,
secondary_ref => 'match_ref',
}
$ref = $self->_build_branch(
$seed_ref,
@{ $passed_ref->{branch_ref}},
);

Each branch point of a data reference is considered a node. The original top level reference is the 'zeroth' node. Recursion 'Base state' nodes are understood to have zero elements so an additional node called 'END' type is recognized after a scalar.
Support for Objects is partially implemented and as a consequence '_process_the_data' won't immediately die when asked to parse an object. It will still die but on a dispatch table call that indicates where there is missing object support, not at the top of the node. This allows for some of the skip attributes to use 'OBJECT' in their definitions.
This class uses the role Data::Walk::Extracted::Dispatch to implement dispatch tables. When there is a decision point, that role is used to make the class extensible.

This is not an extension of Data::Walk.
The core class has no external effect. All output comes from additions to the class.
This module uses the 'defined or' ( //= ) and so requires perl 5.010 or higher.
This is a Moose based data handling class. Many coders will tell you Moose and data manipulation don't belong together. They are most certainly right in speed intensive circumstances.
Recursive parsing is not a good fit for all data since very deep data structures will burn a fair amount of perl memory! As the module recursively parses through the levels perl leaves behind snapshots of the previous level that allow perl to keep track of its location.
The primary_ref and secondary_ref are effectively deep cloned during this process. To leave the primary_ref pointer intact see "fixed_primary".

The module uses Smart::Comments if the '-ENV' option is set. The 'use' is encapsulated in an if block triggered by an environmental variable to comfort non-believers. Setting the variable $ENV{Smart_Comments} in a BEGIN block will load and turn on smart comment reporting. There are three levels of 'Smartness' available in this module: '###', '####', and '#####'.




This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.

