NAME
Data::Nested - routines to work with a perl nested data structure
SYNOPSIS
use Data::Nested;
$obj = new Data::Nested;
DESCRIPTION
This module contains methods for working with a perl nested data
structure (NDS). Before using this module, it is assumed that the
programmer is completely familiar with perl data structures. If this is
not the case, this module will be of very limited use. Some suggested
reading to become familiar with perl data structures is included below
in the SEE ALSO section.
A data structure may consist of any number of nested perl data types
including:
lists
hashes
scalars
other (everything else)
This module can easily perform the following operations:
Access parts of the NDS
It is very easy to get a value stored somewhere in an NDS, or to set
a value somewhere in an NDS.
Verify structural integrity
Often, a data structure may have constraints on it (certain parts of
it may be lists, hashes, or scalars). This module can enforce those
constraints when setting parts of the NDS.
Merge multiple NDSs into a single NDS
Two different NDSs may be merged into a single NDS using a series of
rules (described below).
A reasonably complete set of examples for how to do these and other
tasks is included below.
ACCESSING AN NDS
Typically, when accessing a nested data structure, you might use
something like:
$nds{foo}[2]{bar}
Although this is very direct, there are some distinct problems with
this.
Structural information must be built into the script
Accessing a nested data structure directly means that the structure
of the object is hard coded into the script. Although this is often
a useful requirement, it makes changing the structure later on quite
difficult.
If the structural information can be "hidden" from the program, it
makes changing the structure at a later date easier.
Structure creation side effects
Often, you may want to access a structure which doesn't entirely
exist. To do so, you either have to recurse into the structure, or
you end up creating parts of the structure. For example, if you have
the following two lines:
%nds = ();
if (exists $nds{foo}[2]{bar}) { ... }
the %nds structure will now be:
%nds = (foo => [ undef, undef, {} ] )
so a lot of structure was created that didn't exist previously.
Error handling
If there is a possibility that the reference is not correctly (or
completely) defined, and if you need error handling to handle this
case, extra code has to be written to recurse through the structure,
verifying the data as you go.
Repetitiveness
Many tasks, including error checking, recursing through a structure,
validity checking, etc., are very repetitive. They are necessary any
time you write a robust script that handles nested data structures.
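The side effect and the extra checking described in the list above can be seen in a short sketch using only core Perl. The helper safe_exists is hypothetical (it is not part of this module); it walks the path one level at a time and stops at the first missing piece, so nothing is created.

```perl
use strict;
use warnings;

# Checking a deep path directly autovivifies the intermediate levels.
my %nds     = ();
my $found   = exists $nds{foo}[2]{bar} ? 1 : 0;
my $created = exists $nds{foo} ? 1 : 0;    # "foo" now exists

# safe_exists is a hypothetical helper, not part of Data::Nested.
sub safe_exists {
    my ($node, @indices) = @_;
    for my $idx (@indices) {
        if (ref($node) eq 'HASH') {
            return 0 unless exists $node->{$idx};
            $node = $node->{$idx};
        }
        elsif (ref($node) eq 'ARRAY') {
            return 0 unless $idx =~ /^\d+$/ && $idx < @$node;
            $node = $node->[$idx];
        }
        else {
            return 0;    # cannot descend into a scalar
        }
    }
    return 1;
}

my %clean     = ();
my $safe      = safe_exists(\%clean, 'foo', 2, 'bar');
my $untouched = keys(%clean) == 0 ? 1 : 0;

print "direct check created 'foo': $created\n";
print "safe check left the hash empty: $untouched\n";
```

This is the kind of repetitive traversal code that would otherwise have to be written for every access.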
This module will simplify the handling of an NDS. It can be used to
access the value (or substructure) stored somewhere in an NDS with a
call to a method which will automatically check that the structure is
correct. It can also be used to set, delete, check, or merge parts of an
NDS, along with other useful operations. For example, instead of
accessing a data structure directly as:
$nds{foo}[2]{bar}
it could be accessed as:
$obj->value($nds,"/foo/2/bar");
Here, the string "/foo/2/bar" is called a path. It is a series of indices
separated by a delimiter (which defaults to "/", but which can be set to
other values using the delim method described below). The indices of the
path describe how to traverse through an NDS.
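As a rough illustration of how a path drives a traversal, the following sketch splits a path on the delimiter and follows each index in turn. The function lookup is a hypothetical stand-in for the idea behind the value method; it is not the module's code, and its error strings are invented for the example.

```perl
use strict;
use warnings;

# lookup() is a hypothetical helper. It returns (value, error-string).
sub lookup {
    my ($nds, $path, $delim) = @_;
    $delim = '/' unless defined $delim;

    # Split on the delimiter and drop empty pieces (such as the one
    # produced by a leading delimiter).
    my @indices = grep { length } split /\Q$delim\E/, $path;

    my $node = $nds;
    for my $idx (@indices) {
        if (ref($node) eq 'HASH') {
            return (undef, "no key '$idx'") unless exists $node->{$idx};
            $node = $node->{$idx};
        }
        elsif (ref($node) eq 'ARRAY') {
            return (undef, "bad index '$idx'")
                unless $idx =~ /^\d+$/ && $idx < @$node;
            $node = $node->[$idx];
        }
        else {
            return (undef, "cannot descend past a scalar at '$idx'");
        }
    }
    return ($node, undef);
}

my $nds = { foo => [ 'a', 'b', { bar => 42 } ] };
my ($val,  $err)  = lookup($nds, '/foo/2/bar');
my ($val2, $err2) = lookup($nds, '/foo/5/bar');

print "value: $val\n";
print "error: $err2\n";
```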
NDS STRUCTURE
By default, every Data::Nested object will have a structural description
associated with it. This can be explicitly turned off (using the
no_structure method below), but doing so will disable much of the
functionality of this module.
By default, this module will determine the structural information by
examining the structure of an NDS itself, which means that the
programmer does not have to do anything to gain this functionality.
Alternately, the module can be used to explicitly specify the structure
which applies to the NDS. In this case, the data structure is required
to match that description. This automatically does the appropriate error
checking necessary to ensure that the structure is correct.
The data stored at every path in a data structure is of a certain type
and has certain structural characteristics. Some structural
characteristics imply or prohibit other structural characteristics.
The primary structural characteristic is the type of data stored at a
given path. As mentioned above, this module recognizes 4 types of data:
lists
hashes
scalars
other (everything else)
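One plausible way to classify a piece of data into these four types is to inspect the result of ref. The helper nds_type is hypothetical, and treating blessed references as "other" is an assumption made for this sketch rather than documented behavior.

```perl
use strict;
use warnings;

# nds_type is a hypothetical helper; the four names follow the list above.
sub nds_type {
    my ($item) = @_;
    my $ref = ref($item);
    return 'scalar' if $ref eq '';         # plain values, including undef
    return 'list'   if $ref eq 'ARRAY';
    return 'hash'   if $ref eq 'HASH';
    return 'other';                        # code refs, scalar refs, objects
}

my @types = map { nds_type($_) }
    ( 'text', [ 1, 2 ], { a => 1 }, sub { 1 }, undef );

print join(',', @types), "\n";
```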
The two other structural characteristics that are used by this module
are whether elements are uniform or not, and whether elements are
ordered or not.
Uniform/non-uniform
The uniform/non-uniform characteristic applies to lists and hashes.
A uniform list has elements which are all the same structure. It is
not required that all elements have every piece of the structure,
but two uniform elements cannot have a different structure at any
level. Similarly, a uniform hash has any number of keys, but the
value for every key is the same structure.
A non-uniform list or hash has elements which do not have the same
structure.
Ordered/unordered
The ordered/unordered characteristic only applies to lists.
An ordered list is one in which the position in the list has
meaning: i.e. the 1st and 2nd elements in the list are not
interchangeable. An example might be a list of addresses where the
first address is a physical address and the second address is a
billing address.
An unordered list is a list in which the order and placement of the
elements is not important. An example might be a list of clients.
There is no inherent meaning to being the first or second client in
the list.
By implication, because two elements in an unordered list are
interchangeable, they must be uniform.
When specifying structural characteristics for list and hash elements,
the path used depends on whether they are uniform or non-uniform. When
referring to any element that is in a uniform list or hash, a wildcard
character is used. For non-uniform lists and hashes, specific elements
are used in the paths.
For example, in a data structure which consists of a hash containing a
foo key, and that key contains a non-uniform list of elements, you might
specify structural information for specific paths:
/foo/0
/foo/1
On the other hand, if the foo key contains a uniform list of elements,
you would specify structural information for ALL elements using the
path:
/foo/*
It is not allowed to have structural information simultaneously for both
types of paths. For example, there will never be structural information
for both:
/foo/1
/foo/*
There MAY be structural information for:
/foo/1
/bar/*
since there is no requirement that /foo and /bar are uniform.
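The relationship between a concrete path and the path that carries its structural information can be sketched as follows. The helper structure_path and the %uniform table are hypothetical; the table simply records which parent paths are assumed to be uniform for the purposes of this example.

```perl
use strict;
use warnings;

# %uniform lists the parent paths assumed (for this example only) to be
# uniform lists or hashes.
my %uniform = ( '/bar' => 1 );

# structure_path is a hypothetical helper, not part of Data::Nested.
sub structure_path {
    my ($path) = @_;
    my @out;
    for my $idx (grep { length } split m{/}, $path) {
        my $parent = '/' . join('/', @out);
        # Elements of a uniform parent are all described by the wildcard.
        push @out, ( $uniform{$parent} ? '*' : $idx );
    }
    return '/' . join('/', @out);
}

my $p1 = structure_path('/foo/1');    # /foo is non-uniform: stays specific
my $p2 = structure_path('/bar/7');    # /bar is uniform: becomes a wildcard

print "$p1\n$p2\n";
```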
DETERMINING AND SETTING STRUCTURAL INFORMATION
When working with an NDS, structural information can be determined from
the NDS, or it can be explicitly set in the program in which case
additional validity checking can be done on the NDS.
Structural information may be given as global defaults (i.e. it applies
to all paths), or on a path-specific basis. Structural information given
for a specific path applies only to that exact path. It does not apply
to structures lower OR higher in an NDS.
The following structural information may be determined or set:
ordered
By default, all lists are treated as unordered, but that can be
overridden, either on the global level, or path specific level, with
this.
If this is set to 1, lists will default to ordered (in the global
case), or the list at a specific path is explicitly set to ordered.
If this is set to 0, the list(s) will be unordered.
uniform_hash
uniform_ol
These pieces of information apply only on a global level.
By default, hashes are not uniform. By setting uniform_hash to 1,
they will default to uniform.
By default, all ordered lists are uniform. By setting uniform_ol to
0, they will be treated as non-uniform.
Note that there is no uniform_ul descriptor because ALL unordered
lists are treated as uniform since there is no consistent way for
structural information to apply to an unordered list which does not
have uniform elements.
uniform
This piece of information applies only to a specific path.
This can apply either to an ordered list or a hash. It is invalid
for other data types. It sets the element at the given path to be
explicitly uniform or not uniform.
With respect to ordered lists, there are two caveats.
Caveat 1:
Hashes underneath the list element are uniform if the same key has
the same structure. It is not required that different keys have the
same structure.
For example, if the path "/a" refers to a uniform ordered list, and
structures at "/a/*" are hashes, then it is required that the
structure stored at "/a/0/key1" be the same as "/a/1/key1", but the
structure stored at "/a/0/key2" can be different.
Caveat 2:
Ordered lists underneath the list element are uniform if the
elements at the same position have the same structure.
For example, if the path "/a" refers to a uniform ordered list, and
structures at "/a/*" are ordered lists, then it is required that the
structure stored at "/a/0/0" be the same as "/a/1/0", but "/a/0/1"
may be different.
MERGING NDSes
One of the more fundamental tasks of this module is the ability to take
two NDSes and merge them together recursively. In its simplest form,
this means that you set the value for a path in an NDS to some structure
which is structurally valid for that path. This module goes well beyond
that in capability though. Not only can you outright replace (or set)
the value at a path, but you can also recursively merge two structures
together. At every level of the merge, the data is combined based on the
merge method for that path and that type of data.
There are several different methods that can be used for merging NDSes.
Merging hashes
Merging hashes is conceptually the easiest. Allowed methods are
merge, keep, keep_warn, replace, replace_warn, or error.
Merging the two hashes:
%nds1 = ( a => NDS1,
b => NDS2,
c => undef )
%nds2 = ( b => NDS3,
c => NDS4 )
will give a resulting hash:
%nds = ( a => NDS1,
b => ???,
c => NDS4 )
The "a" key is trivial. It is defined in %nds1, but is totally
missing in %nds2, so the value from %nds1 is used.
The "c" key is also trivial. It is defined in %nds1, but has no
value, so the value from %nds2 is used.
The "b" key value depends on the merge method.
If the method is keep, the first value is used, so
b => NDS2
If the method is replace, an existing value will be replaced with a
second value, so:
b => NDS3
In both of these cases, it is not necessary to recurse into the
structure.
If the method is merge, the resulting value is obtained by
recursively merging NDS2 and NDS3. If NDS2 and NDS3 are scalars (or
some other type of data other than lists or hashes), the rules for
choosing the value to be stored in the "b" key are covered below in
the "Merging scalars (or other)" section.
If the method is error, an error will occur if the key is defined in
both hashes, and the program will exit.
The methods keep_warn and replace_warn are equivalent to keep and
replace respectively except that a warning will be issued when a key
is defined in both hashes.
When merging two hashes, if a value for a key in the first hash is
empty, or an empty string (""), it is replaced by the value in the
second hash.
Merging lists
When merging lists, allowed methods are: merge, keep, keep_warn,
replace, replace_warn, append, and error.
Merging the two lists:
@list1 = ( NDS1a, NDS1b, undef, NDS1d )
@list2 = ( NDS2a, NDS2b, NDS2c )
will give the following results depending on the merge method.
With the keep method, the resulting list will be:
@list = @list1
With the replace method, the resulting list will be:
@list = @list2
With the append method, the resulting list will be:
@list = ( @list1, @list2 )
With the merge method, the resulting list will be
@list = ( NDS1, NDS2, NDS2c, NDS1d )
As with hashes, the 3rd and 4th elements in the merged list are
trivial. The 3rd element is not defined in @list1, so the element
from @list2 is used. The 4th element exists only in @list1, so it is
used as is.
NDS1 is a recursive merger of NDS1a and NDS2a, and NDS2 is a
recursive merger of NDS1b and NDS2b. If the elements being merged
are scalars (or some other type of data other than lists or hashes),
the rules for choosing the resulting value are covered below in the
"Merging scalars (or other)" section.
If the method is error, an error will occur if both lists have
elements, and the program will exit.
The methods keep_warn and replace_warn are equivalent to keep and
replace respectively except that a warning will be issued when both
lists have elements.
The append method is only available with unordered lists. The merge
method is only available with ordered lists.
Merging scalars (or other)
When data of type scalar or other are merged, allowed methods of
merging are keep, keep_warn, replace, replace_warn, and error.
Scalars or other types are merged when the parent structures are
merged recursively, and they include scalars at some level.
For example, given the two lists:
@list1 = ( a, undef, '', d )
@list2 = ( 1, 2, 3)
which are merged using the "merge" method, the list is recursed
into, so each pair of scalars is merged using the method which
applies at that level of the structure.
With the keep method, the resulting scalar is the first non-empty
value of the two scalars. With the replace method, the resulting
scalar is the last non-empty value of the two.
The keep_warn and replace_warn methods are identical but will
trigger a warning if two non-empty values are encountered.
With the error method, an error is triggered if two non-empty values
are encountered, and the program will exit.
The only difficulty is knowing what values are non-empty. If the
"keep" method is used, the merged list of the two lists above is:
@list = ( a, 2, ??, d)
The first element is obtained by merging "a" and "1", both of which
are non-empty, so the first value is kept.
The second element comes from @list2, since an undef value is empty.
The fourth element is also trivial since the fourth element only
exists in one of the two lists.
The only question is how to treat the empty string in @list1. By
default, the empty string is treated as an empty value, so the
merged list would be:
@list = ( a, 2, 3, d )
This behavior can be changed using the blank method described below.
Passing a true value to it causes the empty string to be treated as
a non-empty value, so the resulting merged list would be:
@list = ( a, 2, '', d )
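The merge rules described in this section can be approximated with a short recursive routine in core Perl. This is only a sketch under stated assumptions, not the module's implementation: merge_nds, is_empty_scalar, the %method table and the $blank variable are all hypothetical names, a single "list" method stands in for the separate ordered and unordered list methods, and the warn and error variants are left out.

```perl
use strict;
use warnings;

# Hypothetical settings: one merge method per data type, plus a flag that
# mirrors what the blank method controls.
my %method = ( hash => 'merge', list => 'merge', scalar => 'keep' );
my $blank  = 0;

sub is_empty_scalar {
    my ($v) = @_;
    return 1 if !defined($v);
    return ( !ref($v) && $v eq '' && !$blank ) ? 1 : 0;
}

sub merge_nds {
    my ($x, $y) = @_;

    if (ref($x) eq 'HASH' && ref($y) eq 'HASH') {
        return $x if $method{hash} eq 'keep';
        return $y if $method{hash} eq 'replace';
        my %out = %$x;
        for my $key (keys %$y) {
            $out{$key} = exists $out{$key}
                ? merge_nds($out{$key}, $y->{$key})    # key in both: recurse
                : $y->{$key};                          # key only in second
        }
        return \%out;
    }

    if (ref($x) eq 'ARRAY' && ref($y) eq 'ARRAY') {
        return $x           if $method{list} eq 'keep';
        return $y           if $method{list} eq 'replace';
        return [ @$x, @$y ] if $method{list} eq 'append';
        my $last = $#$x > $#$y ? $#$x : $#$y;
        my @out;
        for my $i (0 .. $last) {
            # Merge position by position; a missing element is undef.
            my ($left, $right) = ($x->[$i], $y->[$i]);
            push @out, merge_nds($left, $right);
        }
        return \@out;
    }

    # Scalars (or other): an empty value always loses to a non-empty one.
    return $y if is_empty_scalar($x);
    return $x if is_empty_scalar($y);
    return $method{scalar} eq 'replace' ? $y : $x;
}

my @list1 = ( 'a', undef, '', 'd' );
my @list2 = ( 1, 2, 3 );

my $default = merge_nds(\@list1, \@list2);    # empty string treated as empty
$blank = 1;
my $kept    = merge_nds(\@list1, \@list2);    # empty string is now kept
$blank = 0;

my $h = merge_nds({ a => 'NDS1', b => 'NDS2', c => undef },
                  { b => 'NDS3', c => 'NDS4' });

print join(', ', @$default), "\n";
print join(', ', map { "$_ => $h->{$_}" } sort keys %$h), "\n";
```

The two list merges reproduce the two results shown above for the "keep" method, with and without the empty string treated as an empty value.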
SPECIFYING MERGE INFORMATION
Merge information is used to determine how different parts of a data
structure are merged with other structures. Merge information may be
given for a specific path or as a global default.
In addition, merge information may be specified for different sets of
circumstances. For example, one set of circumstances might be to use one
data structure to provide defaults for another structure, but only when
that structure didn't already include a value. An alternate set of
circumstances would be to have the second data structure override values
in the first structure. Each set of circumstances may be given a ruleset
name, and merge information can be set (either as global defaults or for
a specific path) for that set of circumstances. The named set of
circumstances is called a ruleset and is described in more detail below.
The following merge information may be set. Every item can be set on a
global default basis, or on a per-ruleset basis.
merge_hash
This specifies the default method to use when merging hashes.
If this is not specified, the "merge" method is the default.
merge_ol
This specifies the default method to use when merging ordered lists.
If not specified, the "merge" method is the default.
merge_ul
This specifies the default method to use when merging unordered
lists.
If not specified, the "append" method is the default.
merge_scalar
This specifies the default method to use when merging scalars.
If not specified, the "keep" method is the default.
merge
This specifies the merge method to be used for a specific path. It
can only be set for a specific path, but may be set on a per-ruleset
basis.
The method overrides any defaults.
RULE SETS
It is sometimes desirable to have multiple ways defined to merge two
NDSes for different sets of circumstances.
For example, sometimes you want to do a full merge of the NDSes, and
another time you want one of the NDSes to provide default values for
anything not defined in the other NDS, but you don't want to override
any value that is currently there.
A set of all of the different rules (including both global defaults, and
path specific methods) which should be applied under a given set of
circumstances is called a ruleset.
By default, a single unnamed ruleset is used, and all merging is done
using the rules defined there. Additional named rulesets may also be
added. One important difference is that default rules are automatically
supplied for the unnamed ruleset, but NOT for a named ruleset. If a
merge method cannot be determined in a named ruleset, it will default to
that of the unnamed ruleset.
Any number of named rulesets may be created. There are four reserved
rulesets named "default", "override", "keep" and "replace"; these names
may not be used for new rulesets.
The "default" ruleset has the following settings:
merge_hash = merge
merge_ul = keep
merge_ol = merge
merge_scalar = keep
If you merge two data structures using the "default" ruleset, the second
structure will provide defaults for the first. In other words, if the
first includes a scalar at some path, it will keep it, but otherwise, it
will take the value from the second structure.
The only exception is that unordered lists are not recursed into. If a
value is an unordered list, it will use an existing list in its
entirety.
The "override" ruleset has the following settings:
merge_hash = merge
merge_ul = replace
merge_ol = merge
merge_scalar = replace
If you merge two data structures using the "override" ruleset, the
second structure will override the first.
The "keep" and "replace" rulesets are used to set a value at a given
path to a new value, possibly completely replacing any existing
structure. The "keep" ruleset will set the structure to a new value only
if it doesn't already exist. The "replace" ruleset will remove any
existing structure and replace it with the new value.
The "keep" ruleset has all settings set to "keep". The "replace" ruleset
has them all set to "replace".
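The fallback from a named ruleset to the unnamed ruleset can be pictured as a two-level lookup. The data layout, the merge_method helper and the user-defined ruleset "mine" are hypothetical; only the setting names and values are taken from the text above.

```perl
use strict;
use warnings;

# The unnamed ruleset is stored under the empty string in this sketch.
my %rulesets = (
    ''       => { merge_hash => 'merge',  merge_ol     => 'merge',
                  merge_ul   => 'append', merge_scalar => 'keep' },
    default  => { merge_hash => 'merge',  merge_ol     => 'merge',
                  merge_ul   => 'keep',   merge_scalar => 'keep' },
    override => { merge_hash => 'merge',   merge_ol     => 'merge',
                  merge_ul   => 'replace', merge_scalar => 'replace' },
    mine     => { merge_scalar => 'replace' },    # a user-defined ruleset
);

# merge_method is a hypothetical helper, not part of Data::Nested.
sub merge_method {
    my ($ruleset, $item) = @_;
    my $rules = $rulesets{$ruleset} || {};
    return $rules->{$item} if exists $rules->{$item};
    return $rulesets{''}{$item};    # fall back to the unnamed ruleset
}

my $own      = merge_method('mine', 'merge_scalar');    # set in "mine"
my $fallback = merge_method('mine', 'merge_ul');        # from the unnamed set
my $reserved = merge_method('override', 'merge_ul');

print "$own $fallback $reserved\n";
```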
USING A DATA::NESTED OBJECT FOR MULTIPLE NDSes
Any number of NDSes (which share the same structure) can be associated
with a Data::Nested object. In that way, structural information can be
enforced across any number of similar data structures.
An NDS can be used explicitly in some Data::Nested method:
$obj = new Data::Nested;
$nds = { ... some structure ... };
$val = $obj->value($nds,$path);
or it can be stored in the Data::Nested object in a named slot, so that
it can be easily referenced by name:
$obj->nds("my_nds",$nds);
$val = $obj->value("my_nds",$path);
Any number of NDSes can be named and stored in the Data::Nested object.
BASE METHODS
The following are methods for creating and setting options for a
Data::Nested object.
new
$obj = new Data::Nested;
This creates a new Data::Nested object. It can be used to work with
any number of NDSes that share the same structural information. In
order to work with NDSes with different structural information, you
must create separate Data::Nested objects for each.
version
$version = $obj->version();
Returns the version of the module.
no_structure
$obj->no_structure();
If this is called, it will turn off structural information. Because
NDS information is stored in the Data::Nested object, and turning
off structural information means that structural data will no longer
be kept, you cannot toggle it back on. Instead, you will need to
create a new Data::Nested object if structural information is
needed.
blank
$obj->blank(BOOLEAN);
This sets a flag which determines whether an empty string should be
treated as an empty value, or as a defined value.
In a data structure, anywhere a scalar is included, an empty string
('') may be included. When merging two such structures, there are
two ways to treat empty strings.
The default is to treat them as an empty value. For example, merging
two hashes:
%nds1 = ( a => '',
b => undef )
%nds2 = ( a => 1,
b => 2 )
using the "keep" method (for merging the scalars), would give:
%nds = ( a => 1,
b => 2 )
since each pair of scalars will be compared and the first non-empty
value will be kept.
If this method is called and a true value is passed in:
$obj->blank(1);
an empty string will be treated as a non-empty value, and it will be
kept, so the resulting merged structure would be:
%nds = ( a => '',
b => 2 )
err
errmsg
$err = $obj->err();
This tests to see if the last function failed. If it did, $err is
the error code set by that function.
Error codes in this module are described and listed below in the ERROR
CODES section.
Every error also produces a text version of the error. The function:
$msg = $obj->errmsg();
will return the text version of the error.
PATH METHODS
When referring to the arguments passed to a method, $path always refers
to the path in an NDS. $path can be passed in as a delimited string, or
as a list reference where the list contains the elements of the path. So
the following are equivalent:
"/a/b/c"
[ "a", "b", "c" ]
When the argument $nds is passed in, it refers to an NDS. The NDS can
either be a reference to a structure, or the name of an NDS stored in
the object using the "nds" method.
delim
$obj->delim();
$obj->delim($delim);
When expressing the path as a string, the default delimiter is a
slash (/). This can be changed using this function. Any string can
be used as the delimiter. If called with no argument, it returns the
delimiter.
path
@path = $obj->path($path);
@path = $obj->path(\@path);
$path = $obj->path(\@path);
$path = $obj->path($path);
A path can be expressed in two different ways: a string with
elements separated by the path delimiter, or as a list of elements.
This method will convert between the two. In list context, it will
return a list of path elements. In scalar context, it will return the
path as a string with elements separated by the path delimiter.
It is safe to pass in the list reference in list context, or the
string version in scalar context. In both cases, the path will be
returned unmodified.
In string form, the path can be empty, or can consist only of the
delimiter, and all of these will return an empty list (i.e. they
point at the top level).
In string form, a path may include the delimiter as the first
character in the path, but it is optional, and the leading delimiter
does NOT imply anything about where the path starts. In other words:
/foo/1/bar
foo/1/bar
are identical.
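The context-dependent conversion described for the path method can be imitated with wantarray. The helper convert_path is hypothetical, $delim stands in for the value set with the delim method, and emitting a leading delimiter when building a string from a list is an assumption of this sketch.

```perl
use strict;
use warnings;

my $delim = '/';

# convert_path is a hypothetical helper, not the module's path method.
sub convert_path {
    my ($path) = @_;
    my $is_list = ref($path) eq 'ARRAY';
    if (wantarray) {
        return @$path if $is_list;
        # Empty pieces (leading delimiter, empty path) are dropped.
        return grep { length } split /\Q$delim\E/, $path;
    }
    return $path if !$is_list;               # string in scalar context
    return $delim . join($delim, @$path);    # leading delimiter is assumed
}

my @p   = convert_path('/foo/1/bar');
my @q   = convert_path('foo/1/bar');
my @top = convert_path('/');
my $s   = convert_path([ 'foo', 1, 'bar' ]);

print join(',', @p), "\n";
print "$s\n";
```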
RULESET METHODS
The following methods are used for creating rulesets.
ruleset
$obj->ruleset($name);
This creates a ruleset of the given name. $name must be
alphanumeric, and must be created only a single time. The following
names are reserved and may not be used:
keep
replace
default
override
This sets an error code if a problem with the ruleset is
encountered.
ruleset_valid
$flag = $obj->ruleset_valid($name);
This returns 1 if $name is a valid ruleset, 0 otherwise.
NDS METHODS
These methods are for working with single NDSes. They can be used to
examine information in an NDS, or associate an NDS with a Data::Nested
object.
nds
This function is for working with named NDSes stored in a
Data::Nested object.
There are several different ways in which this method can be called.
$obj->nds($name,$nds [,$new]);
This form stores an NDS in the Data::Nested object under a given
name. If structural information is kept, it will check the structure
of the NDS for problems. It will update structural information based
on the NDS if a non-zero value of $new is passed in.
$obj->nds($name,$name2);
This takes an NDS stored under the name $name2 and stores a copy of
it under the new name ($name).
$nds = $obj->nds($name [,"_copy"]);
This will retrieve the NDS stored under the name $name. If it does
not exist, undef is returned. It returns the actual stored NDS, NOT
a copy of it, if there is no second argument. If the second argument
is "_copy", a copy of the structure is returned.
$flag = $obj->nds($name,"_delete");
This will delete the named NDS from the object. If the named NDS
does not exist, it will return 0, otherwise it will return 1.
$flag = $obj->nds($name,"_exists");
Returns 1 if an NDS is stored under the given name.
This method may produce error codes due to invalid structure, or if
any problems using a named NDS are encountered.
empty
$isempty = $obj->empty($nds);
By default, an NDS is empty if it only contains empty values.
A scalar is empty if it is undef. By default, the empty string "" is
also treated as empty, but this can be changed using the "blank"
method described above.
A list is empty if it contains 0 elements, or if every element in it
is empty.
A hash is empty if it contains 0 keys, or if the value of every key
is empty.
Returns undef if an error occurs. Otherwise, it returns 1 if $nds is
empty, 0 if it is not empty.
value
$val = $obj->value($nds,$path [,$copy,$nocheck]);
This checks to see that the NDS passed in is valid, and if the given
path exists in it. If $nocheck is passed in, the NDS structure isn't
checked.
If everything is valid, it returns the value stored at $path. If
$copy is passed in, a copy of the structure stored there is
returned, otherwise the actual structure is returned.
In the case of an error, nothing is returned and an error code is
set.
keys, values
@ele = $obj->keys($nds,$path);
@ele = $obj->values($nds,$path);
This takes an NDS and returns a list of items at the given path.
If the object at the path is a scalar, the keys method returns
nothing. The values method returns the scalar.
If the object at the path is a list, the keys method returns some of
the integers 0..N where N is the index of the last element in the
list. The indices for empty elements are omitted. The values method
returns the non-empty members of the list.
If the object at the path is a hash, the keys method returns the
non-empty keys of the hash. The values method returns the non-empty
values of the hash.
Undef is returned in the case of an error.
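The rules for emptiness, and the way keys and values skip empty list elements, can be sketched in core Perl as follows. The helpers is_empty, list_keys and list_values are hypothetical, and $blank mirrors the flag set by the blank method; none of this is the module's own code.

```perl
use strict;
use warnings;

my $blank = 0;    # 0: the empty string counts as empty

sub is_empty {
    my ($nds) = @_;
    if (ref($nds) eq 'ARRAY') {
        # Empty if no element is non-empty (this covers 0 elements too).
        return ( grep { !is_empty($_) } @$nds ) ? 0 : 1;
    }
    if (ref($nds) eq 'HASH') {
        return ( grep { !is_empty($_) } values %$nds ) ? 0 : 1;
    }
    return 1 if !defined($nds);
    return ( !ref($nds) && $nds eq '' && !$blank ) ? 1 : 0;
}

# Indices of the non-empty elements of a list.
sub list_keys {
    my ($list) = @_;
    return grep { !is_empty($list->[$_]) } 0 .. $#$list;
}

# The non-empty elements themselves.
sub list_values {
    my ($list) = @_;
    return grep { !is_empty($_) } @$list;
}

my @list = ( 'a', undef, '', [], 'e' );
my @keys = list_keys(\@list);
my @vals = list_values(\@list);

print "keys: @keys\n";
print "values: @vals\n";
```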
erase
$flag = $obj->erase($nds,$path);
This will delete the given path from the NDS. It will delete
elements from unordered lists, clear elements from ordered lists, or
delete entries from hashes.
It returns undef if an error occurred, 1 if the path was erased, 0
otherwise.
STRUCTURE METHODS
These methods are for working with the structure of an NDS.
set_structure
$obj->set_structure($item,$val [,$path]);
This sets the given item of structural information. If the path is
given, it sets items for that path, otherwise it sets default
structural items.
get_structure
$val = $obj->get_structure($path [,$info]);
This gets a piece of structural information for a path. $info can be
any of the following (and defaults to "type" if it is not given):
type (returns "unknown" if not set)
ordered
uniform
merge
keys
valid (this will return 1 if the path is valid)
The appropriate value is returned. If information for a specific
path is not available, default values will be returned. It returns
nothing if the path has no structural information available and the
error code lists the reason.
The keys information is a list of all known keys that can appear in
a hash. This may only be used to query keys in a non-uniform hash.
check_structure
$obj->check_structure($nds [,$new]);
This will take an NDS and traverse through it, checking the
structure of every part of it. If $new is passed in, it is allowed
to contribute new structural information. Otherwise, it must be
completely defined by previously declared structural information. If
the structure is invalid, an error code will be set.
check_value
$obj->check_value($path,$val [,$new]);
This will check to see if $val has the correct structure to be
stored at $path in an NDS. It will traverse through the structure of
$val, similar to how the check_structure method traverses through an
entire NDS.
The $new argument and the error reporting are the same as in the
check_structure method.
METHODS FOR MERGING OR SETTING VALUES IN AN NDS
These methods allow you to merge two NDSes together. A simple case of
this is when a path in an NDS is not set. Merging in a second NDS is
equivalent to simply setting the path in the first NDS to the value
supplied.
set_merge
$obj->set_merge($item,$method [,$ruleset]);
$obj->set_merge($item,$path,$method [,$ruleset]);
This will define how to merge values. In the first form, it will set
the default. $item can be merge_hash, merge_ol, merge_ul, or
merge_scalar. In the second form, it will set the merge method for
the given path. Currently, $item must be "merge".
get_merge
$method = $obj->get_merge($path [,$ruleset]);
This gets the merge method for a path.
The appropriate value is returned. If the method for a specific path
is not available, default values will be returned. Nothing will be
returned in the event of a problem.
merge
$obj->merge($nds1,$nds2 [,$ruleset] [,$new]);
This will take two NDSes (each of which can be passed in by name or
by reference) and will recursively merge the second one into the
first based on the rules of merging.
The second NDS will be copied, so no part of the merged NDS will
contain actual parts of the second NDS.
The name of a ruleset can be passed in. If it is, that set of merge
rules will be used to do the merging.
If $new is passed in, it must be 0 or 1. If it is 1, either NDS may
provide new structural information.
merge_path
$obj->merge_path($nds,$val,$path [,$ruleset] [,$new]);
This will take an NDS (which can be passed in by name or reference)
and merge $val into it at the given path. Using the special ruleset
"replace", the value will replace whatever is there.
$path must be valid, and $val must be structurally correct if
structural information is kept.
It will update structural information based on the NDS if $new is
passed in and is true.
The actual value passed in (not a copy) will be merged in.
The merge_path method is used whenever you want to perform the
operation "set PATH in an NDS to VALUE". If PATH is not yet defined
in NDS, the following call does the expected operation:
$obj->merge_path($nds,$val,$path [,$new]);
If PATH already exists in NDS and you want to overwrite it, use the
following:
$obj->merge_path($nds,$val,$path,"replace" [,$new]);
OTHER METHODS
which
%hash = $obj->which($nds,@args)
This returns a hash of { PATH => VAL } where PATH is a path in $nds
and VAL is the value at that path.
The paths returned all fit the criteria specified in the arguments.
If no arguments are passed in, a hash of all paths to non-empty
scalars is returned (note that this means that scalars set to the
empty string '' ARE returned).
If @args is passed in, it is a list of criteria. If a scalar matches
any one of them, it passes. Currently, @args may consist of a list
of values (scalars) or regular expressions (set using the qr//
operator). If the value at a path is equal to any of the values
passed in in @args, or matches any of the regular expressions, then
it passes.
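The kind of result described for the which method can be approximated by flattening a structure into a hash of paths and then filtering it. The helpers flatten and which_paths are hypothetical; the sketch records every defined scalar (including the empty string) and compares plain values with eq, which are assumptions rather than documented details.

```perl
use strict;
use warnings;

# flatten() fills $out with PATH => VAL for every defined scalar.
sub flatten {
    my ($node, $prefix, $out) = @_;
    if (ref($node) eq 'HASH') {
        flatten($node->{$_}, "$prefix/$_", $out) for sort keys %$node;
    }
    elsif (ref($node) eq 'ARRAY') {
        flatten($node->[$_], "$prefix/$_", $out) for 0 .. $#$node;
    }
    elsif (defined $node) {
        $out->{$prefix} = $node;
    }
    return $out;
}

# which_paths() keeps the paths whose value matches any criterion.
sub which_paths {
    my ($nds, @args) = @_;
    my $all = flatten($nds, '', {});
    return %$all if !@args;

    my %hit;
    for my $path (keys %$all) {
        my $val = $all->{$path};
        for my $crit (@args) {
            my $ok = ref($crit) eq 'Regexp' ? $val =~ $crit : $val eq $crit;
            if ($ok) { $hit{$path} = $val; last }
        }
    }
    return %hit;
}

my $nds   = { name => 'alpha', tags => [ 'red', 'blue' ], n => 3 };
my %all   = which_paths($nds);
my %found = which_paths($nds, 3, qr/^b/);

print join(', ', map { "$_ => $found{$_}" } sort keys %found), "\n";
```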
test_conditions
$flag = $obj->test_conditions($nds [,$path1,$cond1,$path2,$cond2,...]);
This returns a 1 if the given NDS meets all of the conditions in the
list. Any number of path/cond pairs may be given, and the NDS is
required to pass all of them.
If $path refers to a hash structure, $cond may be any of the
following:
exists:VAL : true if a key named VAL exists in the hash
empty:VAL : true if a key named VAL is empty in the hash
(it doesn't exist, or has an empty value)
empty : true if the hash is empty
If $path refers to a list structure, $cond may be any of the
following:
empty : true if the list is empty
defined:VAL : true if the VAL'th (VAL is an integer) element
is defined (indices start at 0)
empty:VAL : true if the VAL'th (VAL is an integer) element
is empty (or not defined)
contains:VAL : true if the list contains the element VAL
<:VAL : true if the list has fewer than VAL (an integer)
non-empty elements
<=:VAL
=:VAL
>:VAL
>=:VAL
VAL : equivalent to contains:VAL
If $path refers to a scalar, $cond may be any of the following:
defined : true if the value is defined
empty : true if the value is empty
zero : true if the value is defined and evaluates to 0
true : true if the value is defined and evaluates to true
=:VAL : true if the value is VAL
member:VAL:VAL:...
: true if the value is any of the values given (in
this case, ALL of the colons (including the first
one) can be replaced by any other single character
separator)
VAL : equivalent to =:VAL
Any condition can be prefixed with a "!" to negate it.
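A few of the scalar conditions can be sketched in plain Perl to show how the negation prefix and the member separator might be read. This is an assumption-laden illustration, not the module's parser: the helper name scalar_cond_sketch is hypothetical, and it covers only defined, =:VAL, member and the bare VAL form.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): evaluate a small subset
# of the scalar conditions against a single value.
sub scalar_cond_sketch {
    my ($val, $cond) = @_;
    my $negate = ($cond =~ s/^!//);          # a leading "!" negates the test
    my $ok;
    if ($cond eq 'defined') {
        $ok = defined $val;
    } elsif ($cond =~ /^member(.)(.*)$/s) {
        # The character right after "member" is the separator for the list.
        my ($sep, $rest) = ($1, $2);
        my @allowed = split /\Q$sep\E/, $rest;
        $ok = defined $val && grep { $_ eq $val } @allowed;
    } else {
        (my $want = $cond) =~ s/^=://;       # bare VAL is the same as =:VAL
        $ok = defined $val && $val eq $want;
    }
    return ($negate xor $ok) ? 1 : 0;
}

my $r1 = scalar_cond_sketch('b', 'member:a:b:c');   # colon separator
my $r2 = scalar_cond_sketch('b', 'member,a,b');     # comma separator
my $r3 = scalar_cond_sketch('foo', '!foo');         # negated bare value
```

Note that in this sketch a bare VAL that happens to begin with "member" is read as a member condition; how the module resolves that ambiguity is not covered here.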
identical, contains
$flag = $obj->identical($nds1,$nds2 [,$new] [,$path]);
$flag = $obj->contains ($nds1,$nds2 [,$new] [,$path]);
The identical method checks to see if two NDSes are identical. If
$path is given, only the part that starts at $path is checked.
When comparing ordered lists, every element must be identical and in
the same order. Unordered lists need to contain the same elements,
but not necessarily in the same order. This works even if the
unordered list contains structures instead of scalars.
The contains method checks to see that $nds2 is a subset of (i.e. is
contained in) $nds1. In other words, every scalar in $nds2 is
identical to one in $nds1.
undef is returned if there is any error.
NOTE: because unordered lists must be compared in every possible
combination, and recursively, if the structure contains unordered
lists which contain other unordered lists deeper in the structure,
comparing NDSes with unordered lists can be extremely slow. Doing
this is strongly discouraged.
Error codes are:
1 the first NDS is invalid
2 the second NDS is invalid
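The cost mentioned in the NOTE above can be seen in a plain Perl sketch of an order-insensitive comparison. It is not the module's algorithm: the helper name same_unordered is hypothetical, it treats EVERY list as unordered, and it pairs elements greedily rather than trying every combination. Even so, each list level compares each element against each remaining candidate, and every such comparison recurses.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): deep comparison that
# ignores element order in every list it meets.
sub same_unordered {
    my ($x, $y) = @_;
    return 0 if ref($x) ne ref($y);
    if (ref($x) eq 'ARRAY') {
        return 0 if @$x != @$y;
        my @left = @$y;                        # candidates not yet paired
        ELEMENT: for my $ex (@$x) {
            for my $i (0 .. $#left) {
                if (same_unordered($ex, $left[$i])) {
                    splice(@left, $i, 1);      # partner found, remove it
                    next ELEMENT;
                }
            }
            return 0;                          # no partner for this element
        }
        return 1;
    }
    if (ref($x) eq 'HASH') {
        return 0 if keys(%$x) != keys(%$y);
        for my $k (keys %$x) {
            return 0 unless exists $y->{$k}
                         && same_unordered($x->{$k}, $y->{$k});
        }
        return 1;
    }
    return 0 if (defined($x) xor defined($y));
    return 1 if !defined $x;
    return $x eq $y ? 1 : 0;
}

my $l1 = [ [ qw(a b c) ], [ qw(d e f) ], [ qw(g h i) ] ];
my $l2 = [ [ qw(d e f) ], [ qw(a b c) ], [ qw(i g h) ] ];
my $same = same_unordered($l1, $l2);
```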
print
$string = $obj->print($nds,%opts);
This formats an NDS as a string based on a set of options. Known
options are:
indent => NUMBER Specifies the amount of indentation to add
at each level. Indent must be 1 or more.
Default: 3
width => NUMBER Specifies the width of a printing area. A
value of 0 means to not impose any width
limit. The minimum allowed width (other than
0) is 20.
Default: 79
maxlevel => NUMBER Specifies the maximum level to print. A value of
0 is all levels, but may only be used when a
width of 0 is used. Levels beyond this will
be pruned.
Default: the number of levels that can be
displayed given indent and width
paths
@path = $obj->paths(@type);
This returns a list of all valid paths of the given type. @type is a
list of strings. The list can contain any one of the following:
scalar
list
hash
It can optionally also contain one of:
uniform
nonuniform
and one of:
ordered
unordered
It will return all paths which match all of the values.
Methods not documented here, especially those beginning with an
underscore (_), are for internal use only. Please do not use them.
Absolutely no support is offered for them.
ERROR CODES
Each error code produced by a method in the Data::Nested module is
prefixed by the characters "nds", followed by a 3 character operation
code which tells what type of operation failed, followed by 2 digits.
The following error codes are used to identify problems working with
named NDSes:
ndsnam01 A named NDS was referred to, but no NDS is stored
under that name.
ndsnam02 Attempt to copy an NDS to a name already in use.
The following error codes are set if a problem with a ruleset is
encountered:
ndsrul01 A non-alphanumeric character used in a ruleset
name.
ndsrul02 An attempt was made to create a ruleset using
a name that is already in use.
ndsrul03 An attempt was made to create a ruleset using one
of the reserved names.
The following error codes are set if there is a problem checking the
structure of an NDS:
ndschk01 The NDS contains structure of a different type
than is valid. The errmsg method will tell exactly
where the error occurred.
ndschk02 An NDS with new structure was checked, but new
structure is not allowed. Use the $new argument
in the calling function to allow it.
ndschk03 No structural information is available at all.
ndschk04 The path is invalid.
ndschk05 It is unknown what type of data is stored at the
given path.
ndschk06 Ordered information requested for a non-list structure.
ndschk07 Uniform information requested for a scalar/other
structure.
ndschk08 Keys requested for a non-hash structure.
ndschk09 Keys requested for a uniform hash structure.
ndschk99 Unknown structural information requested.
The following error codes are set if there is a problem setting the
structural information of an NDS:
ndsstr01 Attempt to set type to an invalid value.
ndsstr02 Once type is set, it may not be reset.
ndsstr03 Attempt to set type to scalar when a list/hash type is
required (due to other structural information).
ndsstr04 Attempt to reset "ordered" (or trying to set a
non-uniform list to unordered).
ndsstr05 Attempt to set ordered on a non-list structure.
ndsstr06 Ordered value must be 0 or 1.
ndsstr07 Attempt to reset "uniform" (or trying to set an
unordered list to non-uniform).
ndsstr08 Attempt to use a "uniform" flag on something other
than a list/hash.
ndsstr09 Uniform value must be 0 or 1.
ndsstr10 Attempt to set structural information for a child with
a scalar/other parent.
ndsstr11 Attempt to set structural information for a specific
element in a "uniform" list.
ndsstr12 Attempt to set structural information for all
elements in a "non-uniform" list.
ndsstr13 Attempt to access a list with a non-integer index.
ndsstr14 Attempt to set structural information for a specific
element in a uniform hash/list.
ndsstr15 Attempt to set structural information for all elements
of a non-uniform hash/list.
ndsstr16 Attempt to set the default ordered value to something
other than 0/1.
ndsstr17 Attempt to set the default uniform_hash value to
something other than 0/1.
ndsstr18 Attempt to set the default uniform_ol value to
something other than 0/1.
ndsstr98 Invalid default structural item.
ndsstr99 Invalid structural item for a path.
The following error codes are used to report problems when examining an
NDS, either to get data, or to get structural information:
ndsdat01 A path does not exist in the NDS.
ndsdat02 A hash key does not exist in the NDS.
ndsdat03 A list element does not exist in the NDS.
ndsdat04 The NDS has a scalar at a point where a hash or
list should be.
ndsdat05 The NDS has a reference to an unsupported data type
where a hash or list should be.
ndsdat06 A non-integer index used to access a list.
ndsdat07 Invalid parameter combination in paths method.
ndsdat08 Invalid parameter in paths method.
The following error codes are set with problems related to merge
operations:
ndsmer01 Attempt to set a merge setting to an unknown value.
ndsmer02 Attempt to set merge_hash to an invalid value.
ndsmer03 Attempt to set merge_ol to an invalid value.
ndsmer04 Attempt to set merge_ul to an invalid value.
ndsmer05 Attempt to set merge_scalar to an invalid value.
ndsmer06 Attempt to reset "merge" value for a path.
ndsmer07 Attempt to set "merge" for a path with no known type.
ndsmer08 Invalid merge method for ordered list merging.
ndsmer09 Invalid merge method for unordered list merging.
ndsmer10 Invalid merge method for hash merging.
ndsmer11 Invalid merge method for scalar/other merging.
ndsmer12 While merging, the first NDS is not defined.
ndsmer13 While merging, the second NDS is not defined.
ndsmer14 The first NDS has an invalid structure. Use the
check_structure method to determine the problem.
ndsmer15 The second NDS has an invalid structure. Use the
check_structure method to determine the problem.
ndsmer16 The NDS must be a list or hash.
ndsmer17 Attempt to merge a value into an undefined NDS.
ndsmer18 The NDS has an invalid structure.
ndsmer19 The value has an invalid structure.
The following error codes apply to identical and contains operations:
ndside01 The first NDS is invalid.
ndside02 The second NDS is invalid.
The following error codes apply to test conditions:
ndscon01 An invalid test condition used.
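The naming scheme described at the start of this section ("nds", a 3 character operation code, then 2 digits) can be taken apart with a short regular expression. The following is a sketch only: the helper name parse_error_code and the wording of the descriptions in %OPERATION are hypothetical, although the operation codes themselves are the ones listed above.

```perl
use strict;
use warnings;

# Operation codes as they appear in the tables above; the descriptions
# are paraphrased for this sketch.
my %OPERATION = (
    nam => 'named NDS',
    rul => 'ruleset',
    chk => 'structure check',
    str => 'setting structural information',
    dat => 'examining an NDS',
    mer => 'merge',
    ide => 'identical/contains',
    con => 'test conditions',
);

# Hypothetical helper (not part of Data::Nested): split an error code into
# its operation code and number. Returns undef for anything malformed.
sub parse_error_code {
    my ($code) = @_;
    return undef unless defined $code && $code =~ /^nds([a-z]{3})(\d{2})$/;
    my ($op, $num) = ($1, $2);
    my %info = (
        operation => $op,
        number    => $num,
        area      => exists $OPERATION{$op} ? $OPERATION{$op} : 'unknown',
    );
    return \%info;
}

my $info = parse_error_code('ndsmer08');
```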
EXAMPLES
All examples assume the following lines:
use Data::Nested;
$obj = new Data::Nested;
path method
The path function can be used to switch back and forth between a
path in string format and a path in list format.
@path = $obj->path("/a/b");
=> ( a b )
@path = $obj->path("a/b");
=> ( a b )
@path = $obj->path(["a","b"]);
=> ( a b )
$path = $obj->path("/a/b");
=> /a/b
$path = $obj->path("a/b");
=> /a/b
$path = $obj->path(["a","b"]);
=> /a/b
@path = $obj->path("/");
=> ( )
$path = $obj->path([]);
=> "/"
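The conversions shown above can be mimicked in a few lines of plain Perl. This is a sketch, not the module's code: the helper name path_sketch is hypothetical, and it assumes that "/" is the path delimiter and that the calling context (list or scalar) selects the output format.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): accept a path as a string
# or a list reference and return it as a list (list context) or as a
# string with a leading "/" (scalar context).
sub path_sketch {
    my ($path) = @_;
    my @elems = ref($path) eq 'ARRAY'
              ? @$path
              : grep { length } split m{/}, $path;
    return wantarray ? @elems : '/' . join('/', @elems);
}

my @list = path_sketch("/a/b");        # list format
my $str  = path_sketch([ "a", "b" ]);  # string format
my $root = path_sketch([]);            # the top-level path
```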
nds method
The nds method can be used to store or access a named NDS.
$nds = { "a" => [ "a1", "a2" ],
"b" => [ "b1", "b2" ] };
$obj->nds("ele1",$nds,1);
$nds2 = $obj->nds("ele1");
=> { "a" => [ "a1", "a2" ],
"b" => [ "b1", "b2" ] }
value method
The value method is used to check to see if a path is valid in the
given NDS. It returns the value stored at the path, if it is valid.
It also sets an error code if the NDS is not valid.
$nds = { "a" => undef,
"b" => "foo",
"c" => [ "c1", "c2" ],
"d" => { "d1k" => "d1v", "d2k" => "d2v" },
};
$obj->value($nds,"/a");
=> undef
$obj->value($nds,"/d/d3k");
=> undef
$obj->value($nds,"/f/1/2");
=> undef
$obj->value($nds,"/c/1");
=> c2
$obj->value($nds,"/c/x");
=> undef
keys, values methods
Using the same NDS as defined in the "value" examples.
$obj->keys($nds,"/b");
=> ( )
$obj->keys($nds,"/c");
=> ( 0 1 )
$obj->keys($nds,"/d");
=> ( d1k d2k )
$obj->values($nds,"/b");
=> ( foo )
$obj->values($nds,"/c");
=> ( c1 c2 )
$obj->values($nds,"/d");
=> ( d1v d2v )
set_structure, get_structure methods
These set or report the structure at a path.
set_structure sets a piece of structural information for a path and
returns an error code (0 if successful).
To make sure that the path "/a" refers to a uniform hash, make the
following two calls:
$obj->set_structure("type","hash","/a");
$obj->set_structure("uniform",1,"/a");
To make sure that "/b" is an ordered list, and all elements in it
are hashes, use the following calls:
$obj->set_structure("type","list","/b");
$obj->set_structure("ordered",1,"/b");
$obj->set_structure("type","hash","/b/*");
get_structure will return the structural information for a path:
$info = $obj->get_structure("/b","type");
=> list
erase method
$obj->set_structure("ordered","1","/o");
$obj->set_structure("ordered","0","/u");
$nds = { "h" => { "x" => 11, "y" => 22 },
"o" => [ qw(alpha beta gamma delta) ],
"u" => [ qw(alpha beta gamma delta) ],
};
Erasing a hash key removes the key and value.
$obj->erase($nds,"/h/x");
=> $nds = { h => { y => 22 },
o => [ alpha beta gamma delta ],
u => [ alpha beta gamma delta ],
}
Erasing an element in an ordered list replaces it with an undef
place holder.
$obj->erase($nds,"/o/1");
=> $nds = { h => { y => 22 },
o => [ alpha UNDEF gamma delta ],
u => [ alpha beta gamma delta ],
}
Erasing an element from an unordered list removes it completely.
$obj->erase($nds,"/u/1");
=> $nds = { h => { y => 22 },
o => [ alpha UNDEF gamma delta ],
u => [ alpha gamma delta ],
}
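The difference between the two list cases can be sketched in plain Perl. This is an illustration only: the helper name erase_list_sketch is hypothetical, and the ordered flag is passed in directly instead of being looked up from structural information as the module does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): erase one element of a
# list in place. Ordered lists keep their length and get an undef place
# holder; unordered lists lose the element entirely.
sub erase_list_sketch {
    my ($list, $index, $ordered) = @_;
    return $list if $index < 0 || $index > $#$list;   # nothing to erase
    if ($ordered) {
        $list->[$index] = undef;
    } else {
        splice(@$list, $index, 1);
    }
    return $list;
}

my $o = erase_list_sketch([ qw(alpha beta gamma delta) ], 1, 1);
my $u = erase_list_sketch([ qw(alpha beta gamma delta) ], 1, 0);
```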
check_structure method
You can use the set_structure routine to enforce structure. For
example, if you want an NDS to be a hash, and in that hash are two
keys "hu" whose value is a uniform hash, and "ul" whose value is an
unordered list, use the following:
$obj->set_structure("type","hash","/hu");
$obj->set_structure("uniform",1,"/hu");
$obj->set_structure("type","list","/ul");
$obj->set_structure("ordered",0,"/ul");
To check a structure to see if it fits this structure, use the
check_structure method:
$a = { "hu" => { "h1" => "h1v" } };
$obj->check_structure($a,1);
$b = { "hu" => [ 1, 2 ] };
$obj->check_structure($b,1);
You can also add structural information by passing in an NDS that
goes beyond whatever structure you have defined with set_structure.
Additional structure will be determined from that structure IF you
pass in a non-null value as the second argument. If no second
argument is passed in (or a null value is passed in), the NDS being
checked must have only the structure that has already been defined.
For example:
$b = { "ul" => [ { "aa" => 11 } ] };
$obj->check_structure($b,0);
$b = { "ul" => [ { "aa" => 11 } ] };
$obj->check_structure($b,1);
In the first instance, the check_structure function returns an error
code since the structure passed in contains structure that was not
defined in the set_structure calls above.
In the second instance, the added structure is examined and
additional structural information is determined.
Since "ul" is defined as an unordered (and therefore uniform) list,
all of its members must have the same structure. They are set to hashes based
on the above check_structure call, so the following will fail since
it tries to set them to scalars:
$c = { "ul" => [ "foo" ] };
$obj->check_structure($c,1);
set_merge, get_merge methods
To set the default merge method for a hash to be "keep" (see above
for description of the various merge methods):
$obj->set_merge("merge_hash","keep");
To set the merge method for a single element in an NDS, use
something like the following:
$err = $obj->set_structure("type","hash","/h");
$obj->set_merge("merge","/h","keep");
The get_merge method can be used to query the type of merge that is
done for a path:
$obj->get_merge("/h");
=> keep
identical, contains methods
$a = { "a" => "foo",
"b" => "bar",
"c" => "baz" };
$b = { "a" => "foo",
"b" => "bar",
"c" => "baz" };
$obj->identical($a,$b,1);
=> 1
$obj->contains($a,$b,1);
=> 1
$c = { "a" => "foo",
"c" => "baz" };
$obj->identical($a,$c,1);
=> 0
$obj->contains($a,$c,1);
=> 1
When looking at unordered lists, elements do not need to be in the
same order:
$a = [ qw(a b c) ];
$b = [ qw(b a c) ];
$obj->identical($a,$b,1);
=> 1
Unordered lists can contain unordered lists and they still work:
$a = [ [ qw(a b c) ], [ qw(d e f) ], [ qw(g h i) ] ];
$b = [ [ qw(d e f) ], [ qw(a b c) ], [ qw(i g h) ] ];
$obj->identical($a,$b,1);
=> 1
This works regardless of the number of unordered lists and the
intermediate structure (for example: unordered list of hashes
pointing to unordered lists). Every time an unordered list is
encountered, every possible combination will be tried. This can be
very slow, so care should be exercised in comparing structures
containing unordered lists.
merge method
Merging hashes using keep, replace, and merge:
$obj->set_merge("merge_hash","keep");
$a = { "a" => 1,
"b" => 2 };
$b = { "a" => 3,
"c" => 4 };
$obj->merge($a,$b,1);
=> $a = { a => 1,
b => 2 }
$obj->set_merge("merge_hash","replace");
$a = { "a" => 1,
"b" => 2 };
$b = { "a" => 3,
"c" => 4 };
$obj->merge($a,$b,1);
=> $a = { a => 3,
c => 4 }
$obj->set_merge("merge_hash","merge");
$a = { "a" => 1,
"b" => 2 };
$b = { "a" => 3,
"c" => 4 };
$obj->merge($a,$b,1);
=> $a = { a => 1,
b => 2,
c => 4 }
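For flat hashes of scalars, the three outcomes above can be reproduced with a small plain Perl sketch. It is not the module's merge code: the helper name merge_hash_sketch is hypothetical, it returns a new hash instead of changing the first one in place, and its "merge" case does not recurse into nested values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): combine two flat hashes.
#   keep    - the first hash wins outright
#   replace - the second hash wins outright
#   merge   - keys are combined; for shared keys the first value is kept
sub merge_hash_sketch {
    my ($method, $h1, $h2) = @_;
    return +{ %$h1 }       if $method eq 'keep';
    return +{ %$h2 }       if $method eq 'replace';
    return +{ %$h2, %$h1 } if $method eq 'merge';
    die "unknown merge method: $method\n";
}

my $first   = { a => 1, b => 2 };
my $second  = { a => 3, c => 4 };
my $kept    = merge_hash_sketch('keep',    $first, $second);
my $swapped = merge_hash_sketch('replace', $first, $second);
my $merged  = merge_hash_sketch('merge',   $first, $second);
```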
Merging unordered lists using keep, replace, and append:
$obj->set_structure("ordered",0);
$obj->set_merge("merge_ul","keep");
$a = [ qw(a b c) ];
$b = [ qw(d e f) ];
$obj->merge($a,$b,1);
=> $a = [ a b c ]
$obj->set_structure("ordered",0);
$obj->set_merge("merge_ul","replace");
$a = [ qw(a b c) ];
$b = [ qw(d e f) ];
$obj->merge($a,$b,1);
=> $a = [ d e f ]
$obj->set_structure("ordered",0);
$obj->set_merge("merge_ul","append");
$a = [ qw(a b c) ];
$b = [ qw(d e f) ];
$obj->merge($a,$b,1);
=> $a = [ a b c d e f ]
Merging ordered lists using keep, replace, and merge:
$obj->set_structure("ordered",1);
$obj->set_merge("merge_ol","keep");
$a = [ "a", "", "b" ];
$b = [ "c", "d", "" ];
$obj->merge($a,$b,1);
=> $a = [ a '' b ]
$obj->set_structure("ordered",1);
$obj->set_merge("merge_ol","replace");
$a = [ "a", "", "b" ];
$b = [ "c", "d", "" ];
$obj->merge($a,$b,1);
=> $a = [ c d '' ]
$obj->set_structure("ordered",1);
$obj->set_merge("merge_ol","merge");
$a = [ "a", "", "b" ];
$b = [ "c", "d", "" ];
$obj->merge($a,$b,1);
=> $a = [ a d b ]
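The element-by-element behaviour of the "merge" case above can be sketched in plain Perl. This is a guess at the rule that reproduces the example, not the module's code: the helper name merge_ol_sketch is hypothetical, and it assumes that an element which is undef or the empty string counts as unfilled.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Nested): merge two ordered lists
# of scalars position by position. A position in the first list is kept
# unless it is unfilled, in which case the second list supplies the value.
sub merge_ol_sketch {
    my ($l1, $l2) = @_;
    my $last = $#$l1 > $#$l2 ? $#$l1 : $#$l2;
    my @out;
    for my $i (0 .. $last) {
        my $v1     = $l1->[$i];
        my $filled = defined $v1 && $v1 ne '';
        $out[$i]   = $filled ? $v1 : $l2->[$i];
    }
    return \@out;
}

my $result = merge_ol_sketch([ "a", "", "b" ], [ "c", "d", "" ]);
```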
A more complex example. Given structures consisting of ordered lists
of hashes, merge them recursively.
$a = [ { "a" => 1,
"b" => 2 },
{ "c" => 3 },
{},
{ "d" => 4,
"e" => 5 } ];
$b = [ { "a" => 11,
"w" => 22 },
{},
{ "x" => 33 },
{ "d" => 44 } ];
$obj->set_structure("type", "list", "/");
$obj->set_structure("ordered", 1, "/");
$obj->set_structure("type", "hash", "/*");
$obj->set_merge("merge", "/", "merge");
$obj->set_merge("merge", "/*", "merge");
$obj->merge($a,$b,1);
=> $a = [ { a => 1, b => 2, w => 22 },
{ c => 3 },
{ x => 33 },
{ d => 4, e => 5 } ]
merge_path method
merge_path is very similar to merge except that it merges a value
into a full NDS starting at a specific path. For example:
$a = { "a" => [ 1,2,3 ],
"b" => [ 4,5,6 ] };
$obj->merge_path($a,[7,8,9],"/c",1);
=> $a = { a => [ 1 2 3 ],
b => [ 4 5 6 ],
c => [ 7 8 9 ] };
which method
$nds = { "b" => "foo",
"c" => [ "c1", "c2" ],
"d" => { "d1k" => "d1v", "d2k" => "d2v" },
};
You can get the paths to all scalars:
%p = $obj->which($nds);
=> %p = ( /b => foo
/c/0 => c1
/c/1 => c2
/d/d1k => d1v
/d/d2k => d2v )
For a subset of scalars:
%p = $obj->which($nds,"c2","d1v");
=> %p = ( /c/1 => c2
/d/d1k => d1v )
For a set that matches regular expressions:
%p = $obj->which($nds,qr/^c/);
=> %p = ( /c/0 => c1
/c/1 => c2 )
using rulesets
Rulesets are powerful tools for determining how you merge data
structures.
There are four very common uses of rulesets. They are so commonly
used that pre-existing rulesets have been defined for them, but any
number of other rulesets may also be defined.
The "replace" ruleset may be used to set the structure stored at a
path overriding any value currently there (but it will NOT replace
structural information, so it can't be used to redefine what
constitutes a valid structure).
$a = { "a" => [ 1,2,3 ],
"b" => [ 4,5,6 ] };
$obj->merge_path($a,[7,8,9],"/b","replace",1);
=> $a = { a => [ 1 2 3 ],
b => [ 7 8 9 ] }
The "keep" ruleset will set the structure only if it isn't already
set.
$a = { "a" => [ 1,2,3 ],
"b" => [ 4,5,6 ] };
$obj->merge_path($a,[7,8,9],"/b","keep",1);
=> $a = { a => [ 1 2 3 ],
b => [ 4 5 6 ] }
The "default" ruleset will set defaults for a structure.
$a = { "a" => 1,
"b" => 2 };
$d = { "a" => 11,
"b" => 22,
"c" => 33 };
$obj->merge($a,$d,"default",1);
=> $a = { a => 1,
b => 2,
c => 33 }
The "override" ruleset will recursively override all values.
$a = { "a" => 1,
"b" => 2 };
$d = { "a" => 11,
"c" => 33 };
$obj->merge($a,$d,"override",1);
=> $a = { a => 11,
b => 2,
c => 33 }
BACKWARDS INCOMPATIBILITIES
3.11
Renamed the module.
The original name of the module was Data::NDS. When I tried to
register that name with the perl module list, they felt that using
an acronym (NDS) did not make the module's purpose clear, and
requested that I give it a name that made it clear what the module
did.
After some discussion, Data::Nested was chosen.
Data::Nested is completely backward compatible with Data::NDS, and
switching Data::NDS to Data::Nested everywhere it appears is the
only change necessary.
3.00
The structure method was removed and replaced with a no_structure
method.
The handling of values which are the empty string is now consistent,
but not completely backwards compatible.
Added the err and errmsg functions, and changed the return values of
almost all of the functions (errors are no longer returned).
The word "array" was changed to "list" everywhere.
1.01
The keys and values methods now only return non-empty elements.
1.04
When working with an NDS, sometimes operations were performed on the
actual structure, sometimes on copies of the structure. It is now
documented which is which (and some behaviors were changed to be
more consistent).
BUGS AND QUESTIONS
If you find a bug in this module, please send it directly to me (see the
AUTHOR section below). Alternately, you can submit it on CPAN. This can
be done at the following URL:
http://rt.cpan.org/Public/Dist/Display.html?Name=Data-NDS
Please do not use other means to report bugs (such as usenet newsgroups,
or forums for a specific OS or linux distribution) as it is impossible
for me to keep up with all of them.
When filing a bug report, please include the following information:
* The version of the module you are using. You can get this by using
the script:
use Data::Nested;
$obj = new Data::Nested;
print $obj->version(),"\n";
* The output from "perl -V"
If you have a problem using the module that perhaps isn't a bug (can't
figure out the syntax, etc.), you're in the right place. Go right back
to the top of this manual and start reading. If this still doesn't
answer your question, mail me directly.
KNOWN PROBLEMS
None at this point.
SEE ALSO
perlreftut - Perl references short introduction
perldsc - Perl data structures intro
perllol - Perl data structures: arrays of arrays
perldata - Perl data structures
LICENSE
This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
AUTHOR
Sullivan Beck (sbeck@cpan.org)