Data::Seek - Search Complex Data Structures
version 0.09
use Data::Seek; my $hash = {...}; my $seeker = Data::Seek->new(data => $hash); my $result = $seeker->search(...); my $data = $result->data;
Data::Seek is used for querying complex data structures. This module allows you to select and return specific node(s) in a hierarchical data structure using a simple and intuitive query syntax. The results can be returned as a list of values, or as a hash object in the same shape as the original.
During the processing of flattening a data structure with nested data, the following data structure would be converted into a collection of endpoint/value pairs.
{ 'id' => 12345, 'patient' => { 'name' => { 'first' => 'Bob', 'last' => 'Bee' } }, 'medications' => [{ 'aceInhibitors' => [{ 'name' => 'lisinopril', 'strength' => '10 mg Tab', 'dose' => '1 tab', 'route' => 'PO', 'sig' => 'daily', 'pillCount' => '#90', 'refills' => 'Refill 3' }], 'antianginal' => [{ 'name' => 'nitroglycerin', 'strength' => '0.4 mg Sublingual Tab', 'dose' => '1 tab', 'route' => 'SL', 'sig' => 'q15min PRN', 'pillCount' => '#30', 'refills' => 'Refill 1' }], }] }
Given the aforementioned data structure, the following would be the resulting flattened structure comprised of endpoint/value pairs.
{ 'id' => 12345, 'medications:0.aceInhibitors:0.dose' => '1 tab', 'medications:0.aceInhibitors:0.name' => 'lisinopril', 'medications:0.aceInhibitors:0.pillCount' => '#90', 'medications:0.aceInhibitors:0.refills' => 'Refill 3', 'medications:0.aceInhibitors:0.route' => 'PO', 'medications:0.aceInhibitors:0.sig' => 'daily', 'medications:0.aceInhibitors:0.strength' => '10 mg Tab', 'medications:0.antianginal:0.dose' => '1 tab', 'medications:0.antianginal:0.name' => 'nitroglycerin', 'medications:0.antianginal:0.pillCount' => '#30', 'medications:0.antianginal:0.refills' => 'Refill 1', 'medications:0.antianginal:0.route' => 'SL', 'medications:0.antianginal:0.sig' => 'q15min PRN', 'medications:0.antianginal:0.strength' => '0.4 mg Sublingual Tab', 'patient.name.first' => 'Bob' 'patient.name.last' => 'Bee', }
This structure provides the endpoint strings which will be matched against using the querying strategy.
During the processing of querying the data structure, the criteria (query expressions) are converted into a series of regular expressions to be applied sequentially, filtering/reducing the endpoints and producing a data set of matching nodes or throwing an exception explaining the search failure.
Node Expression
my $result = $seeker->search(...); # given "id" { id => 12345 }
The node expression is a part of a criterion, which preforms an exact match against a node in the data structure. It is a string which can contain letters, numbers, and/or underscores.
Step Expression
my $result = $seeker->search(...); # given "patient.name.first" { patient => { name => { first => "Bob" } } } # given "patient.name.last" { patient => { name => { last => "Bee" } } }
The step expression is a criterion, or part of a criterion, made up of one or more node expressions separated using the period character, which matches against nodes in the data structure. It is a string which can contain letters, numbers, and/or underscores, separated using periods.
Index Expression
my $result = $seeker->search(...); # given "medications:0.aceInhibitors:0.dose" { medications => [{ aceInhibitors => [{ dose => "1 tab" }] }] } # given "medications:0.aceInhibitors:0.name" { medications => [{ aceInhibitors => [{ name => "lisinopril" }] }], } # given "medications:0.aceInhibitors:0.pillCount" { medications => [{ aceInhibitors => [{ pillCount => "#90" }] }] }
The index expression is a criterion, or part of a criterion, having a node expressions suffixed with a colon followed by a number denoting that it should only match an array which has an index corresponding to the numeric portion of the suffix. It is a string which can contain letters, numbers, and/or underscores, suffixed with a semi-colon followed by a number.
Iterator Expression
my $result = $seeker->search(...); # given "@medications.@aceInhibitors.dose" { medications => [{ aceInhibitors => [{ dose => "1 tab" }] }] } # given "@medications.@aceInhibitors.name" { medications => [{ aceInhibitors => [{ name => "lisinopril" }] }], } # given "@medications.@aceInhibitors.pillCount" { medications => [{ aceInhibitors => [{ pillCount => "#90" }] }] }
The iteration expression is a criterion, or part of a criterion, having a node expressions preceded by an "at" character denoting that the node expression should match all nodes in the data structure which are mapped to array objects. It is a string which can contain letters, numbers, and/or underscores, preceded by a single ampersand character.
Wildcard Expression
my $result = $seeker->search(...); # given "*" { id => 12345 } # given "*.*.first" { patient => { name => { first => "Bob" } } } # given "*.*.last" { patient => { name => { last => "Bee" } } } # given "patient.*.first" { patient => { name => { first => "Bob" } } } # given "patient.*.last" { patient => { name => { last => "Bee" } } } # given "@*.@*.pillCount" { medications => [{ aceInhibitors => [{ pillCount => "#90" }], antianginal => [{ pillCount => "#30" }], }], }
The wildcard expression is a criterion, or part of a criterion, which matches against a single node having a single "star" character match and represent one node expression. It is a string which can contain letters, numbers, underscores, and/or a single star character.
Greedy-Wildcard Expression
my $result = $seeker->search(...); # given "**.first" { patient => { name => { first => "Bob" } } } # given "**.last" { patient => { name => { last => "Bee" } } } # given "patient.**" { patient => { name => { first => "Bob", last => "Bee" } } } # given "medications**.pillCount" { medications => [{ aceInhibitors => [{ pillCount => "#90" }], antianginal => [{ pillCount => "#30" }], }], }
The greedy-wildcard expression is a criterion, or part of a criterion, which matches against any multitude of nodes having a double "star" character match and represent one or more of any character. It is a string which can contain letters, numbers, underscores, and/or a double star character.
$seeker->data; $seeker->data({...});
The data structure to be introspected, must be a hash reference, which is coerced into a Data::Object::Hash object.
$seeker->ignore; $seeker->ignore(1);
Bypass exceptions thrown when a criterion is invalid or no data matches can be found. This attribute must be an integer, which is coerced into a Data::Object::Integer object.
my $search = $seeker->search('id', 'person.name.*');
Prepare a search object to use the supplied criteria and return a search object. Introspection is triggered when the result method is enacted. See Data::Seek::Search for usage information.
The follow is a short and simple overview of the strategy and syntax used by Data::Seek to query complex data structures. The overall idea behind Data::Seek is to flatten/fold the data structure, reduce it by applying a series patterns, then, unflatten/unfold and operate on the new data structure. The introspection strategy is to flatten the data structure producing a non-hierarchical data structure where its keys represent endpoints (using dot-notation and colons to separate (and denote) nested hash keys and array indices respectively) within the structure.
Al Newkirk <anewkirk@ana.io>
This software is copyright (c) 2014 by Al Newkirk.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Data::Seek, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Data::Seek
CPAN shell
perl -MCPAN -e shell install Data::Seek
For more information on module installation, please visit the detailed CPAN module installation guide.