Data::Seek::Concepts - Data::Seek Concepts
version 0.06
This document contains a simple overview of the strategy and syntax used by Data::Seek to query complex data strictures. The overall idea behind Data::Seek is to flatten/fold the data structure once, then reduce it by applying a series patterns.
The first phase in the Data::Seek introspection strategy is to flatten the data structure using Hash::Flatten, producing a non-hierarchical data structure where it's keys represent endpoints within the structure.
During the processing of flattening a data structure with nested data, the following data structure would be converted into a collection of endpoint/value pairs.
{ 'id' => 12345, 'patient' => { 'name' => { 'first' => 'Bob', 'last' => 'Bee' } }, 'medications' => [{ 'aceInhibitors' => [{ 'name' => 'lisinopril', 'strength' => '10 mg Tab', 'dose' => '1 tab', 'route' => 'PO', 'sig' => 'daily', 'pillCount' => '#90', 'refills' => 'Refill 3' }], 'antianginal' => [{ 'name' => 'nitroglycerin', 'strength' => '0.4 mg Sublingual Tab', 'dose' => '1 tab', 'route' => 'SL', 'sig' => 'q15min PRN', 'pillCount' => '#30', 'refills' => 'Refill 1' }], }] }
Given the aforementioned data structure, the following would be the resulting flattened structure comprised of endpoint/value pairs.
{ 'id' => 12345, 'medications:0.aceInhibitors:0.dose' => '1 tab', 'medications:0.aceInhibitors:0.name' => 'lisinopril', 'medications:0.aceInhibitors:0.pillCount' => '#90', 'medications:0.aceInhibitors:0.refills' => 'Refill 3', 'medications:0.aceInhibitors:0.route' => 'PO', 'medications:0.aceInhibitors:0.sig' => 'daily', 'medications:0.aceInhibitors:0.strength' => '10 mg Tab', 'medications:0.antianginal:0.dose' => '1 tab', 'medications:0.antianginal:0.name' => 'nitroglycerin', 'medications:0.antianginal:0.pillCount' => '#30', 'medications:0.antianginal:0.refills' => 'Refill 1', 'medications:0.antianginal:0.route' => 'SL', 'medications:0.antianginal:0.sig' => 'q15min PRN', 'medications:0.antianginal:0.strength' => '0.4 mg Sublingual Tab', 'patient.name.first' => 'Bob' 'patient.name.last' => 'Bee', }
This structure provides the endpoint strings which will be matched against using the querying strategy.
The second phase in the Data::Seek introspection strategy is to convert a criterion into a series of regular expressions to be sequentially applied, filtering/reducing the endpoints i.e. the keys of flatten data stricture using Data::Seek::Search, producing a data set of matching nodes or throwing an exception explaining the search failure.
id patient medications
The node expression is a criterion, or part of a criterion, which matches against a single node. It is a string which can contain letters, numbers, and/or underscores.
patient.name patient.name.first patient.name.last
The step expression is a criterion, or part of a criterion, made up of two or more node expressions separated using the period character, which matches against a nested nodes. It is a string which can contain letters, numbers, and/or underscores, separated using periods.
medications:0 medications:0.antianginal medications:0.antianginal:0.name
The index expression is a criterion, or part of a criterion, having a node expressions suffixed with a semi-colon followed by a number denoting that it should only match an array which has an index corresponding to the numeric portion of the suffix. It is a string which can contain letters, numbers, and/or underscores, suffixed with a semi-colon followed by a number.
medications.@ medications.@.antianginal medications.@.antianginal.@.name
The iteration expression is a criterion, or part of a criterion, having a node expressions immediately followed in-step with an "at" character (ampersand) serving as a succeeding node expression suffixed denoting that it should match all elements of all matching arrays. It is a string which can contain letters, numbers, and/or underscores, followed in-step with a node expression whose string is a single ampersand character.
* *.*.first *.*.first patient.*.first patient.*.last
The wildcard expression is a criterion, or part of a criterion, which matches against a single node having a single "star" character match and represent one or more non-period characters. It is a string which can contain letters, numbers, underscores, and/or a single star character.
** patient.** *.@.**
The greedy-wildcard expression is a criterion, or part of a criterion, which matches against any multitude of nodes having a double "star" character match and represent zero or more of any character. It is a string which can contain letters, numbers, underscores, and/or a double star character.
Al Newkirk <anewkirk@ana.io>
This software is copyright (c) 2014 by Al Newkirk.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Data::Seek, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Data::Seek
CPAN shell
perl -MCPAN -e shell install Data::Seek
For more information on module installation, please visit the detailed CPAN module installation guide.