
NAME

Data::Seek::Concepts - Data::Seek Concepts

VERSION

version 0.06

OVERVIEW

This document contains a simple overview of the strategy and syntax used by Data::Seek to query complex data structures. The overall idea behind Data::Seek is to flatten/fold the data structure once, then reduce it by applying a series of patterns.

FLATTENING

The first phase in the Data::Seek introspection strategy is to flatten the data structure using Hash::Flatten, producing a non-hierarchical data structure whose keys represent endpoints within the original structure.

Encoding

When a data structure with nested data is flattened, it is converted into a collection of endpoint/value pairs. Consider the following data structure.

    {
        'id' => 12345,
        'patient' => {
            'name' => {
                'first' => 'Bob',
                'last'  => 'Bee'
            }
        },
        'medications' => [{
            'aceInhibitors' => [{
                'name'      => 'lisinopril',
                'strength'  => '10 mg Tab',
                'dose'      => '1 tab',
                'route'     => 'PO',
                'sig'       => 'daily',
                'pillCount' => '#90',
                'refills'   => 'Refill 3'
            }],
            'antianginal' => [{
                'name'      => 'nitroglycerin',
                'strength'  => '0.4 mg Sublingual Tab',
                'dose'      => '1 tab',
                'route'     => 'SL',
                'sig'       => 'q15min PRN',
                'pillCount' => '#30',
                'refills'   => 'Refill 1'
            }],
        }]
    }

Given the data structure above, the resulting flattened structure, made up of endpoint/value pairs, would be as follows.

    {
        'id' => 12345,
        'medications:0.aceInhibitors:0.dose' => '1 tab',
        'medications:0.aceInhibitors:0.name' => 'lisinopril',
        'medications:0.aceInhibitors:0.pillCount' => '#90',
        'medications:0.aceInhibitors:0.refills' => 'Refill 3',
        'medications:0.aceInhibitors:0.route' => 'PO',
        'medications:0.aceInhibitors:0.sig' => 'daily',
        'medications:0.aceInhibitors:0.strength' => '10 mg Tab',
        'medications:0.antianginal:0.dose' => '1 tab',
        'medications:0.antianginal:0.name' => 'nitroglycerin',
        'medications:0.antianginal:0.pillCount' => '#30',
        'medications:0.antianginal:0.refills' => 'Refill 1',
        'medications:0.antianginal:0.route' => 'SL',
        'medications:0.antianginal:0.sig' => 'q15min PRN',
        'medications:0.antianginal:0.strength' => '0.4 mg Sublingual Tab',
        'patient.name.first' => 'Bob',
        'patient.name.last' => 'Bee',
    }

This structure provides the endpoint strings that the querying strategy matches against.
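The encoding shown above can be sketched in plain Perl. This is a minimal illustration, not the code Hash::Flatten actually runs: the helper name flatten_data is hypothetical, and the sketch assumes the top level is a hash, that hash keys are joined with a period, and that array indices are joined with a colon, as in the example output.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Seek or Hash::Flatten): recursively
# flatten nested hashes and arrays into endpoint/value pairs. Hash keys are
# joined with '.' and array indices with ':'.
sub flatten_data {
    my ($node, $prefix, $out) = @_;
    $out //= {};

    if (ref($node) eq 'HASH') {
        for my $key (sort keys %$node) {
            my $path = defined $prefix ? "${prefix}.${key}" : $key;
            flatten_data($node->{$key}, $path, $out);
        }
    }
    elsif (ref($node) eq 'ARRAY') {
        # Assumes arrays never appear at the top level, so $prefix is defined.
        for my $index (0 .. $#$node) {
            flatten_data($node->[$index], "${prefix}:${index}", $out);
        }
    }
    else {
        $out->{$prefix} = $node;    # scalar leaf: record the endpoint
    }

    return $out;
}

my $flat = flatten_data({
    id          => 12345,
    patient     => { name => { first => 'Bob', last => 'Bee' } },
    medications => [{
        antianginal => [{ name => 'nitroglycerin', route => 'SL' }],
    }],
});

print "$_ => $flat->{$_}\n" for sort keys %$flat;
```

Every key in the result is a complete path from the root to a scalar value, which is what makes plain string matching sufficient in the querying phase.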

QUERYING

The second phase in the Data::Seek introspection strategy is to convert a criterion into a series of regular expressions which are applied sequentially by Data::Seek::Search to filter/reduce the endpoints, i.e. the keys of the flattened data structure. The result is either a data set of matching nodes or an exception explaining why the search failed.
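The sequential reduction described above might look like the following sketch. It is an assumption-laden illustration rather than the Data::Seek::Search implementation: the helper name reduce_endpoints, the hand-written patterns, and the wording of the error message are all hypothetical.

```perl
use strict;
use warnings;

my @endpoints = (
    'id',
    'medications:0.aceInhibitors:0.dose',
    'medications:0.aceInhibitors:0.name',
    'patient.name.first',
);

# Hypothetical helper: apply each pattern in turn, keeping only the endpoints
# that still match, and die as soon as a pattern leaves nothing behind.
sub reduce_endpoints {
    my ($endpoints, @patterns) = @_;
    my @remaining = @$endpoints;

    for my $pattern (@patterns) {
        @remaining = grep { $_ =~ $pattern } @remaining;
        die "no endpoints match $pattern\n" unless @remaining;
    }

    return @remaining;
}

my @kept = reduce_endpoints(\@endpoints, qr/\Amedications/, qr/name\z/);
print "$_\n" for @kept;
```

Because each pattern only ever sees the survivors of the previous one, the order of the patterns affects where a failure is reported, but not the final set of matches.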

Node Expression

    id
    patient
    medications

The node expression is a criterion, or part of a criterion, which matches against a single node. It is a string which can contain letters, numbers, and/or underscores.

Step Expression

    patient.name
    patient.name.first
    patient.name.last

The step expression is a criterion, or part of a criterion, made up of two or more node expressions separated using the period character, which matches against nested nodes. It is a string which can contain letters, numbers, and/or underscores, separated using periods.
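As a rough illustration of node and step expressions, the sketch below selects endpoints from a flattened hash. It assumes that a criterion selects every endpoint at or beneath the node it names; that prefix rule and the helper name select_endpoints are assumptions of this example, not documented Data::Seek behaviour.

```perl
use strict;
use warnings;

my %flat = (
    'id'                 => 12345,
    'patient.name.first' => 'Bob',
    'patient.name.last'  => 'Bee',
);

# Hypothetical helper: return the endpoints at or beneath the node named by a
# node or step expression. The criterion must end on a node boundary, i.e. be
# followed by '.', ':' or the end of the endpoint string.
sub select_endpoints {
    my ($criterion, $flat) = @_;
    my $pattern = quotemeta $criterion;    # treat '.' literally
    my @hits = grep { /\A$pattern(?:[.:]|\z)/ } keys %$flat;
    @hits = sort @hits;
    return @hits;
}

print join(', ', select_endpoints('patient.name', \%flat)), "\n";
```

The boundary check keeps a partial node name such as pat from matching the patient endpoints.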

Index Expression

    medications:0
    medications:0.antianginal
    medications:0.antianginal:0.name

The index expression is a criterion, or part of a criterion, having a node expression suffixed with a colon followed by a number, denoting that it should only match the array element whose index corresponds to the numeric portion of the suffix. It is a string which can contain letters, numbers, and/or underscores, suffixed with a colon followed by a number.

Iterator Expression

    medications.@
    medications.@.antianginal
    medications.@.antianginal.@.name

The iterator expression is a criterion, or part of a criterion, having a node expression immediately followed in-step by an "at" character (@) serving as a succeeding node expression, denoting that it should match all elements of all matching arrays. It is a string which can contain letters, numbers, and/or underscores, followed in-step by a node expression whose string is a single "at" character.
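To show how index and iterator expressions relate to the flattened keys, here is a small sketch that turns a criterion into a regular expression. The helper name criterion_to_regexp is hypothetical, and the translation rule (each ".@" becomes a colon followed by any number) is an assumption derived from the flattened key format shown earlier, not the actual Data::Seek::Search code.

```perl
use strict;
use warnings;

my @endpoints = (
    'medications:0.aceInhibitors:0.name',
    'medications:0.antianginal:0.name',
    'medications:0.antianginal:0.route',
    'patient.name.first',
);

# Hypothetical helper: build a regular expression from a criterion made of
# node, index, and iterator expressions.
sub criterion_to_regexp {
    my ($criterion) = @_;
    my $pattern = quotemeta $criterion;    # escapes '.', ':' and '@'
    $pattern =~ s/\\\.\\\@/:\\d+/g;        # '.@' stands for any array index
    return qr/\A$pattern(?:[.:]|\z)/;      # stop on a node boundary
}

my $iterator = criterion_to_regexp('medications.@.antianginal.@.name');
print "$_\n" for grep { $_ =~ $iterator } @endpoints;
```

An index expression such as medications:0.antianginal needs no translation beyond escaping, because the flattened keys already spell out array indices in the same colon notation.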

Wildcard Expression

    *
    *.*.first
    patient.*.first
    patient.*.last

The wildcard expression is a criterion, or part of a criterion, which matches against a single node; a single "star" character matches and represents one or more non-period characters. It is a string which can contain letters, numbers, underscores, and/or a single star character.

Greedy-Wildcard Expression

    **
    patient.**
    *.@.**

The greedy-wildcard expression is a criterion, or part of a criterion, which matches against any number of nodes; a double "star" character matches and represents zero or more of any character. It is a string which can contain letters, numbers, underscores, and/or a double star character.
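The two wildcard forms can be illustrated with one more sketch. It follows the definitions given above (a single star is one or more non-period characters, a double star is zero or more of any character), but the helper name wildcard_to_regexp and the exact substitutions are assumptions of this example rather than the library's own implementation.

```perl
use strict;
use warnings;

my @endpoints = (
    'id',
    'medications:0.antianginal:0.name',
    'patient.name.first',
    'patient.name.last',
);

# Hypothetical helper: translate wildcard and greedy-wildcard expressions
# into a regular expression.
sub wildcard_to_regexp {
    my ($criterion) = @_;
    my $pattern = quotemeta $criterion;    # '*' becomes '\*'
    $pattern =~ s/\\\*\\\*/.*/g;           # '**' => zero or more of any character
    $pattern =~ s/\\\*/[^.]+/g;            # '*'  => one or more non-period characters
    return qr/\A$pattern(?:[.:]|\z)/;
}

for my $criterion ('patient.*.first', 'patient.**', '**') {
    my $regexp = wildcard_to_regexp($criterion);
    my @hits   = grep { $_ =~ $regexp } @endpoints;
    print "$criterion: @hits\n";
}
```

The double star is replaced before the single star so that the second substitution cannot split a "**" into two single wildcards.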

AUTHOR

Al Newkirk <anewkirk@ana.io>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Al Newkirk.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.