Array::To::Moose - Build Moose objects from a data array
This document describes Array::To::Moose version 0.0.9
    use Array::To::Moose;
    # or
    use Array::To::Moose qw(array_to_moose set_class_ind set_key_ind
                            throw_nonunique_keys throw_multiple_rows);
Array::To::Moose exports function array_to_moose() by default, and convenience functions set_class_ind(), set_key_ind(), throw_nonunique_keys() and throw_multiple_rows() if requested.
array_to_moose() builds Moose objects from suitably-sorted 2-dimensional arrays of data of the type returned by, e.g., DBI::selectall_arrayref() i.e. a reference to an array containing references to an array for each row of data fetched.
    package Car;
    use Moose;

    has 'make'  => (is => 'ro', isa => 'Str');
    has 'model' => (is => 'ro', isa => 'Str');
    has 'year'  => (is => 'ro', isa => 'Int');

    package CarOwner;
    use Moose;

    has 'last'  => (is => 'ro', isa => 'Str');
    has 'first' => (is => 'ro', isa => 'Str');
    has 'Cars'  => (is => 'ro', isa => 'ArrayRef[Car]');

    ...

    # in package main:
    use Array::To::Moose;

    # In this dataset Alex owns two cars, Jim one, and Alice three
    my $data = [
        [ qw( Green Alex  Ford   Focus 2011 ) ],
        [ qw( Green Alex  VW     Jetta 2009 ) ],
        [ qw( Green Jim   Honda  Civic 2007 ) ],
        [ qw( Smith Alice Buick  Regal 2012 ) ],
        [ qw( Smith Alice Toyota Camry 2008 ) ],
        [ qw( Smith Alice BMW    X5    2010 ) ],
    ];

    my $CarOwners = array_to_moose(
        data => $data,
        desc => {
            class => 'CarOwner',
            last  => 0,
            first => 1,
            Cars  => {
                class => 'Car',
                make  => 2,
                model => 3,
                year  => 4,
            } # Cars
        } # Car Owners
    );

    print $CarOwners->[2]->Cars->[1]->model; # prints "Camry"
In the above example, array_to_moose() returns a reference to an array of CarOwner objects, $CarOwners.
If a hash of CarOwner objects is required, a "key =>... " entry must be added to the descriptor hash. For example, to construct a hash of CarOwner objects, whose key is the owner's first name, (unique for every person in the example data), the call becomes:
    my $CarOwnersH = array_to_moose(
        data => $data,
        desc => {
            class => 'CarOwner',
            key   => 1,             # note key
            last  => 0,
            first => 1,
            Cars  => {
                class => 'Car',
                make  => 2,
                model => 3,
                year  => 4,
            } # Cars
        } # Car Owners
    );

    print $CarOwnersH->{Alex}->Cars->[0]->make; # prints "Ford"
Similarly, to construct the Cars sub-objects as hash sub-objects (and not an array as above), define CarOwner as:
    package CarOwner;
    use Moose;

    has 'last'  => (is => 'ro', isa => 'Str' );
    has 'first' => (is => 'ro', isa => 'Str' );
    has 'Cars'  => (is => 'ro', isa => 'HashRef[Car]'); # Was 'ArrayRef[Car]'
and noting that the car make is unique for each person in the $data dataset, we construct the reference to an array of objects with the call:
    $CarOwners = array_to_moose(
        data => $data,
        desc => {
            class => 'CarOwner',
            last  => 0,
            first => 1,
            Cars  => {
                class => 'Car',
                key   => 2,         # note key
                model => 3,
                year  => 4,
            } # Cars
        } # Car Owners
    );

    print $CarOwners->[2]->Cars->{BMW}->model; # prints 'X5'
If, instead of the car owner object containing an ArrayRef or HashRef of Car sub-objects, it contains, say, an ArrayRef of strings representing the names of the car makers:
    package SimpleCarOwner;
    use Moose;

    has 'last'      => (is => 'ro', isa => 'Str' );
    has 'first'     => (is => 'ro', isa => 'Str' );
    has 'CarMakers' => (is => 'ro', isa => 'ArrayRef[Str]');
Using the same dataset from Example 1a, we construct an arrayref of SimpleCarOwner objects as:
    my $SimpleCarOwners = array_to_moose(
        data => $data,
        desc => {
            class     => 'SimpleCarOwner',
            last      => 0,
            first     => 1,
            CarMakers => [2],       # Note the '[...]' brackets
        }
    );

    print $SimpleCarOwners->[2]->CarMakers->[1]; # prints 'Toyota'
I.e., when the object attribute is an ArrayRef of one of the Moose "simple" types, e.g. 'Str', 'Num', 'Bool', etc (See Moose::Manual::Types), then the column number should appear in square brackets ('CarMakers => [2]' above) to differentiate them from the bare types (last => 0, and first => 1, above).
Note that Array::To::Moose doesn't (yet) handle the case of hashrefs of "simple" types, e.g., ( isa => "HashRef[Str]" )
The main rationale for writing Array::To::Moose is to make it easy to build Moose objects from data extracted from relational databases, especially when the database query involves multiple tables with one-to-many relationships to each other.
As an example, consider a database which models patients making visits to a clinic on multiple occasions, and on each visit, having a doctor run some tests and diagnose the patient's complaint. In this model, the database Patient table would have a one-to-many relationship with the Visit table, which in turn would have a one-to-many relationship with the Test table.
The corresponding Moose model has nested Moose objects which reflect those one-to-many relationships, i.e., multiple Visit objects per Patient object and multiple Test objects per Visit object, declared as:
    package Test;
    use Moose;

    has 'name'   => (is => 'rw', isa => 'Str');
    has 'result' => (is => 'rw', isa => 'Str');

    package Visit;
    use Moose;

    has 'date'      => (is => 'rw', isa => 'Str' );
    has 'md'        => (is => 'rw', isa => 'Str' );
    has 'diagnosis' => (is => 'rw', isa => 'Str' );
    has 'Tests'     => (is => 'rw', isa => 'HashRef[Test]' );

    package Patient;
    use Moose;

    has 'last'   => (is => 'rw', isa => 'Str' );
    has 'first'  => (is => 'rw', isa => 'Str' );
    has 'Visits' => (is => 'rw', isa => 'ArrayRef[Visit]' );
In the main program:
    use DBI;
    use Array::To::Moose;

    ...

    my $sql = q{
        SELECT
            P.Last, P.First
           ,V.Date, V.Doctor, V.Diagnosis
           ,T.Name, T.Result
        FROM
            Patient P
           ,Visit   V
           ,Test    T
        WHERE
            -- join clauses
            P.Patient_key = V.Patient_key
            AND V.Visit_key = T.Visit_key
            ...
        ORDER BY
            P.Last, P.First, V.Date
    };

    my $dbh = DBI->connect(...);

    my $data = $dbh->selectall_arrayref($sql);

    # rows of @$data contain:
    #     Last, First, Date, Doctor, Diagnosis, Name, Result
    # at positions:
    #     [0]   [1]    [2]   [3]     [4]        [5]   [6]

    my $patients = array_to_moose(
        data => $data,
        desc => {
            class  => 'Patient',
            last   => 0,
            first  => 1,
            Visits => {
                class     => 'Visit',
                date      => 2,
                md        => 3,
                diagnosis => 4,
                Tests     => {
                    class  => 'Test',
                    key    => 5,
                    name   => 5,
                    result => 6,
                } # tests
            } # visits
        } # patients
    );

    print $patients->[2]->Visits->[0]->Tests->{BP}->result; # prints '120/80'
Note: We used the Test name as the key for the Visit 'Tests', as the tests have unique names within any one Visit. (See t/5.t)
As shown in the above examples, the general usage is:
    package MyClass;
    use Moose;
    (define Moose object(s))
    ...
    use Array::To::Moose;
    ...
    my $data_ref = $dbh->selectall_arrayref($sql); # for example

    my $object_ref = array_to_moose(
        data => $data_ref,
        desc => {
            class     => 'MyClass',
            key       => K,         # only for HashRefs
            attrib_1  => N1,
            attrib_2  => N2,
            ...
            attrib_m  => [ M ],
            ...
            SubObject => {
                class => 'MySubClass',
                ...
            }
        }
    );
Where:
array_to_moose() returns an array- or hash reference of MyClass Moose objects. All Moose classes (MyClass, MySubClass, etc) must already have been defined by the user.
$data_ref is a reference to an array containing references to arrays of scalars of the kind returned by, e.g., DBI::selectall_arrayref()
desc (descriptor) is a reference to a hash which contains several types of data:
class => 'MyObj' is required and defines the Moose class or package which will contain the data. The user should have defined this class already.
key => N is required if the Moose object being constructed is to be a hashref, either at the top-level Moose object returned from array_to_moose() or as an "isa => 'HashRef[...]'" sub-object.
attrib => N where attrib is the name of a Moose attribute ("has 'attrib' => ...")
attrib => [ N ] where attrib is the name of a Moose "simple" sub-attribute ("has 'attrib' => ( isa => 'ArrayRef[Type]' ...)"), where Type is a "simple" Moose type, e.g., 'Str', 'Int', etc.
In the above cases, N is a non-negative integer giving the zero-indexed column number in the data array where that attribute's data is to be found.
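The way a descriptor maps attribute names onto column positions can be pictured with a small core-Perl sketch. It is not the module's implementation: the helper name row_to_args is hypothetical, and it only handles the plain "attrib => N" entries for a single row.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Array::To::Moose): turn the plain
# "attrib => N" entries of a descriptor into constructor arguments for one row.
sub row_to_args {
    my ($desc, $row) = @_;
    my %args;
    for my $attrib (keys %$desc) {
        next if $attrib eq 'class' or $attrib eq 'key';   # descriptor keywords
        next if ref $desc->{$attrib};                     # skip [N] and sub-objects
        $args{$attrib} = $row->[ $desc->{$attrib} ];      # N is a zero-indexed column
    }
    return %args;
}

my $desc = { class => 'CarOwner', last => 0, first => 1 };
my $row  = [ qw( Green Alex Ford Focus 2011 ) ];
my %args = row_to_args($desc, $row);

print "$args{first} $args{last}\n";   # prints "Alex Green"
```

The resulting hash is the kind of argument list that would be handed to the class's new() method.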
array_to_moose() can handle three types of Moose sub-objects, i.e.:
an array of sub-objects:
has 'Sub_Obj' => ( isa => 'ArrayRef[MyObj]' );
a hash of sub-objects:
has 'Sub_Obj' => ( isa => 'HashRef[MyObj]' );
or a single sub-object:
has 'Sub_Obj' => ( isa => 'MyObj' );
The descriptor entry for Sub_Obj in each of these cases is (almost) the same:
    desc => {
        class => ...
        ...
        Sub_Obj => {
            class    => 'MyObj',
            key      => <keycol>,   # HashRef[...] only
            attrib_a => <N>,
            ...
        } # end SubObj
        ...
    } # end desc

(A HashRef[...] sub-object will also require a key => N entry in the descriptor).
In addition, array_to_moose() can also handle ArrayRefs of "simple" types:
has 'Sub_Obj' => ( isa => 'ArrayRef[Type]' );
where Type is a "simple" Moose type, e.g., 'Str', 'Int', 'Bool', etc.
array_to_moose() does not sort the input data array, and does all processing in a single pass through the data. This means that the data in the array must be sorted properly for the algorithm to work.
For example, in the previous Patient/Visit/Test example, in which there are many Tests per Visit and many Visits per Patient, the data in the Test column(s) must change the fastest, the Visit data slower, and the Patient data the slowest:
    Patient Visit   Test
    ------- -----   ----
    P1      V1      T1
    P1      V1      T2
    P1      V1      T3
    P1      V2      T4
    P1      V2      T5
    P2      V3      T6
    P2      V3      T7
    P2      V4      T8
In SQL this would be accomplished by an ORDER BY clause, e.g.:
ORDER BY Patient.Key, Visit.Key, Test.Key
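Why the sort order matters for a single-pass algorithm can be seen in a minimal core-Perl sketch. The function group_consecutive is a hypothetical stand-in for the grouping step, not the code used by the module: it starts a new group whenever the value in the chosen column changes from one row to the next.

```perl
use strict;
use warnings;

# Hypothetical single-pass grouper: starts a new group whenever the value
# in column $col differs from the previous row's value.
sub group_consecutive {
    my ($rows, $col) = @_;
    my @groups;
    for my $row (@$rows) {
        if (@groups and $groups[-1][0][$col] eq $row->[$col]) {
            push @{ $groups[-1] }, $row;    # same value: extend current group
        }
        else {
            push @groups, [ $row ];         # value changed: start a new group
        }
    }
    return \@groups;
}

my $sorted   = [ [qw(P1 V1)], [qw(P1 V2)], [qw(P2 V3)] ];
my $unsorted = [ [qw(P1 V1)], [qw(P2 V3)], [qw(P1 V2)] ];

my $n_sorted   = @{ group_consecutive($sorted,   0) };
my $n_unsorted = @{ group_consecutive($unsorted, 0) };

print "sorted: $n_sorted groups, unsorted: $n_unsorted groups\n";
# prints "sorted: 2 groups, unsorted: 3 groups"
```

With unsorted input, patient P1 is split into two separate groups, which under this assumption would become two separate objects.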
By default, array_to_moose() does not check the uniqueness of hash key values within the data. If the key values in the data are not unique, existing hash entries will get overwritten, and the sub-object will contain the value from the last data row which contained that key value. For example:
    package Employer;
    use Moose;

    has 'year' => (is => 'rw', isa => 'Str');
    has 'name' => (is => 'rw', isa => 'Str');

    package Person;
    use Moose;

    has 'name'      => (is => 'rw', isa => 'Str' );
    has 'Employers' => (is => 'rw', isa => 'HashRef[Employer]');

    ...

    my $data = [
        [ 'Anne Miller', '2005', 'Acme Corp'    ],
        [ 'Anne Miller', '2006', 'Acme Corp'    ],
        [ 'Anne Miller', '2007', 'Widgets, Inc' ],
        ...
    ];
The call:
    my $obj = array_to_moose(
        data => $data,
        desc => {
            class     => 'Person',
            name      => 0,
            Employers => {
                class => 'Employer',
                key   => 2,         # using employer name as key
                year  => 1,
            } # Employer
        } # Person
    );
Because the employer was 'Acme Corp' in years 2005 & 2006, array_to_moose will silently overwrite the 2005 Employer object with the data for the 2006 Employer object:
print $obj->[0]->Employers->{'Acme Corp'}->year, "\n"; # prints '2006'
Calling throw_nonunique_keys() (either with no argument, or with a non-zero argument) enables reporting of non-unique keys. In the above example, array_to_moose() would then exit with the message:
Non-unique key 'Acme Corp' in 'Employer' class ...
Calling throw_nonunique_keys(0), i.e. with an argument of zero, will disable subsequent reporting of non-unique keys. (See t/8c.t)
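The overwriting behaviour is simply that of an ordinary Perl hash. The following core-Perl sketch uses the example rows above to show both the silent overwrite and one way a duplicate key could be noticed; it illustrates the idea only and does not use the module or reproduce its error handling.

```perl
use strict;
use warnings;

my @rows = (
    [ 'Anne Miller', '2005', 'Acme Corp'    ],
    [ 'Anne Miller', '2006', 'Acme Corp'    ],
    [ 'Anne Miller', '2007', 'Widgets, Inc' ],
);

# Build a hash keyed on employer name (column 2), remembering the year.
my (%year_for, @duplicates);
for my $row (@rows) {
    my ($year, $employer) = @{$row}[1, 2];
    push @duplicates, $employer if exists $year_for{$employer};
    $year_for{$employer} = $year;      # a later row overwrites an earlier one
}

print "$year_for{'Acme Corp'}\n";      # prints "2006"
print "duplicate key: $_\n" for @duplicates;
```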
For single-occurrence sub-objects (i.e. ( isa => 'MyObj' )), if the data contains more than one row of data for the sub-object, only the first row will be used to construct the single sub-object and array_to_moose() will not report the fact. E.g.:
    package Salary;
    use Moose;

    has 'year'   => (is => 'rw', isa => 'Str');
    has 'amount' => (is => 'rw', isa => 'Int');

    package Person;
    use Moose;

    has 'name'   => (is => 'rw', isa => 'Str' );
    has 'Salary' => (is => 'rw', isa => 'Salary'); # a single object

    ...

    my $data = [
        [ 'John Smith', '2005', 23_350 ],
        [ 'John Smith', '2006', 24_000 ],
        [ 'John Smith', '2007', 26_830 ],
        ...
    ];

    my $obj = array_to_moose(
        data => $data,
        desc => {
            class  => 'Person',
            name   => 0,
            Salary => {
                class  => 'Salary',
                year   => 1,
                amount => 2
            } # Salary
        } # Person
    );
would silently assign to Salary, the first row of the three Salary data rows, i.e. for year 2005:
print $obj->[0]->Salary->year, "\n"; # prints '2005'
Calling throw_multiple_rows() (either with no argument, or with a non-zero argument) enables reporting of this situation. In the above example, array_to_moose() will exit with error:
Expected a single 'Salary' object, but got 3 of them ...
Calling throw_multiple_rows(0), i.e. with an argument of zero will disable subsequent reporting of this error. (See t/8d.t)
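The two behaviours can be sketched in core Perl as follows. The helper single_row and its error text are hypothetical and only mimic the idea: with the check off, the first row wins silently; with the check on, extra rows are fatal.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the row for a single sub-object. With $strict
# false the first row wins silently; with $strict true extra rows are fatal.
sub single_row {
    my ($rows, $strict) = @_;
    my $count = @$rows;
    die "Expected a single row, but got $count of them\n"
        if $strict and $count > 1;
    return $rows->[0];
}

my $salary_rows = [
    [ 'John Smith', '2005', 23_350 ],
    [ 'John Smith', '2006', 24_000 ],
    [ 'John Smith', '2007', 26_830 ],
];

my $first = single_row($salary_rows, 0);
print "$first->[1]\n";                       # prints "2005"

my $ok = eval { single_row($salary_rows, 1); 1 };
print "strict mode failed: $@" unless $ok;
```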
Problems arise if the Moose objects being constructed contain attributes called class or key, causing ambiguities in the descriptor. (Does key => 5 mean that the attribute named key is in column 5, or that the hash key is?)
In these cases, set_class_ind() and set_key_ind() can be used to change the keywords for class => ... and key => ... descriptor entries.
For example:
    package Letter;
    use Moose;

    has 'address' => ( is => 'ro', isa => 'Str' );
    has 'class'   => ( is => 'ro', isa => 'PostalClass' );

    ...

    set_class_ind('package'); # use "package =>" in place of "class =>"

    my $letters = array_to_moose(
        data => $data,
        desc => {
            package => 'Letter', # the Moose class
            address => 0,
            class   => 1,        # the attribute 'class'
            ...
        }
    );
One of the recommendations of Moose::Manual::BestPractices is to make attributes read-only (is => 'ro') wherever possible. Array::To::Moose supports this by evaluating all the attributes for a given object given in the descriptor, then including them all in the call to new(...) when constructing the object.
For Moose objects with attributes which are sub-objects, i.e. references to a Moose object, or references to an array or hash of Moose objects, it means that the sub-objects must be evaluated before the new() call. The effect of this for multi-leveled Moose objects is that object evaluations are carried out depth-first.
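The depth-first order follows directly from passing everything to the constructor. A minimal core-Perl sketch, using a hypothetical build() function as a stand-in for a Moose new() call and made-up sample values, records the order in which the objects come into existence:

```perl
use strict;
use warnings;

my @built;    # records the order in which objects are constructed

# Hypothetical stand-in for a Moose constructor: all attributes are passed
# in at construction time, as they must be for read-only attributes.
sub build {
    my ($class, %args) = @_;
    push @built, $class;
    return bless { %args }, $class;
}

# The innermost objects have to exist before they can be passed to the
# constructor of the object that contains them. Values are illustrative.
my $test    = build('Test',    name => 'BP', result => '120/80');
my $visit   = build('Visit',   date => '2010-06-06', Tests => { BP => $test });
my $patient = build('Patient', last => 'Smith', Visits => [ $visit ]);

print join(' -> ', @built), "\n";   # prints "Test -> Visit -> Patient"
```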
array_to_moose() uses Array::GroupBy::igroup_by to compare the rows in the data given in data => ..., using function Array::GroupBy::str_row_equal() which compares the data as strings.
If the data contains undef values, typically returned from database SQL queries in which DBI maps NULL values to undef, when str_row_equal() encounters undef elements in corresponding column positions, it will consider the elements equal. When corresponding column elements are defined and undef respectively, the elements are considered unequal.
This truth table demonstrates the various combinations:
    -------+------------+--------------+--------------+--------------
    row 1  | ('a', 'b') | ('a', undef) | ('a', undef) | ('a', 'b'  )
    row 2  | ('a', 'b') | ('a', undef) | ('a', 'b'  ) | ('a', undef)
    -------+------------+--------------+--------------+--------------
    equal? |    yes     |     yes      |      no      |      no
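The comparison rule can be restated as a short core-Perl function. This is a hypothetical re-implementation written from the description above, not the Array::GroupBy source; the name rows_equal is an assumption, and both rows are assumed to have the same number of columns.

```perl
use strict;
use warnings;

# Hypothetical re-statement of the comparison rule: rows are equal when
# every column is either undef in both rows or defined and string-equal.
sub rows_equal {
    my ($r1, $r2) = @_;
    for my $i (0 .. $#$r1) {
        my ($x, $y) = ($r1->[$i], $r2->[$i]);
        next if !defined $x and !defined $y;      # undef vs undef: equal
        return 0 if !defined $x or !defined $y;   # undef vs defined: unequal
        return 0 if $x ne $y;                     # otherwise compare as strings
    }
    return 1;
}

# The four columns of the truth table, left to right.
my @results = (
    rows_equal([ 'a', 'b'   ], [ 'a', 'b'   ]),
    rows_equal([ 'a', undef ], [ 'a', undef ]),
    rows_equal([ 'a', undef ], [ 'a', 'b'   ]),
    rows_equal([ 'a', 'b'   ], [ 'a', undef ]),
);
print join(' ', map { $_ ? 'yes' : 'no' } @results), "\n";   # prints "yes yes no no"
```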
array_to_moose by default; throw_nonunique_keys, throw_multiple_rows, set_class_ind and set_key_ind if requested.
Errors in the call of array_to_moose() will be caught by Params::Validate::Array, q.v.
array_to_moose() does a lot of error checking, and is probably annoyingly chatty. Most of the errors generated are, of course, self-explanatory :-)
Carp, Params::Validate::Array, Array::GroupBy
DBI, Moose, Array::GroupBy
The handling of Moose type constraints is primitive.
Sam Brain <samb@stanford.edu>
Copyright (c) Stanford University. June 6th, 2010. All rights reserved. Author: Sam Brain <samb@stanford.edu>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.