CohortExplorer::Datasource - CohortExplorer datasource superclass
    # The code below shows the methods your datasource class overrides

    package CohortExplorer::Application::My::Datasource;
    use base qw( CohortExplorer::Datasource );

    sub authenticate {
        my ($self, $opts) = @_;

        # authentication code...

        # Successful authentication returns a scalar response (e.g. project_id)
        return $response;
    }

    sub additional_params {
        my ($self, $opts, $response) = @_;
        my %params;

        # Get the database handle (i.e. $self->dbh) and run some SQL queries
        # to get additional parameters to be used in the
        # entity/variable/table structure hooks

        return \%params;
    }

    sub entity_structure {
        my ($self) = @_;

        my %struct = (
            -columns => {
                entity_id => 'd.record',
                variable  => 'd.field_name',
                value     => 'd.value',
                table     => 'm.form_name'
            },
            -from  => [ -join => qw/data|d <=>{project_id=project_id} metadata|m/ ],
            -where => { 'd.project_id' => $self->project_id }
        );

        return \%struct;
    }

    sub table_structure {
        my ($self) = @_;

        return {
            -columns => {
                table          => 'GROUP_CONCAT( DISTINCT form_name )',
                variable_count => 'COUNT( field_name )',
                label          => 'element_label'
            },
            -from     => 'metadata',
            -where    => { project_id => $self->project_id },
            -order_by => 'field_order',
            -group_by => 'form_name'
        };
    }

    sub variable_structure {
        my ($self) = @_;

        return {
            -columns => {
                variable => 'field_name',
                table    => 'form_name',
                label    => 'element_label',
                type     => "IF( element_validation_type IS NULL, 'text', element_validation_type)",
                category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')"
            },
            -from     => 'metadata',
            -where    => { project_id => $self->project_id },
            -order_by => 'field_order'
        };
    }

    sub datatype_map {
        return {
            int          => 'signed',
            float        => 'decimal',
            date_dmy     => 'date',
            date_mdy     => 'date',
            date_ymd     => 'date',
            datetime_dmy => 'datetime'
        };
    }
CohortExplorer::Datasource is the base class for all datasources. To connect CohortExplorer to an EAV repository other than Opal (OBiBa) or REDCap, the user is expected to create a class that inherits from CohortExplorer::Datasource. Datasources stored in Opal and REDCap can be queried using the in-built Opal and REDCap APIs (see CohortExplorer::Application::Opal::Datasource and CohortExplorer::Application::REDCap::Datasource).
CohortExplorer::Datasource is an abstract factory; initialize() is the factory method that constructs and returns an object of the datasource supplied as an application option. This class reads the datasource configuration from the config file datasource-config.properties to instantiate the datasource object. A sample config file is shown below:
    <datasource Medication_Participant>
     namespace=CohortExplorer::Application::Opal::Datasource
     url=http://opal_home
     entity_type=Participant
     dsn=DBI:mysql:database=opal;host=hostname;port=3306
     username=database_username
     password=database_password
    </datasource>

    <datasource Medication_Instrument>
     namespace=CohortExplorer::Application::Opal::Datasource
     url=http://opal_home
     entity_type=Instrument
     dsn=DBI:mysql:database=opal;host=hostname;port=3306
     username=database_username
     password=database_password
     name=datasourceA
    </datasource>

    <datasource Drug_A>
     namespace=CohortExplorer::Application::REDCap::Datasource
     url=http://redcap_home
     dsn=DBI:mysql:database=opal;host=myhost;port=3306
     arm_name=Drug A
     username=database_username
     password=database_password
     name=Drug
    </datasource>

    <datasource Drug_B>
     namespace=CohortExplorer::Application::REDCap::Datasource
     url=http://redcap_home
     dsn=DBI:mysql:database=opal;host=myhost;port=3306
     arm_name=Drug B
     username=database_username
     password=database_password
     name=Drug
    </datasource>
Each block holds a unique datasource configuration. Apart from the reserved parameters namespace, name, dsn, username, password and static_tables, it is up to the user to decide which other parameters to include in the configuration file. If the block name is an alias, the user can specify the actual name of the datasource using the name parameter. If the name parameter is not found, the block name is assumed to be the actual name of the datasource. In the example above, both Medication_Participant and Medication_Instrument connect to the same datasource (Medication) but with different configurations: Medication_Participant is configured to query the participant data, whereas Medication_Instrument can be used to query the instrument data. Similarly, Drug_A and Drug_B are configured to query different arms of the REDCap datasource Drug. Once the class has instantiated the datasource object, the user can access the parameters by calling the accessors, which have the same names as the parameters. For example, the datasource name can be retrieved with $self->name and the entity type with $self->entity_type.
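The fallback from the name parameter to the block name can be illustrated with a minimal sketch. It assumes the configuration has already been parsed into a nested hash keyed by block name, which is the shape Config::General gives named blocks. The helper resolve_datasource_name and the Registry block are hypothetical and are not part of CohortExplorer.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed configuration (only a few keys shown).
# 'Registry' is an invented block that has no 'name' parameter.
my %config = (
    datasource => {
        Drug_A => {
            namespace => 'CohortExplorer::Application::REDCap::Datasource',
            arm_name  => 'Drug A',
            name      => 'Drug',
        },
        Registry => {
            namespace => 'CohortExplorer::Application::My::Datasource',
        },
    },
);

# Hypothetical helper: return the actual datasource name for a block,
# falling back to the block name when no 'name' parameter is present.
sub resolve_datasource_name {
    my ($config, $block) = @_;
    my $params = $config->{datasource}{$block}
        or die "No configuration found for datasource '$block'\n";
    return defined $params->{name} ? $params->{name} : $block;
}

print resolve_datasource_name(\%config, 'Drug_A'),   "\n";   # alias resolves to Drug
print resolve_datasource_name(\%config, 'Registry'), "\n";   # block name is the name
```

Here Drug_A is treated as an alias for Drug, while Registry, having no name parameter, is its own datasource name.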
The namespace is the full package name of the in-built API the application will use to consult the parent EAV schema. The parameters present in the configuration file can be used by the subclass hooks to provide user- or project-specific functionality.
$object = $ds_pkg->new();
Basic constructor.
After instantiating the datasource object, the class first calls authenticate to perform user authentication. If authentication succeeds (i.e. authenticate returns a defined $response), the class sets any additional parameters (via additional_params). It then calls entity_structure, table_structure, variable_structure and datatype_map, and validates the value returned by each method. Upon successful validation, the class attempts to set the entity, table and variable specific parameters by invoking the methods below:
This method attempts to retrieve the entity parameters, such as entity_count and visit_info (if applicable), from the database. It takes the hash ref returned by entity_structure as input.
This method attempts to retrieve data on tables and table attributes from the database. It takes the hash ref returned by table_structure as input.
This method attempts to retrieve data on variables and variable attributes from the database. It takes the hash ref returned by variable_structure as input.
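The initialization sequence described above can be sketched roughly as follows. This is a simplified illustration of the order of calls, not the module's actual implementation: Sketch::Datasource, its trivial hooks and the initialize_sketch driver are all hypothetical, and the validation shown only checks that each hook returns a hash ref.

```perl
use strict;
use warnings;

# Hypothetical datasource with trivial hooks, used only to make the sketch runnable.
package Sketch::Datasource;

sub new                { return bless {}, shift }
sub authenticate       { my ($self, $opts) = @_; return $opts->{password} ? 42 : undef }
sub additional_params  { my ($self, $opts, $response) = @_; return { project_id => $response } }
sub entity_structure   { return { -columns => {}, -from => 'data', -where => {} } }
sub table_structure    { return { -columns => {}, -from => 'metadata' } }
sub variable_structure { return { -columns => {}, -from => 'metadata' } }
sub datatype_map       { return { int => 'signed' } }

package main;

# Hypothetical driver mirroring the order of calls described in the text.
sub initialize_sketch {
    my ($class, $opts) = @_;
    my $ds = $class->new;

    # Step 1: authenticate; stop if no defined response comes back.
    my $response = $ds->authenticate($opts);
    return unless defined $response;

    # Step 2: merge any additional parameters into the object.
    my $params = $ds->additional_params($opts, $response) || {};
    $ds->{$_} = $params->{$_} for keys %$params;

    # Step 3: call each structure hook and check that it returned a hash ref.
    for my $hook (qw( entity_structure table_structure variable_structure datatype_map )) {
        my $struct = $ds->$hook;
        die "Return by '$hook' is not a hash ref\n" unless ref $struct eq 'HASH';
        $ds->{$hook} = $struct;
    }
    return $ds;
}

my $ds = initialize_sketch('Sketch::Datasource', { password => 'secret' });
print "project_id: $ds->{project_id}\n";
```

The real class goes on to query the database for entity, table and variable parameters; that part is omitted here because it needs a live database handle.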
The subclasses override the following hooks:
authenticate should return a scalar response upon successful authentication and undef otherwise. The method is called with one parameter, $opts, a hash ref with application options as keys and their user-provided values as values. Note that the hooks below are called only if authentication is successful.
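As an illustration of this contract, the following minimal sketch returns a project_id on success and undef on failure. The in-memory credential store and the option names username and password are assumptions made for the example; a real hook would check credentials against the repository itself.

```perl
use strict;
use warnings;

package My::Sketch::Datasource;
# In a real datasource class this would be followed by:
#   use base qw( CohortExplorer::Datasource );

# Hypothetical credential store: username => [ password, project_id ].
my %ACCOUNTS = ( alice => [ 's3cret', 12 ] );

sub authenticate {
    my ($self, $opts) = @_;

    # 'username' and 'password' are assumed option names.
    my $account = $ACCOUNTS{ $opts->{username} // '' } or return undef;
    return undef unless ( $opts->{password} // '' ) eq $account->[0];

    # Successful authentication returns a scalar response (here a project_id).
    return $account->[1];
}

package main;

my $ok  = My::Sketch::Datasource->authenticate({ username => 'alice', password => 's3cret' });
my $bad = My::Sketch::Datasource->authenticate({ username => 'alice', password => 'wrong' });

print defined $ok  ? "authenticated, project_id=$ok\n"  : "rejected\n";
print defined $bad ? "authenticated, project_id=$bad\n" : "rejected\n";
```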
additional_params should return a hash ref containing parameter name-value pairs. Not all parameter values are known in advance, so they cannot all be specified in the datasource configuration file. Sometimes a value first needs to be retrieved from the database (e.g. the variables and records a given user has access to), and this hook exists for that purpose. The user can run SQL queries through the database handle ($self->dbh) to retrieve parameter name-value pairs, which are then added to the datasource object. The method is called with the following parameters:
$opts a hash ref with application options as keys and their user-provided values as values.
$response a scalar received upon successful authentication. The user may want to use the scalar response to fetch other parameters (if any).
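A minimal sketch of such a hook is shown below. The user_rights table, its columns and the allowed_forms parameter are hypothetical, and $response is assumed to be a project_id. So that the example runs without a database, a small stand-in object replaces the DBI handle; selectcol_arrayref is the only DBI method it imitates.

```perl
use strict;
use warnings;

package My::Sketch::Datasource;
# A real class would inherit dbh() from CohortExplorer::Datasource.

sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub dbh { return $_[0]{dbh} }

sub additional_params {
    my ($self, $opts, $response) = @_;

    # Hypothetical query: the forms this user may access in the project.
    my $forms = $self->dbh->selectcol_arrayref(
        'SELECT form_name FROM user_rights WHERE project_id = ? AND username = ?',
        undef, $response, $opts->{username},
    );

    return { project_id => $response, allowed_forms => $forms };
}

# Stand-in for a DBI database handle so the sketch runs without a database.
package Fake::DBH;
sub new                { return bless {}, shift }
sub selectcol_arrayref { return [ 'demographics', 'baseline_data' ] }

package main;

my $ds     = My::Sketch::Datasource->new( dbh => Fake::DBH->new );
my $params = $ds->additional_params( { username => 'alice' }, 12 );
print "$params->{project_id}: @{ $params->{allowed_forms} }\n";
```

The returned pairs would then be available to the structure hooks, for example to restrict a -where clause to the forms the user may see.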
entity_structure should return a hash ref defining the entity structure in the database. The hash ref must have the following keys:

-columns
    entity_id
    variable
    value
    table
    visit (valid for longitudinal datasources)

-from
    table specifications (see SQL::Abstract::More)

-where
    where clauses (see SQL::Abstract)

-order_by
    column used to order the visits (valid for longitudinal datasources)
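For a longitudinal datasource, the structure might look like the minimal sketch below. The event_id column and the literal project id are assumptions made for illustration, and missing_entity_columns is a hypothetical helper showing one way to check that the required columns are present.

```perl
use strict;
use warnings;

# Hypothetical entity structure for a longitudinal datasource.
# The 'event_id' column and the project id value (12) are assumptions.
sub entity_structure {
    return {
        -columns => {
            entity_id => 'd.record',
            variable  => 'd.field_name',
            value     => 'd.value',
            table     => 'm.form_name',
            visit     => 'd.event_id',
        },
        -from     => [ -join => qw/data|d <=>{project_id=project_id} metadata|m/ ],
        -where    => { 'd.project_id' => 12 },
        -order_by => 'd.event_id',
    };
}

# Hypothetical check: list the required -columns entries that are absent.
sub missing_entity_columns {
    my ($struct) = @_;
    return grep { !exists $struct->{-columns}{$_} } qw( entity_id variable value table );
}

my @missing = missing_entity_columns( entity_structure() );
print @missing ? "Missing columns: @missing\n" : "All required columns present\n";
```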
table_structure should return a hash ref defining the table structure in the database. A table in this context means a questionnaire or form. For example,

    {
        -columns => [
            table          => 'GROUP_CONCAT( DISTINCT form_name )',
            variable_count => 'COUNT( field_name )',
            label          => 'element_label'
        ],
        -from     => 'metadata',
        -where    => { project_id => $self->project_id },
        -order_by => 'field_order',
        -group_by => 'form_name'
    }

The user should make sure the SQL query constructed from the hash ref produces output like the one below:

    +-----------------+----------------+-----------------+
    | table           | variable_count | label           |
    +-----------------+----------------+-----------------+
    | demographics    | 26             | Demographics    |
    | baseline_data   | 19             | Baseline Data   |
    | month_1_data    | 20             | Month 1 Data    |
    | month_2_data    | 20             | Month 2 Data    |
    | month_3_data    | 28             | Month 3 Data    |
    | completion_data | 6              | Completion Data |
    +-----------------+----------------+-----------------+

Note that -columns must contain the table definition. It is up to the user to decide which table attributes are suitable for describing the tables.
variable_structure should return a hash ref defining the variable structure in the database. For example,

    {
        -columns => [
            variable => 'field_name',
            table    => 'form_name',
            label    => 'element_label',
            category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')",
            type     => "IF( element_validation_type IS NULL, 'text', element_validation_type)"
        ],
        -from     => 'metadata',
        -where    => { project_id => $self->project_id },
        -order_by => 'field_order'
    }

The user should make sure the SQL query constructed from the hash ref produces output like the one below:

    +--------------+---------------+--------------------------+----------+----------+
    | variable     | table         | label                    | category | type     |
    +--------------+---------------+--------------------------+----------+----------+
    | kt_v_b       | baseline_data | Kt/V                     |          | float    |
    | plasma1_b    | baseline_data | Collected Plasma 1?      | 0, No    | text     |
    |              |               |                          | 1, Yes   |          |
    | date_visit_1 | month_1_data  | Date of Month 1 visit    |          | date_ymd |
    | alb_1        | month_1_data  | Serum Albumin (g/dL)     |          | float    |
    | prealb_1     | month_1_data  | Serum Prealbumin (mg/dL) |          | float    |
    | creat_1      | month_1_data  | Creatinine (mg/dL)       |          | float    |
    +--------------+---------------+--------------------------+----------+----------+

Note that -columns must define the variable and table columns. Again, it is up to the user to decide which variable attributes define the variables in the datasource. The categories within the category column must be separated by newlines.
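To show why the newline separator matters, here is a minimal sketch that splits a category value into code-label pairs. The parse_categories helper is hypothetical, and the "code, label" layout of each line is inferred from the sample output above rather than taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a newline-separated category string such as
# "0, No\n1, Yes" into a hash ref mapping each code to its label.
sub parse_categories {
    my ($category) = @_;
    my %label_for;
    for my $line ( split /\n/, $category // '' ) {
        # Split on the first comma only, so labels may themselves contain commas.
        my ($code, $label) = split /\s*,\s*/, $line, 2;
        next unless defined $label;
        $label_for{$code} = $label;
    }
    return \%label_for;
}

my $categories = parse_categories("0, No\n1, Yes");
print "$_ => $categories->{$_}\n" for sort keys %$categories;
```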
datatype_map should return a hash ref with value types as keys and equivalent SQL types (i.e. castable) as values. For example,

    {
        'int'        => 'signed',
        'float'      => 'decimal',
        'number_1dp' => 'decimal(10,1)',
        'datetime'   => 'datetime'
    }
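One way such a map could be put to use is to wrap a value column in a CAST expression when comparing typed values. The sketch below is only an illustration under that assumption: cast_expression is a hypothetical helper, and how CohortExplorer applies the map internally is not shown here.

```perl
use strict;
use warnings;

# Same shape as the hash ref returned by datatype_map.
my $datatype_map = {
    int        => 'signed',
    float      => 'decimal',
    number_1dp => 'decimal(10,1)',
    datetime   => 'datetime',
};

# Hypothetical helper: wrap a column in CAST when its value type has a
# mapping, and leave the column untouched otherwise (e.g. for 'text').
sub cast_expression {
    my ($column, $type, $map) = @_;
    my $sql_type = $map->{$type};
    return defined $sql_type
        ? 'CAST( ' . $column . ' AS ' . uc($sql_type) . ' )'
        : $column;
}

print cast_expression('value', 'int',  $datatype_map), "\n";
print cast_expression('value', 'text', $datatype_map), "\n";
```

Value types without an entry in the map, such as text, are left as plain strings in this sketch.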
Config::General fails to parse the datasource configuration file.
Failed to instantiate datasource package '<datasource pkg>' via new().
The value returned by additional_params, entity_structure, table_structure, variable_structure or datatype_map is either not hash-worthy (i.e. not a hash ref) or is missing required columns.
The select method in SQL::Abstract::More fails to construct the SQL query from the supplied hash ref.
The execute method in DBI fails to execute the SQL query.
Carp
CLI::Framework::Exceptions
Config::General
DBI
Exception::Class::TryCatch
SQL::Abstract::More
Tie::IxHash
CohortExplorer
CohortExplorer::Application::Opal::Datasource
CohortExplorer::Application::REDCap::Datasource
CohortExplorer::Command::Describe
CohortExplorer::Command::Find
CohortExplorer::Command::History
CohortExplorer::Command::Query::Search
CohortExplorer::Command::Query::Compare
Copyright (c) 2013-2014 Abhishek Dixit (adixit@cpan.org). All rights reserved.
This program is free software: you can redistribute it and/or modify it under the terms of either:
the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version, or
the "Artistic Licence".
Abhishek Dixit