The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

DBIx::DataModel::Doc::Quickstart - Get quickly started with DBIx::DataModel

DOCUMENTATION CONTEXT

This chapter is part of the DBIx::DataModel manual.

This chapter is a tutorial that shows the main steps to get started with DBIx::DataModel. The goal here is conciseness, not completeness; a full reference is given in the REFERENCE chapter.

The use case for this tutorial is the same as the one shown in the SYNOPSIS, namely a small human resources management system.

BASIC ASSUMPTIONS

Before starting with DBIx::DataModel, you should have installed CPAN modules DBI and SQL::Abstract::More. You also need a database management system (DBMS) with a DBD driver. Many such systems are available, either commercial or free; if you don't know where to start, have a look at DBD::SQLite, which is open source and very easy to install.

Use your database modeling tool to create some tables for employees, departments, activities (an employee working in a department from a start date to an end date), and employee skills. If you have no modeling tool, you can also feed something like the following SQL code to the database

  CREATE TABLE t_employee (
    emp_id     INTEGER AUTO_INCREMENT PRIMARY KEY,
    lastname   TEXT    NOT NULL,
    firstname  TEXT,
    d_birth    DATE 
  ); 
  CREATE TABLE t_department (
    dpt_code   VARCHAR(5) PRIMARY KEY,
    dpt_name   TEXT    NOT NULL 
  );
  CREATE TABLE t_activity (
    act_id     INTEGER AUTO_INCREMENT PRIMARY KEY,
    emp_id     INTEGER NOT NULL REFERENCES t_employee(emp_id),
    dpt_code   VARCHAR(5) NOT NULL REFERENCES t_department(dpt_code),
    d_begin    DATE    NOT NULL,
    d_end      DATE
  );
  CREATE TABLE t_skill (
    skill_code VARCHAR(2) PRIMARY KEY,
    skill_name TEXT    NOT NULL 
  );
  CREATE TABLE t_employee_skill (
    emp_id         INTEGER NOT NULL REFERENCES t_employee(emp_id),
    skill_code     VARCHAR(2)  NOT NULL REFERENCES t_skill(skill_code),
    CONSTRAINT PRIMARY KEY (emp_id, skill_code)
  );

As can be seen from this SQL, we assume that the primary keys for t_employee and t_activity are generated automatically by the RDBMS. Primary keys for other tables are character codes and therefore should be supplied by the client program. We decided to use the suffixes _id for auto-generated keys, and _code for user-supplied codes.

All examples in this document use single-schema mode, which is the simplest way to work with DBIx::DataModel. If your application needs to simultaneously work with several databases that share the same schema structure (like for example when transferring data between a production environment and a development environment), then you need to work in multi-schema mode; explanations are provided in "SCHEMA METHODS" in DBIx::DataModel::Doc::Reference.

DECLARE SCHEMA AND TABLES

DBIx::DataModel needs to acquire some knowledge about the datamodel. The rest of this chapter will go through the steps to write the necessary declarations. Part of this work can be automated through DBIx::DataModel::Schema::Generator, which is able to inspect a given database connection to gain information about tables, primary keys and foreign key constraints, and from there produce a schema skeleton. For this tutorial, we will not use the schema generator and write declarations manually; as you will see, this is not much work, as the syntax is quite concise.

First load DBIx::DataModel :

  use DBIx::DataModel;

Now define a schema :

  DBIx::DataModel->Schema('HR');

Here we have chosen a simple acronym HR as the schema name, but it could as well have been something like Human::Resources. The schema name should not conflict with names of modules in your Perl installation; so it is probably a good idea to use a prefix like My::Company::Schema.

The schema now is a Perl class, so we can invoke its Table method to declare the first table within the schema :

  HR->Table(qw/Employee      t_employee        emp_id/);

This creates a new Perl class named HR::Employee (the schema name HR has been automatically prepended before the table name). The second argument t_employee is the database table, and the third argument emp_id is the primary key. So far nothing is declared about other columns in the table.

Other tables are declared in a similar fashion :

  HR->Table(qw/Department    t_department      dpt_code/)
    ->Table(qw/Activity      t_activity        act_id/)
    ->Table(qw/Skill         t_skill           skill_code/)
    ->Table(qw/EmployeeSkill t_employee_skill  emp_id  skill_code/);

Observe that declarations can be chained because each of them returns the schema again.

The last declaration has 4 arguments because the primary key ranges over 2 columns.

DECLARE COLUMN TYPES

RDBMS usually require that dates be in ISO format of shape yyyy-mm-dd. Let's assume our users are European and want to see and enter dates of shape dd.mm.yyyy. Insert of converting back and forth within the client code, it's easier to do it at the ORM level. So we define conversion routines within a "Date" column type

  HR->Type(Date => 
     from_DB  => sub {$_[0] =~ s/(\d\d\d\d)-(\d\d)-(\d\d)/$3.$2.$1/   if $_[0]},
     to_DB    => sub {$_[0] =~ s/(\d\d)\.(\d\d)\.(\d\d\d\d)/$3-$2-$1/ if $_[0]},
     validate => sub {$_[0] =~ m/\d\d\.\d\d\.\d\d\d\d/},
   );

and then apply this type to the appropriate columns, by calling the define_column_type() method on the meta-table objects associated with our tables.

  HR::Employee->metadm->define_column_type(Date => qw/d_birth/);
  HR::Activity->metadm->define_column_type(Date => qw/d_begin d_end/);

Such declarations start with the name of the type, followed by the list of columns to which this type is applied.

Another way to use column types is to apply them at the level of statements; this is useful for example with aggregation functions :

  my $date_youngest = HR::Employee->select(
    -columns      => 'max(d_birth)|youngest',
    -column_types => {Date => [qw/youngest/]},
   );

In this date example we just performed scalar conversions; another design choice could be to "inflate" the data to DateTime objects (or to any other Perl module working with dates); this is demonstrated in the cookbook.

DECLARE ASSOCIATIONS

Basic associations

Now we will declare a binary association between departements and activities:

  HR->Association([qw/Department department  1 /],
                  [qw/Activity   activities  * /]);

The Association method takes two references to lists of arguments; each of them contains the name of the table class, a role name, a multiplicity, and optionally the names of columns participating in the join. Here column names are not specified, so the method assumes that the join is on dpt_code (from the primary key of the class with multiplicity 1 in the association). This declaration corresponds to the following UML diagram :

  +----------------+                            +--------------+
  |                | 1                        * |              |
  | HR::Department +----------------------------+ HR::Activity |
  |                | department      activities |              |
  +----------------+                            +--------------+

The declaration should be read crosswise : it states that a department may be associated with several activities -- therefore the HR::Department class will contain an activities method which returns an arrayref -- and conversely, an activity is associated with exactly one department -- so the HR::Activity class will contain a department method which returns a single instance of HR::Department.

The two arrayrefs in the Association declaration could as well be given in the reverse order : the effect would be exactly the same, because associations are symmetric.

Choosing role names

Technically, a role name can be any valid Perl identifier, so it can be chosen arbitrarily; however, in most cases it is a good idea to use a name reminding of the associated table : this will make it easier to follow method calls and joins that navigate between tables. In addition, it is also a good idea to use role names in singular when the multiplicity is 1, and in plural when the multiplicity is greater than 1, again for readability reasons.

One exception to this recommendation is when there are several associations between the same tables, in which case role names are precisely useful to make the distinction. For example, suppose that each department has offices in one or several buildings, and has its address in one building: this could be modeled with one pair of roles office_buildings / departments_using, and one other pair of roles address_building / departments_addressed_at.

Compositions

The second association could be defined in a similar way; but here we will introduce the new concept of composition.

  HR->Composition([qw/Employee   employee    1 /],
                  [qw/Activity   activities  * /]);

This looks exactly like an association declaration; but it states that

an activity somehow "belongs" to an employee (cannot exist without being attached to an employee). In a UML class diagram, this would be pictured with a black diamond on the Employee side. In this particular example, the usage of a composition instead of an association would perhaps be debated by some data modelers; but we made this choice for the sake of the example.

A composition behaves in all respects like an association, but it has additional functionalities for the insert and delete methods; see the reference manual for details.

Many-to-many associations

Now comes the association between employees and skills, which is a a many-to-many association. This requires an intermediate linking table that will store pairs (emp_id, skill_code), so we start by declaring associations with the linking table, in the usual way :

  HR->Association([qw/Employee      employee   1 /],
                  [qw/EmployeeSkill emp_skills * /]);

  HR->Association([qw/Skill         skill      1 /],
                  [qw/EmployeeSkill emp_skills * /]);

Then we declare the many-to-many association:

  HR->Association([qw/Employee  employees  *  emp_skills employee/],
                  [qw/Skill     skills     *  emp_skills skill   /]);

This looks almost exactly like the previous declarations, except that the last arguments are no longer column names, but rather role names: these are the sequences of roles to follow in order to implement the association. This example is just an appetizer; more explanations are provided in "Many-to-many associations" in DBIx::DataModel::Doc::Design.

USE THE SCHEMA

Connecting to the database

To use the schema, we first need to provide it with a database connection :

  my $dbh = DBI->connect(...); # parameters according to your RDBMS
  HR->dbh($dbh);               # give $dbh handle to the schema

Inserting data

Basic insert() method

Now we can start populating the database:

  my ($bach_id, $berlioz_id, $monteverdi_id)
    = HR->table('Employee')->insert(
        [qw/ firstname    lastname   /],
        [qw/ Johann       Bach       /],
        [qw/ Hector       Berlioz    /],
        [qw/ Claudio      Monteverdi /],
    );

This form of insert() is convenient for inserting a bunch of rows at once : it takes as arguments a first arrayref of column names, followed by a list of arrayrefs containing values for each row.

The other form of insert() is to supply one or several hashrefs, where each hashref corresponds to a record to create :

  my ($bach_id, $berlioz_id, $monteverdi_id)
    = HR->table('Employee')->insert(
      {firstname => "Johann",  lastname => "Bach"      },
      {firstname => "Hector",  lastname => "Berlioz"   },
      {firstname => "Claudio", lastname => "Monteverdi"},
    );

The result is the same in both cases.

In this example, it is assumed that keys are generated automatically within the database (see the AUTO_INCREMENT clause in the "BASIC ASSUMPTIONS" section); therefore they need not be supplied here. The return value of the method is the list of ids generated by the database.

Next we create some departments and skills, here with explicit primary keys, and using both insertion syntaxes :

  HR->table('Department')->insert(
    {dpt_code => "CPT",  dpt_name => "Counterpoint" },
    {dpt_code => "ORCH", dpt_name => "Orchestration"},
  );

  HR->table('Skill')->insert(
    [qw/ skill_code  skill_name /],
    [qw/ VL          Violin     /],
    [qw/ KB          Keyboard   /],
    [qw/ GT          Guitar     /],
   );

insert_into_*() methods

For inserting data into the Activity table, instead of addressing the table directly, we can take advantage of the insert_into_activities in the associated Employee class :

  my $bach = HR->table('Employee')->fetch($bach_id); 
  
  $bach->insert_into_activities({d_begin => '01.01.1695',
                                 d_end   => '18.07.1750',
                                 dpt_code => 'CPT'});

In addition to the columns explicitly listed above, this method automatically adds the foreign key emp_id => $bach_id that will link the activity to the $bach employee. The same can be done for employee skills :

  $bach->insert_into_emp_skills({skill_code => 'VL'},
                                {skill_code => 'KB'});

Cascaded inserts

Since there is a composition between classes Employee and Activity, we can supply a whole data tree to the insert() method, and cascaded inserts will be performed automatically :

  HR->table('Employee')->insert(
    {firstname  => "Richard",  
     lastname   => "Strauss",
     activities => [ {d_begin  => '01.01.1874',
                      d_end    => '08.09.1949',
                      dpt_code => 'ORCH'      } ]}
  );

For retrieving the keys of records generated by those insertions, we can use the -returning => {} option :

  my $data = {firstname  => "Richard",  
              lastname   => "Strauss",
              activities => [ {d_begin  => '01.01.1874',
                               d_end    => '08.09.1949',
                               dpt_code => 'ORCH'      } ]};
  my $ids = HR->table('Employee')->insert($data, -returning => {});
  # ids now contains : { emp_id     => ..., 
  #                      activities => [{act_id => ...}]};

Updating data

The update() method can be used either as a class method, like this :

  HR->table('Employee')->update($bach_id => {firstname => "Johann Sebastian"});

or as an instance method, like this :

   my $bach = HR->table('Employee')->fetch($bach_id);
   $bach->update({firstname => "Johann Sebastian"});

Using named parameters, the class method can also update several records in one single instruction :

  HR->table('Employee')->update(
    -set   => {retired => 'true'     },
    -where => {age     => {">" => 65}},
    );

Deleting data

The deleting() method can be used either as a class method, like this :

  HR->table('Employee')->delete($bach_id);

or as an instance method, like this :

   my $bach = HR->table('Employee')->fetch($bach_id);
   $bach->delete;

Using named parameters, the class method can also delete several records in one single instruction :

  HR->table('Employee')->delete(-where => { age => {">" => 65} }

Selecting data

Basic select()

The select() method retrieves records from a class :

  my $all_employees = HR->table('Employee')->select; 
  foreach my $emp (@$all_employees) {
    do_something_with($emp);
  }

That method can take arguments to specify various aspects of the SQL request to generate, like for example the list of columns, the filtering conditions or the ordering to apply :

  my @columns  = qw/firstname lastname/;
  my %criteria = (lastname => {-like => 'B%'});
  my $some_employees = HR->table('Employee')->select(
    -columns  => \@columns,
    -where    => \%criteria,
    -order_by => 'd_birth',
   );

Selecting from a join

Instead of selecting from a single table, we can select from a join of several tables :

  my $results = HR->join(qw/Employee activities department/)->select(...);

Results from this join will be instances of a subclass that inherits from Employee and Activity and Department : therefore all methods of all parent classes are available.

Selecting through path methods

Since an association was declared between tables Employee and Skill, the Employee class has a path method named skills that automatically selects all skills related to a given employee :

  foreach my $emp (@$all_employees) {
    print "$emp->{firstname} $emp->{lastname} ";
    my @skill_names = map {$_->{skill_name}  }} @{$emp->skills};
    print " has skills ", join(", ", @skill_names) if @skill_names;
  }

Path methods are nothing but wrappers around the basic select() method, so they can take exactly the same arguments. Here is an example with the activities method in class Employee :

  my @columns = qw/d_begin d_end/;
  my %criteria = (d_end => undef);
  my $current_activities = $some_emp->activities(-columns => \@columns,
                                                 -where   => \%criteria);

And it is possible to join path methods, starting from an initial object :

  my $result = $emp->join(qw/activities department/)
                   ->select(-columns => \@columns,
                            -where   => \%criteria);

CONCLUSION

This concludes our short tutorial.

Detailed descriptions of all constructs are given in the Reference chapter; explanations of the overall architecture of DBIx::DataModel are given in the Design chapter.