NAME

UR::Object::Type::Initializer - Class definition syntax

SYNOPSIS

  UR::Object::Type->define(
      class_name => 'Namespace::MyClass',
      id_by => 'my_class_id',
      has => ['prop_a', 'prop_b']
  );

  UR::Object::Type->define(
      class_name => 'Namespace::MyChildClass',
      is => 'Namespace::MyClass',
      has => [
          'greeting' => { is => 'String', is_optional => 0,
                          valid_values => ["hello","good morning","good evening"],
                          default_value => 'hello' },
      ],
  );

  UR::Object::Type->define(
      class_name => 'Namespace::Helper',
      id_by => 'helper_id',
      has => [
          'my_class_id' => { is => 'String', is_optional => 0 },
          'my_class'    => { is => 'Namespace::MyClass', id_by => 'my_class_id' },
          'my_class_a'  => { via => 'my_class', to => 'prop_a' },
      ],
      has_optional => [
          'other_attribute' => { is => 'Integer' }
      ],
      data_source => 'Namespace::DataSource::DB',
      table_name  => 'HELPERS',
  );

  UR::Object::Type->define(
      class_name => 'Namespace::Users',
      id_by => ['uid'],
      has => [ 'login','passwd','gid','name','home','shell'],
      data_source => {
          is => 'UR::DataSource::File',
          file => '/etc/passwd',
          column_order => ['login','passwd','uid','gid','name','home','shell'],
          skip_first_line => 0,
          delimiter => ':'
      },
      id_generator => '-uuid',
  );

DESCRIPTION

Defining a UR class is like drawing up a blueprint of what a particular kind of object will look like. That blueprint includes the properties these objects will have, what other classes the new class inherits from, and where the source data comes from, such as a database or file.

The Simplest Class

The simplest class definition would look like this:

  use UR;
  class Thing {};

You can create an instance of this class like this:

  my $thing_object = Thing->create();

Instances of this class have no properties, no backing storage location, and no inheritance.

Actually, none of those statements are fully true, but we'll come back to that later...

A Little Background

After using UR, or another class that inherits from UR::Namespace, the "class" syntax above can be used to define a class.

The equivalent, more plain-Perl way to define a class is like this:

  UR::Object::Type->define(
     class_name => 'Thing',
     # the remainder of the class definition would go here, if there were any
  );

Classes become instances of another class called UR::Object::Type. It has a property called class_name that contains the package that instances of these objects are blessed into. Class properties are also instances of a class called UR::Object::Property, and those properties have properties (also UR::Object::Property instances) that describe them, such as the type of data they hold and their length. In fact, all the metadata about classes, properties, relationships, inheritance, data sources and contexts is available as instances of metadata classes. You can get information about any class currently available from the command line with the command

  ur show properties Thing

with the caveat that "currently available" means you need to be under a namespace directory that contains the class you're describing.

Making Something Useful

  class Vehicle {
      id_by => 'serial_number',
      has => ['color', 'weight'],
      has_optional => ['license_plate'],
  };

Here we have a basic class definition for a thing we're calling a Vehicle. It has four properties: serial_number, color, weight and license_plate. Three of these properties are required, meaning that when you create a Vehicle, you must give a value for those properties; it is similar to a 'NOT NULL' constraint on a database column. The serial_number property is an ID property of the class, meaning that no two instances (objects) of that class can exist with the same serial_number; it is similar to having a UNIQUE index on that column (or columns) in a database. Not all vehicles have license plates, so license_plate is optional.

After that, you've effectively created five object instances: one UR::Object::Type identified by its class_name being 'Vehicle', and four UR::Object::Property objects identified by the pairs of class_name and property_name. For these four properties, class_name is always 'Vehicle' and property_name is one each of serial_number, color, weight and license_plate.

Objects always have one property that is called 'id'. If you have only one property in the id_by section, then the 'id' property is effectively an alias for it. If you have several id_by properties, then the 'id' property becomes an amalgamation of the directly named id properties such that no two objects of that class will have the same 'id'. If there are no id_by properties given (including the Thing class above, which doesn't have _any_ properties), then an implicit 'id' property will get created. Instances of that class will have an 'id' generated internally by an algorithm. Finally, if the class has more than one ID property, none of them may be called 'id', since that name will be reserved for the amalgamated-value property.
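As a minimal sketch of the multiple-ID case, the hypothetical City class below (not part of UR) is identified by two properties. The exact format of the amalgamated 'id' value is an internal detail and is not assumed here; the sketch only shows that the value exists and that the same object is found again through its ID properties.

```perl
use strict;
use warnings;
use UR;

# Hypothetical class for illustration: identified by two ID properties.
class City {
    id_by => ['name', 'state'],
};

my $city = City->create(name => 'Springfield', state => 'IL');

# 'id' is the amalgamated value built from name and state.
my $id = $city->id;
print "composite id is defined\n" if defined $id;

# Looking the object up by its ID properties returns the same instance
# from the object cache.
my $same = City->get(name => 'Springfield', state => 'IL');
print "same object\n" if $same == $city;
```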

You can control how IDs get autogenerated with the class' id_generator metadata. For classes that save their results in a relational database, it will get new IDs from a sequence (or some equivalent mechanism for databases that do not support sequences) based on the class' table name. If you want to force the system to use some specific sequence, for example if many classes should use the same sequence generator, then give the name of that sequence as the id_generator.

If the id_generator begins with a dash (-), it indicates a method should be called to generate a new ID. For example, if the name is "-uuid", then the system will call the internal method $class_meta->autogenerate_new_object_id_uuid and will make object IDs as hex string UUIDs. The default value is '-urinternal' which makes an ID string composed of the hostname, process ID, the time the program was started and an increasing integer. If id_generator is a subroutine reference, then the sub will be called with the class metaobject and creation BoolExpr passed as parameters.

You'll find that the parser for class definitions is pretty accepting about the kinds of data structures it will take. The first thing after class is used as a string to name the class. The second thing is a hashref containing key/value pairs. If the value part of the pair is a single string, as the id_by is in the Vehicle class definition, then one property is created. If the value portion is an arrayref, then each member of the array creates an additional property.

Filling in the Details

That same class definition can be made this way:

  class Vehicle {
      id_by => [
          serial_number => { is => 'String', len => 25 },
      ],
      has => [
          color => { is => 'String' },
          weight => { is => 'Number' },
          license_plate => { is => 'String', len => 8, is_optional => 1 },
      ],
  };

Here we've more explicitly defined the class' properties by giving them a type. serial_number and license_plate are given a maximum length, and license_plate is declared as optional. Note that having a 'has_optional' section is the same as explicitly putting 'is_optional => 1' for all those properties. The same shortcut is supported for the other boolean properties of UR::Object::Property, such as is_transient, is_mutable, is_abstract, etc.

The type system is pretty lax in that there's nothing stopping you from using the method for the property to assign a string into a property declared 'Number'. Type, length and is_optional constraints are checked by calling __errors__() on the object, and indirectly when data is committed back to its data source.
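A short sketch of checking constraints explicitly, assuming (as described above) that create() itself does not enforce them. It repeats a trimmed Vehicle definition, leaves out the required weight property, and only counts what __errors__() reports rather than relying on the structure of the returned items.

```perl
use strict;
use warnings;
use UR;

class Vehicle {
    id_by => [
        serial_number => { is => 'String', len => 25 },
    ],
    has => [
        color  => { is => 'String' },
        weight => { is => 'Number' },
    ],
};

# 'weight' is required but not supplied; per the text above, this is
# not checked until __errors__() is called or the data is committed.
my $vehicle = Vehicle->create(serial_number => 'abc123', color => 'blue');

# __errors__() returns a list describing the constraint violations.
my @problems = $vehicle->__errors__();
print scalar(@problems), " problem(s) found\n";
```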

Inheritance

  class Car {
      is => 'Vehicle',
      has => [
          passenger_count   => { is => 'Integer', default_value => 0 },
          transmission_type => { is => 'String',
                                 valid_values => ['manual','automatic','cvt'] },
      ],
  };

  my $car = Car->create(color => 'blue',
                        serial_number => 'abc123',
                        transmission_type => 'manual');  

Here we define another class called Car. It inherits from Vehicle, meaning that all the properties that apply to Vehicle instances also apply to Car instances. In addition, Car instances have two new properties. passenger_count has a default value of 0, and transmission_type is constrained to three possible values.

More class properties

Besides property definitions, there are other things that can be specified in a class definition.

is

Used to name the parent class(es). Single inheritance can be specified by just listing the name of the parent class as a string. Multiple inheritance is specified by an arrayref containing the parent class names. If no 'is' is listed, then the class will inherit from 'UR::Entity'.

doc

A single string to list some short, useful documentation about the class.

data_source

A string to list the data source ID. For classes with no data_source, the only objects get() can return are those that were instantiated with create() or define() earlier in the program, and they do not get saved anywhere during a commit(). They do, however, exist in the object cache during the program's execution.

data_source can also be a hashref to define a data source in line with the class definition. See below for more information about "Inline Data Sources".
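For a class with no data_source, the behaviour described above can be sketched as follows. The Note class and its properties are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;
use UR;

# Hypothetical class with no data_source: objects live only in the
# in-memory object cache for the lifetime of the program.
class Note {
    id_by => 'note_id',
    has   => ['text'],
};

Note->create(note_id => 1, text => 'first');
Note->create(note_id => 2, text => 'second');

# get() can only find the objects created above.
my @notes = Note->get();
my $second = Note->get(note_id => 2);

print scalar(@notes), " notes in the cache\n";
print $second->text, "\n";
```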

table_name

When the class' data source is some kind of database, table_name specifies the name of the table where this class' data is stored.

select_hint

Some relational databases use hints as a way of telling the query optimizer to behave differently than usual. These hints are specified inside comments like this: /* the hint */. If the class is the primary class of a query, and it has a hint, then the hint will appear after the word 'select' in the SQL.

join_hint

If the class is part of a query where it is joined, then its hint will be added to the hints already part of the query. The primary table's hint will be first, followed by the joined class' hints in the order they are joined. All the hints are separated by a single space.

is_abstract

A flag indicating that no instances of this class may be instantiated; instead, it is used as a parent of other classes.

sub_classification_method_name

Holds the name of a method that is called whenever new instances of the class are loaded from a data source. This method will be called with two arguments: the name of the class the get() was called on, and the object instance being loaded. The method should return the complete name of a subclass the object should be blessed into.

sub_classification_property_name

Works like 'sub_classification_method_name', except that the value of the property is directly used to subclass the loaded object.

Properties properties

ur show properties UR::Object::Property will print out an exhaustive list of all the properties of a Class Property. A class' properties are declared in the 'id_by' or one of the 'has' sections. Some of the more important ones:

class_name

The name of the class this property belongs to.

property_name

The name of the property. 'property_name' and 'class_name' do not actually appear in the hashref that defines the property inside a class definition, though they are properties of UR::Object::Property instances.

is

Specifies the data type of this property. Basic types include 'String', 'Integer', 'Float'. Relationships between classes are defined by having the name of another class here. See the Relationships section of UR::Manual::Cookbook for more information.

Object properties do not normally hold Perl references to other objects, but you may use 'ARRAY' or 'HASH' here to indicate that the object will store the reference directly. Note that these properties are not usually saveable to outside data sources.

data_type

A synonym for 'is'

len

Specifies the maximum length of the data, usually in bytes.

doc

A space for useful documentation about the property

default_value

The value a property will have if it is not specified when the object is created. If used on a property that is 'via' another property (see the Indirect Properties section below), it can trigger creation of a referent object.

calculated_default

A name of an instance method or a code ref to be used to resolve a default value during object creation. The instance method will be called immediately after UR creates the initial entity so the method will have access to other parameters used during creation.

Specifying calculated_default => 1 is equivalent to

  calculated_default => '__default_' . $prop_name . '__'

and is meant to establish a naming convention without requiring it.

is_mutable

A flag indicating that this property can be changed. It is the default state of a property. Set this to 0 in the property definition if the property is not changeable after the object is created.

is_constant

A flag indicating that the value of this property may not be changed after the object is created. It is a synonym for having is_mutable => 0.

is_many

Indicates that this property returns a list of values. Usually used with reverse_as properties.

is_optional

Indicates that this property can hold the value undef.

Calculated Properties

is_calculated

A flag indicating that the value of this property is determined from a function.

calculate_from

A listref of other property names used by the calculation

calculate

A specification for how the property is to be calculated in Perl.

  • If the value is a coderef, it will be called when that property is accessed, and the first argument will be the object instance being acted on.

  • The value may be a string containing Perl code that is eval-ed when the accessor is called. The Perl code can refer to $self, which will hold the correct object instance during execution of that code. Any properties listed in the 'calculate_from' list will also be initialized.

  • The special value 'sum' means that the values of all the properties in the calculate_from list are added together and returned.

Any property can be effectively turned into a calculated property by defining a method with the same name as the property.
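The string and 'sum' forms of calculate might be used as in the sketch below. The Invoice class and its property names are hypothetical, and the string form deliberately refers only to $self, which the description above says is available.

```perl
use strict;
use warnings;
use UR;

# Hypothetical class illustrating two ways to specify 'calculate'.
class Invoice {
    id_by => 'invoice_id',
    has => [
        subtotal => { is => 'Number' },
        tax      => { is => 'Number' },
        # Built-in 'sum': adds the values named in calculate_from.
        total    => {
            is_calculated  => 1,
            calculate_from => ['subtotal', 'tax'],
            calculate      => 'sum',
        },
        # String form: Perl code evaluated when the accessor is called.
        label    => {
            is_calculated  => 1,
            calculate_from => ['total'],
            calculate      => q( 'Invoice total: ' . $self->total ),
        },
    ],
};

my $invoice = Invoice->create(invoice_id => 1, subtotal => 100, tax => 8);
print $invoice->total, "\n";
print $invoice->label, "\n";
```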

Database-backed properties

column_name

For classes whose data is stored in a database table (meaning the class has a data_source), the column_name holds the name of the database column in its table. In the default case, the column_name is the same as the 'property_name'.

calc_sql

Specifies that this property is calculated, and its value is a string containing SQL code inserted into that property's "column" in the SELECT clause

Relation Properties

Some properties are not used to hold actual data, but instead describe some kind of relationship between two classes. For example:

  class Person {
      id_by => 'person_id',
      has => ['name'],
  };
  class Thing {
      id_by => 'thing_id',
      has => [
          owner => { is => 'Person', id_by => 'owner_id' },
      ],
  };
  $person = Person->create(person_id => 1, name => 'Bob');
  $thing = Thing->create(thing_id => 2, owner_id => 1);

Here, Thing has a property called owner. It implicitly defines a property called owner_id. owner becomes a read-only property that returns an object of type Person by using the object's value for the owner_id property, and looking up a Person object where its ID matches. In the above case, $thing->owner will return the same object that $person contains.

Relation properties can also link classes with multiple ID properties.

  class City {
      id_by => ['name', 'state']
  };
  class Location {
      has => [
         city    => { is => 'String' },
         state   => { is => 'String' },
         cityobj => { is => 'City',
                      id_by => ['city', 'state' ] },
      ],
  };

Note that the order in which the properties are listed must match between the relationship property's id_by and the related class's id_by.

Reverse Relationships

When one class has a relation property to another, the target class can also define the converse relationship. In this case, Thing is the same as in the first "Relation Properties" example, which defines the relationship from Thing to Person, but we also define the relationship in the other direction, from Person to Thing.

Many Things can point back to the same Person.

  class Person {
      id_by => 'person_id',
      has => ['name'],
      has_many => [
          things => { is => 'Thing', reverse_as => 'owner' },
      ]
  };
  class Thing {
      id_by => 'thing_id',
      has => [
          owner => { is => 'Person', id_by => 'owner_id' },
      ],
  };

Note that the value for reverse_as needs to be the name of the relation property in the related class that would point back to "me". Yes, it's a bit obtuse, but it's the best we have for now.
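A usage sketch for the reverse relationship, repeating the two class definitions so it stands alone. It assumes that the has_many accessor returns the related objects as a list when called in list context; the object values are made up for illustration.

```perl
use strict;
use warnings;
use UR;

class Person {
    id_by => 'person_id',
    has => ['name'],
    has_many => [
        things => { is => 'Thing', reverse_as => 'owner' },
    ],
};
class Thing {
    id_by => 'thing_id',
    has => [
        owner => { is => 'Person', id_by => 'owner_id' },
    ],
};

my $bob = Person->create(person_id => 1, name => 'Bob');
Thing->create(thing_id => 2, owner_id => 1);
Thing->create(thing_id => 3, owner_id => 1);

# Follow the relationship from the Person side.
my @things = $bob->things;
print scalar(@things), " things owned by ", $bob->name, "\n";
```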

Indirect Properties

When the property of a related object has meaning to another object, that relationship can be defined through an indirect property. Things already have owners, but it is also useful to know a Thing's owner's name.

  class Thing {
      id_by => 'thing_id',
      has => [
          owner => { is => 'Person', id_by => 'owner_id' },
          owner_name => { via => 'owner', to => 'name', default_value => 'No one' },
      ],
  };
  $name = $thing->owner_name();
  $name eq $person->name;  # evaluates to true

The values of indirect properties are not stored in the object. When the property's method is called, it looks up the related object through the accessor named in via, and on that result, returns whatever the method named in to returns.

If one of these Thing objects is created by calling Thing->create(), and no value is specified for owner_id, owner or owner_name, then the system will find a Person object where its 'name' is 'No one' and assign the Thing's owner_id to point to that Person. If no matching Person is found, it will first create one with the name 'No one'.

Alias Properties

Sometimes it's useful to have a property that is an alias for another property, perhaps as a refactoring tool or to make the API clearer. This is accomplished by defining an indirect property where the 'via' is __self__.

  class Thing {
      id_by => 'thing_id',
      has => [
          owner => { is => 'Person', id_by => 'owner_id' },
          titleholder => { via => '__self__', to => 'owner' },
      ]
  };

In this case, 'titleholder' is an alias for the 'owner' property. titleholder can be called as a method any place owner is a valid method call. BoolExprs may refer to titleholder, but any such references will be rewritten to 'owner' when they are normalized.

Subclassing Members of an Abstract Class

In some cases, objects may be loaded using a parent class, but all the objects are binned into some other subclass.

  class Widget {
      has => [
          manufacturer => { is => 'String',
                            valid_values => ['CoolCo','Vectornox'] },
      ],
      is_abstract => 1,
      sub_classification_method_name => 'subclasser',
  };
  sub Widget::subclasser {
      my($class,$pending_object) = @_;
      my $subclass = 'Widget::' . $pending_object->manufacturer;
      return $subclass;
  }

  class Widget::CoolCo {
      is => 'Widget',
      has => 'serial_number',
  };
  class Widget::Vectornox {
      is => 'Widget',
      has => 'series_string',
  };

  my $cool_widget = Widget->create(manufacturer => 'CoolCo');
  $cool_widget->isa('Widget::CoolCo'); # evaluates to true
  $cool_widget->serial_number(12345);  # works
  $cool_widget->series_string();       # dies; series_string belongs to Widget::Vectornox

In the class definition for the parent class, Widget, it is marked as being an abstract class, and the sub_classification_method_name specifies the name of a method to call whenever a new Widget object is created or loaded. That method is passed the pre-subclassed object and must return the fully qualified subclass name the object really belongs in. All the objects returned to the caller will be blessed into the appropriate subclass.

Alternatively, a property can be designated to hold the fully qualified subclass name.

  class Widget {
      has => [
          subclass_name => { is => 'String',
                             valid_values => ['Widget::CoolCo',
                                              'Widget::Vectornox'] },
      ],
      is_abstract => 1,
      subclassify_by => 'subclass_name',
  };

  my $cool_widget = Widget->create(subclass_name => 'Widget::CoolCo');
  $cool_widget = Widget::CoolCo->create();  # subclass_name is automatically "Widget::CoolCo"

These subclass names will be saved to the data source if the class has a data source. Also, when objects of the base class are retrieved with get(), the results will be automatically put in the appropriate child class.

Inline Data Sources

If the data_source of a class definition is a hashref instead of a simple string, that defines an in-line data source. The only required item in that hashref is 'is', which declares what class this data source will be created from, such as "UR::DataSource::Oracle" or "UR::DataSource::File". From there, each type of data source will have its own requirements for what is allowed in an inline definition.

For UR::DataSource::RDBMS-derived data sources, it accepts these keys corresponding to the properties of the same name:

  server, user, auth, owner

For UR::DataSource::File data sources:

  server, file_list, column_order, sort_order, skip_first_line,
  delimiter, record_separator

In addition, file is a synonym for server.

For UR::DataSource::FileMux data sources:

  column_order, sort_order, skip_first_line, delimiter, 
  record_separator, required_for_get, constant_values, file_resolver

In addition, resolve_path_with can replace file_resolver and accepts several formats:

subref

A reference to a subroutine. In this case, resolve_path_with is a synonym for file_resolver.

[ $subref, param1, param2, ..., paramn ]

The subref will be called to resolve the path. Its arguments will be the values the rule holds for the listed properties (param1 through paramn).

[ $format, param1, param2, ..., paramn ]

$format will be interpreted as an sprintf() format. The placeholders in the format will be filled in with the values the rule holds for the listed properties (param1 through paramn).

Finally, base_path and resolve_path_with can be used together. In this case, resolve_path_with is a listref of property names and base_path is a string specifying the first part of the pathname. The final path is created by joining the base_path and all the properties' values together with '/', as in join('/', $base_path, param1, param2, ..., paramn ).
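The sprintf() and base_path forms can be illustrated in plain Perl, without UR. The sketch below is not how UR::DataSource::FileMux is implemented; it only shows the path each form would produce. The %rule hash, the property names and both helper subs are hypothetical stand-ins for the values taken from a rule.

```perl
use strict;
use warnings;

# Hypothetical rule values: property name => value supplied in a get().
my %rule = (year => 2024, month => '03', sample => 'abc');

# Form [ $format, param1, ..., paramn ]: fill an sprintf() format
# with the rule's values for the named properties.
sub path_from_format {
    my ($format, @property_names) = @_;
    return sprintf($format, map { $rule{$_} } @property_names);
}

# Form base_path + listref of property names: join everything with '/'.
sub path_from_base {
    my ($base_path, @property_names) = @_;
    return join('/', $base_path, map { $rule{$_} } @property_names);
}

print path_from_format('/data/%s/%s/%s.tsv', qw(year month sample)), "\n";
print path_from_base('/data', qw(year month sample)), "\n";
```

With the values above, the first call yields /data/2024/03/abc.tsv and the second yields /data/2024/03/abc.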

SEE ALSO

UR::Object::Type, UR::Object::Property, UR::Manual::Cookbook