The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Metadata::DB::Analizer - methods to analize database metadata table entries

SYNOPSIS

   use DBI;
   use Metadata::DB::Analizer;

   my $absdb = '/home/myself/metadata.db';

   my $dbh = DBI->connect("dbi:SQLite:dbname=$abs","","");

   my $a = Metadata::DB::Analizer->new({ DBH => $dbh });

   # we are storing metadata entries for how many things?
   $a->get_records_count;

   # what are all the attribute labels we are storing for all things?
   my $atts = $a->get_attributes;

   for my $att (@$atts) {
      
      # how many different kinds of values does this att have for all things?
      my $count = $a->attribute_all_unique_values_count($att);
      my $att_is_number = $a->attribute_type_is_number($att);

      print "Attribute $att has $count possible value options. The values are all numbers? $att_is_number\n";
   }  

DESCRIPTION

These methods help analize a table in a database about metadata. They are meant to help create an interface to search the results.

Imagine you are storing metadata about people, you have things like first_name, last_name, etc.

This module will help create the interface, by analizing the table data.

For example, if you add 'age' attribute to the table, and there are a finite number of unique values, this code suggests whether to add a drop down select box (in a web interface, for example) or a search text field.

The output generated by this code can be used to generate gui interfaces with html, perltk, etc.

METHODS

new()

   my $dbh = your_database_connection_handle();

   my $a = Metadata::DB::Analizer->new({ DBH => $dbh });

GENERAL INSPECTION METHODS

get_records_count()

returns count of metadata records (each id is one record, athough the metadata table may contain multiple row entries).

FUNCTIONS FOR ALL ATTRIBUTES

To inspect the metadata table's contents.

get_attributes()

returns 'all' of the attributes in the metadata table as array ref.

If you store 'age', 'phone', 'name' in your table, this returns those labels. This is the basis of the idea here, that if you add another attribute, the search interface will automatically offer this as a search option.

This is called internally by search_attributes_selected() if you dont select your own out of the list. this only means what attributes to OFFER the user to search by

get_attributes_by_ratio()

returns array ref of attributes, sorted by occurrences of that value. In the above example, if there are 100 'name' entries and 8 'phone' entries, the name is closer to the front of the list.

get_attributes_ratios()

returns hash ref. keys are the attribute names (vals in mkey)- values are the percentage of occurrence, as compared to all the entries.

get_attributes_counts()

returns hash ref. Each key is an attribute, the values are the number of occurrences.

WHAT SEARCH ATTRIBUTES TO OFFER IN THE SEARCH FORM

When we generate automatic interfaces for searching the metadata.

search_attributes_selected()

returns list of attributes that will be used in the html interface if you do not pass a list, all attribtues are chosen you do not need to specify what kind of selection this is, drop down or text, the data within the databse will figure it out

this is used by generate_search_interface_loop(), in turn used by html_search_form_output().

So, if you wanted to change what shows up..

   my $i = Metadata::DB::Search::InterfaceHTML({ DBH => $dbh });

   $i->search_attributes_selected_clear;
   $i->search_attributes_selected_add('age','height','name','office');

This means if there are fewer then x 'age' possible values, a dropdown box is generated, etc. This is also the order.

If you want to grep out all attributes that match 'path'

   my @attribute_names = sort grep { !/path/ } @{ $self->get_search_attributes }; 
   

now you need to set them as the atts the user can search by..

   $self->search_attributes_selected_clear;
   $self->search_attributes_selected_add( @attribute_names );

if you want to set default limits to all atts matching 'client' to be 1000 instead of the default limit for all atts

   for my $att_name ( grep { /client/ } @attribute_names ){
   
      $self->attribute_option_list_limit( $att_name => 1000 );
   
   }

search_attributes_selected_clear()

take out all search attributes to 0

search_attributes_selected_count()

search_attributes_selected_add()

arg is attribute name

search_attributes_selected_delete()

arg is attribute name. will take out of list, when generating, will not show up.

METHODS FOR ONE ATTRIBUTE

Once you know your attribute label/name.

attribute_option_list()

argument is name of attribute, optional arg is a limit number the default limit is 100. returns a list suitable for a select box, or returns undef if the unique values found for this attribute exceed the limit.

For example, if you have an attribute called hair_color, you can have blonde, brunette, redhead etc. You would want to offer this as a select box. Thus, if you have blondes and brunnettes as values for the attribute 'hair_color' in the metadata table..

   my $options = generate_attribute_option_list($dbh,'hair_color');

   # $options = [qw( blonde redhead brunette )];

Note that if your metadata table does not have any entries such as

   id mkey        mval
   1  hair_color  auburn

Then the hair color auburn will not appear in the array ref returned. Furthermore if there are more then 15 variations of hair color, undef is returned. If you want to allow for more variations...

For example, if you want to list every single 'first_name' attribute as an option, regardless of how many there are..

   my $options = generate_attribute_option_list($dbh,'first_name',1000000);

Remember that the return values depend on what the database table holds!

attribute_option_list_limit()

returns defatult limit set. by default this is 100 if an attribute to be selected from has more then this count, it is offered as a field, if it has less, it is a drop down box. this can be overridden on a per attribtue basis also, this is just the main default

you can also set limits on a per attribute basis, to do so..

   my $limit = 
      $self->attribute_option_list_limt( $attribute_name => $number );

   my $limit_retrieved = 
      $self->attribute_option_list_limit( $attribute_name );

   my $default_limit_for_all_not_specifically_set =
      $self->attribute_option_list_limit;
      

this is a perl set get method, always returns the value

attribute_all_unique_values()

argument is dbh and atrribtue name. optional arg is limit (default is 15) returns array ref.

if you provide the limit, and it is reached (more then 'limit' unique value occurrences) then it returns undef.

attribute_type_is_number()

argument is the attribute name analizes the possible values and determines if they are all numbers returns boolean

this is useful if you want to offer 'less than' option in a select box, for example

attribute_type_is_boolean()

TODO: NOT IMPLEMENTED

CAVEATS

These are meant to be used offline, they can use up cpu like mad. Consider caching the values with Cache::File

DATABASE LAYOUT

The default metadata table columns are

   id | mkey | mval

The default table name is 'metadata'.

If for some reason you have a metadata table with somewhat of a different layout, you can change the names of the columns and table via:

   $object_instance->table_metadata_name('metadata_cars');
   $object_instance->table_metadata_column_name_key('attribute');

For more information of the database layout, please see Metadata::DB

SEE ALSO

Metadata::DB::Search Metadata::DB::Indexer Metadata::DB::Search::InterfaceHTML Metadata::DB::WUI Metadata::DB Metadata::Base

AUTHOR

Leo Charre leocharre at cpan dot org