NAME

LRpt::CollDiff - A module for comparing 2 collections of rows

DESCRIPTION

This class is a part of the LRpt library. An object of this class compares 2 collections (LRpt::Collection objects). Differences are stored inside the object and can be queried with the object's methods. They can also be printed to standard output.

This manual page focuses on implementation details. The intended audience is developers who want to modify LReport code or to use some of its modules in their own code. If you are new to LReport, please have a look at the LRpt manual page for an introduction.

GLOSSARY

The following terms are used in the code:

before collection
after collection

"before" refers to one of the compared collections; the second collection is called "after". These names are used instead of col1 and col2 because the code was written to compare a set of rows before a modification with the rows after it. Although the code will not break if you pass the collections in the opposite order, it is much easier to work with if you follow this convention.

missing/additional/not equal

Types of possible differences. If a row is "missing", a "before" row exists and there is no "after" row with the same key. "additional" means that an "after" row exists and the respective "before" row does not. "not equal" means that both the after and before rows exist but there are some differences between them.
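The three difference types can be illustrated with a minimal sketch. It assumes each collection is reduced to a plain hash that maps a key value to a row hash. The classify_rows helper is hypothetical and is not part of LRpt, which works on LRpt::Collection objects.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LRpt: classifies rows of two plain
# hashes (key value => row hash reference) into the three difference types.
sub classify_rows {
    my ( $before, $after ) = @_;
    my %type_of;
    for my $key ( keys %$before ) {
        if ( !exists $after->{ $key } ) {
            $type_of{ $key } = 'missing';       # only a "before" row exists
            next;
        }
        my ( $b_row, $a_row ) = ( $before->{ $key }, $after->{ $key } );
        # Assumption: both rows have the same columns and defined values.
        if ( grep { $b_row->{ $_ } ne $a_row->{ $_ } } keys %$b_row ) {
            $type_of{ $key } = 'not_equal';     # both rows exist but differ
        }
    }
    for my $key ( keys %$after ) {
        $type_of{ $key } = 'additional'         # only an "after" row exists
            unless exists $before->{ $key };
    }
    return \%type_of;
}

my $types = classify_rows(
    { 1 => { name => 'Ann' }, 2 => { name => 'Bob' }, 3 => { name => 'Cid' } },
    { 2 => { name => 'Rob' }, 3 => { name => 'Cid' }, 4 => { name => 'Dee' } },
);
printf "%s: %s\n", $_, $types->{ $_ } for sort keys %$types;
```

In this example key 1 is classified as missing, key 2 as not_equal and key 4 as additional. Key 3 is identical in both hashes, so nothing is recorded for it.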

METHODS

In this section you will find a more or less complete listing of all methods provided by the package. Note that the package itself is not public, so neither the package nor any of its methods is guaranteed to be maintained in the future.

You may find it helpful to have a look at the INTERNAL DATA STRUCTURE section, which may help you understand the meaning of some methods.

new

  my $cdiff = LRpt::CollDiff->new( 'before' => $b_coll,
                                   'after'  => $a_coll );

Constructor. Initializes fields used in later processing. The before and after parameters should be references to collections to be compared (LRpt::Collection objects).

compare_collections

  $cdiff->compare_collections();

The main function. Compares the collections given to the constructor. All differences found are stored internally and may be retrieved by calling other methods described below, such as get_missing_keys, get_additional_keys and get_not_equal_keys.

compare_columns

  $cdiff->compare_columns( $key_value );

Checks if rows from 'before' and 'after' collections, having the same key value, have the same columns. If this is not the case, a difference of type 'diff_columns' is added.

compare_rows

  $cdiff->compare_rows( $key_name );

Compares 'before' and 'after' rows having the same key value and reports differences between them. Finds and stores differences of 'diff_not_equal' type.
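A column-by-column comparison of this kind might look like the following sketch. It is not the LRpt implementation: diff_rows is a hypothetical helper, rows are assumed to be plain hash references, and the handling of undefined values is an assumption of the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the LRpt implementation: compares two rows
# given as hash references and returns the differing columns.
sub diff_rows {
    my ( $before_row, $after_row ) = @_;
    my %diffs;
    for my $col ( sort keys %$before_row ) {
        my ( $b_val, $a_val ) = ( $before_row->{ $col }, $after_row->{ $col } );
        # Assumption of this sketch: two undefined values are equal, and an
        # undefined value differs from any defined one.
        next if !defined $b_val && !defined $a_val;
        next if defined $b_val && defined $a_val && $b_val eq $a_val;
        $diffs{ $col } = { before => $b_val, after => $a_val };
    }
    return \%diffs;
}

my $diffs = diff_rows(
    { id => 7, city => 'Krakow', zip => undef },
    { id => 7, city => 'Gdansk', zip => undef },
);
for my $col ( sort keys %$diffs ) {
    printf "%s: %s -> %s\n", $col, $diffs->{ $col }{ before }, $diffs->{ $col }{ after };
}
```

The returned hash has the same shape as the diffs element described in INTERNAL DATA STRUCTURE: one entry per differing column, each holding the before and after value.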

add_diff_missing

  $cdiff->add_diff_missing( $key_value );

Stores information that a row with the key $key_value is 'missing'.

report_diff_missing

  $cdiff->report_diff_missing( $key_value );

Stores information that a row with the key $key_value is 'missing'.

print_all

  $cdiff->print_all();

Returns 1 if the full report should be printed (differences and all rows).

add_diff_additional

  $cdiff->add_diff_additional( $key_value );

Stores information that a row with the key $key_value is 'additional'.

report_diff_additional

  $cdiff->report_diff_additional( $key_value );

Stores information that a row with the key $key_value is 'additional'.

add_diff_not_equal

  $cdiff->add_diff_not_equal( $key_value,
                              'column' => $column_name,
                              'before' => $before_value,
                              'after'  => $after_value );

Stores information that a row with the key $key_value is 'not_equal'.

report_diff_not_equal

  $cdiff->report_diff_not_equal( $key_value,
                              'column' => $column_name,
                              'before' => $before_value,
                              'after'  => $after_value );

Stores information that a row with the key $key_value is 'not_equal'.

add_diff_columns

  $cdiff->add_diff_columns( $key_value,
                              'column' => $column_name,
                              'before' => $before_value,
                              'after'  => $after_value );

Stores information that compared rows do not have the same sets of columns.

set_before_col_val

  $cdiff->set_before_col_val( $key_value,
                              $column_name,
                              $value );

Stores information about value in 'before' row, which is different from value in 'after' row.

set_after_col_val

  $cdiff->set_after_col_val( $key_value,
                             $column_name,
                             $value );

Stores information about value in 'after' row, which is different from value in 'before' row.

get_before_row

  $cdiff->get_before_row( $key_value );

Gets before row with the given key value.

get_after_row

  $cdiff->get_after_row( $key_value );

Gets after row with the given key value.

get_diff_type

  $cdiff->get_diff_type( $key_value );

Gets type of the difference between 'before' and 'after' rows of a given key value.

get_diff_columns

  $cdiff->get_diff_columns( $key_value );

Gets all columns whose values differ between the 'after' and 'before' rows of a given key value. Columns are ordered by their order in the database.
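One way to obtain such an ordering is to filter a list of column names that is already in database order. The sketch below assumes that such a list is available; ordered_diff_columns is a hypothetical helper and the way LRpt actually tracks column order is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of differing columns in the order
# given by @$column_order (assumed to be the order of columns in the database).
sub ordered_diff_columns {
    my ( $column_order, $diffs ) = @_;
    return grep { exists $diffs->{ $_ } } @$column_order;
}

my @cols = ordered_diff_columns(
    [ qw( id name city zip ) ],
    {
        zip  => { before => '30-001', after => '80-001' },
        name => { before => 'Ann',    after => 'Anna' },
    },
);
print "@cols\n";
```

Because the filter iterates over the ordered list rather than over the hash keys, the result does not depend on Perl's hash ordering.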

get_key_columns

  $cdiff->get_key_columns();

Gets the names of the columns that make up the key.

has_any_diff

  my $result = $cdiff->has_any_diff();

Returns 1 if there are any differences between compared collections (LRpt::Collection). 0 otherwise.

has_diff

  my $result = $cdiff->has_diff( $diff_type, $key_value, $col_name );

Returns 1 if there is a difference of the given type on the given column for rows with the given key. 0 otherwise.
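A lookup of this kind can be sketched on a plain hash shaped like the differences member described in INTERNAL DATA STRUCTURE. The has_diff_sketch function is a hypothetical stand-in, not the module's code. It checks each level before descending so that the lookup does not autovivify entries for keys that have no recorded difference.

```perl
use strict;
use warnings;

# Hypothetical stand-in for has_diff, operating on a plain hash shaped
# like the {differences} member.
sub has_diff_sketch {
    my ( $differences, $diff_type, $key_value, $col_name ) = @_;
    # Stop early so that no entry is created for an unknown key.
    return 0 unless exists $differences->{ $key_value };
    my $entry = $differences->{ $key_value };
    return 0 unless $entry->{ diff_type } eq $diff_type;
    # 'missing' and 'additional' entries carry no per-column diffs.
    my $col_diffs = $entry->{ diffs } or return 0;
    return exists $col_diffs->{ $col_name } ? 1 : 0;
}

my %differences = (
    10 => {
        diff_type => 'not_equal',
        diffs     => { city => { before => 'Krakow', after => 'Gdansk' } },
    },
    11 => { diff_type => 'missing', row => { id => 11 } },
);

print has_diff_sketch( \%differences, 'not_equal', 10, 'city' ), "\n";
print has_diff_sketch( \%differences, 'not_equal', 99, 'city' ), "\n";
```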

get_not_equal_keys

  my @key_values = $cdiff->get_not_equal_keys();

Gets the keys of all rows that have a difference of type 'not_equal'.

get_missing_keys

  my @key_values = $cdiff->get_missing_keys();

Gets the keys of all rows that have a difference of type 'missing'.

get_additional_keys

  my @key_values = $cdiff->get_additional_keys();

Gets the keys of all rows that have a difference of type 'additional'.
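The three key getters above can be thought of as one filter over the stored differences. The sketch below assumes a plain hash keyed by row key with a diff_type element in each entry. The keys_of_type helper is hypothetical, and sorting the result is a choice made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the keys of all entries with the given
# diff_type, sorted as strings to give a stable order.
sub keys_of_type {
    my ( $differences, $diff_type ) = @_;
    my @keys = sort grep { $differences->{ $_ }{ diff_type } eq $diff_type }
        keys %$differences;
    return @keys;
}

my %differences = (
    1 => { diff_type => 'missing' },
    2 => { diff_type => 'not_equal' },
    4 => { diff_type => 'additional' },
    5 => { diff_type => 'missing' },
);

my @missing = keys_of_type( \%differences, 'missing' );
print "missing: @missing\n";
```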

is_missing

  my $result = $cdiff->is_missing( $key_value );

Returns 1 if a row of a given key is 'missing'.

is_additional

  my $result = $cdiff->is_additional( $key_value );

Returns 1 if a row of a given key is 'additional'.

get_missing_row

  my $row = $cdiff->get_missing_row( $key_value );

Returns a 'missing' row for a given key value. If there is no such 'missing' row, the method dies.

get_additional_row

  my $row = $cdiff->get_additional_row( $key_value );

Returns an 'additional' row for a given key value. If there is no such 'additional' row, the method dies.

get_additional_rows

  my @rows = $cdiff->get_additional_rows();

Returns all 'additional' rows.

remove_diff_additional

  $cdiff->remove_diff_additional( $row );

Removes from the internal structures the information that the row referred to by $row is an 'additional' row.

has_missing_diffs

  my $result = $cdiff->has_missing_diffs();

Returns 1 if there is any difference of type 'missing'. 0 otherwise.

has_additional_diffs

  my $result = $cdiff->has_additional_diffs();

Returns 1 if there is any difference of type 'additional'. 0 otherwise.

has_not_equal_diffs

  my $result = $cdiff->has_not_equal_diffs();

Returns 1 if there is any difference of type 'not_equal'. 0 otherwise.

get_before_fname

  my $result = $cdiff->get_before_fname();

Returns the name of the file that was the source of data for the before collection.

get_after_fname

  my $result = $cdiff->get_after_fname();

Returns the name of the file that was the source of data for the after collection.

INTERNAL DATA STRUCTURE

Below you can find something which is my pathetic attempt to create a readable diagram. It is supposed to show the hierarchy of the internal data structure. I am open to suggestions on how to make it more readable.

  $self->
      |
      |->{ before } = LRpt::Collection object
      |->{ after }  = LRpt::Collection object
      |->{ report_header } = $header
      |->{ print_what }   = $type_of_output
      |->{ differences }
      |   |->{ $key_value1 }
      |   |   |->{ diff_type } = "missing"
      |   |   |->{ row } = $row1
      |   |......................
      |   |->{ $key_valueX }
      |   |   |->{ diff_type } = "additional"
      |   |   |->{ row } = $rowX
      |   |......................
      |   |->{ $key_valueY }
      |   |   |->{ diff_type } = "not_equal"
      |   |   |->{ diffs }
      |   |   |   |->{ $col1 }
      |   |   |   |   |->{ before } = $value_before1
      |   |   |   |   |->{ after } = $value_after1
      |   |   |   |.................................
      |   |   |   |->{ $colK }
      |   |   |   |   |->{ before } = $value_beforeK
      |   |   |   |   |->{ after } = $value_afterK
      |   |   |->{ before_row } = $before_row
      |   |   |->{ after_row } = $after_row

Meaning of data members:

before

Reference to the before LRpt::Collection object.

after

Reference to the after LRpt::Collection object.

key

Name of the row key used in both LRpt::Collection objects for row retrieval. The key is expected to be defined in both collections, with the same definition in both. If it is not, then, to be honest, I don't know what is going to happen.

report_header

Text to be printed on standard output when comparison of collections starts.

print_what

Mode of printing.

differences

Reference to a data structure containing information about all differences found. If the after and before rows are identical, no information is stored here. If both exist but some column values differ, or if one of them does not exist, a new element is created.

If both rows exist but they are different, then a hash key is added. The key is the row key of the compared rows. The value stored under that key is a reference to a new hash. The hash contains four elements:

* diff_type - Type of the difference. In this case it is not_equal.

* diffs - A reference to a new hash. Keys of this new hash are names of columns, which are different in before and after rows. Values are references to hashes keeping before and after value of a column.

* before_row - Reference to a before row.

* after_row - Reference to an after row.

If the after row does not exist, then the key value is the row key of the before row. The diff_type is missing. The row element points to the before row.

If the before row does not exist, then the key value is the row key of the after row. The diff_type is additional. The row element points to the after row.
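The layout described above can be written out as a Perl literal and walked with ordinary hash operations. This is a sketch under assumptions: rows are shown as plain hash references, which may differ from the row representation used by LRpt::Collection, and summarize is a hypothetical helper that is not part of the module.

```perl
use strict;
use warnings;

# Assumption: rows are plain hash references in this sketch.
my %before_row = ( id => 2, name => 'Bob', city => 'Krakow' );
my %after_row  = ( id => 2, name => 'Bob', city => 'Gdansk' );

# One entry per row key, shaped like the {differences} member.
my %differences = (
    1 => { diff_type => 'missing',    row => { id => 1, name => 'Ann' } },
    4 => { diff_type => 'additional', row => { id => 4, name => 'Dee' } },
    2 => {
        diff_type  => 'not_equal',
        diffs      => { city => { before => 'Krakow', after => 'Gdansk' } },
        before_row => \%before_row,
        after_row  => \%after_row,
    },
);

# Hypothetical helper: builds one summary line per difference.
sub summarize {
    my ( $differences ) = @_;
    my @lines;
    for my $key ( sort keys %$differences ) {
        my $entry = $differences->{ $key };
        if ( $entry->{ diff_type } eq 'not_equal' ) {
            for my $col ( sort keys %{ $entry->{ diffs } } ) {
                my $d = $entry->{ diffs }{ $col };
                push @lines, sprintf "%s: not_equal %s '%s' -> '%s'",
                    $key, $col, $d->{ before }, $d->{ after };
            }
        }
        else {
            # 'missing' and 'additional' entries only carry a row.
            push @lines, sprintf "%s: %s", $key, $entry->{ diff_type };
        }
    }
    return @lines;
}

print "$_\n" for summarize( \%differences );
```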

SEE ALSO

The project is maintained on SourceForge at http://lreport.sourceforge.net. There you can find links to helpful documentation, such as a tutorial.

AUTHORS

Piotr Kaluski <pkaluski@piotrkaluski.com>

COPYRIGHT

Copyright (c) 2004-2006 Piotr Kaluski. Poland. All rights reserved.

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.