NAME

LRpt::XMLReport - A module for converting outputs generated by LReport tools to XML format.

SYNOPSIS

    lrptxml.pl --selects=selects.txt --keys_file=keys.txt data 
  
    lrptxml.pl --diffs --selects=selects.txt --keys_file=rkeys.txt diffs.txt 

DESCRIPTION

LRpt::XMLReport is a part of the LRpt (LReport) library. It is used to convert outputs generated by other tools from the LReport suite to XML format. You should not call this class directly from your code. Instead, use the lrptxml.pl tool, which is a simple wrapper around this module. It looks like this:

  use strict;
  use LRpt::XMLReport;
  
  create_report( @ARGV );
  

COMMAND LINE SWITCHES

--selects=file

Optional. Name of a selects file (see doc). Each select statement from this file is included in the report for the select of the corresponding name. If no selects file is given, the report will not contain select statements.

--diffs

Switches the working mode to reading the output of lcsvdiff.pl. If not given, the working mode is reading csv files from a directory.

--keys_file=keys_file

File name containing row key definitions.

HOW THIS MANUAL IS ORGANIZED

On this page you will find details on the LRpt::XMLReport package. You will also find details on how to use LReport tools for reporting.

First you will find a list of all methods provided by the class.

Then I will show you a step-by-step example, in which I will explain what can be done and how it can be done.

Then I will give some additional details, which I did not include in the example to avoid making it too complex.

At the end I will describe existing XSLT stylesheets for converting XML reports to other formats. Currently only conversion to RTF is available.

METHODS

In this section you will find a more or less complete listing of all methods provided by the package.

create_report

  create_report( @ARGV );

Main public function. The only one exported. See "COMMAND LINE SWITCHES" for meaning of parameters.

load_diffs

  load_diffs();

Loads input data from files generated by LRpt::CSVDiff module.

load_one_select

  load_one_select();

Loads one select from the input diff file.

load_ROW_entry

  load_ROW_entry( $line, $file );

Loads one ROW type entry from the diff file.

load_DEL_entry

  load_DEL_entry( $line, $file );

Loads one DEL type entry from the diff file.

load_INS_entry

  load_INS_entry( $line, $file );

Loads one INS type entry from the diff file.

load_csvs

  load_csvs();

Loads data from csv files.

load_csv_file

  load_csv_file();

Loads data from one csv file.

include_in_report

  include_in_report( $name, $row, $type );

Adds a row to a list, which will be used for report generation.

generate_report

  generate_report();

Generate XML report.

process_data

  process_data( $select );

Creates report for one select.

process_row

  process_row( $select, $item );

Creates entry for one row.

print_report

  print_report();

Prints report to the standard output.

print_select

  print_select();

Formats one select for XML.

print_header

  print_header();

Formats one header for XML.

print_data

  print_data();

Formats one data section for XML.

print_row

  print_row();

Formats one row for XML.

print_value

  print_value();

Formats one value for XML.

print_different_row

  print_different_row();

Formats one different row for XML.

print_statement

  print_statement();

Adds details of select statement to the report.

print_usage

  print_usage();

Prints usage text.

REPORTING AND FORMATTING IN LREPORT

This chapter is meant to help you understand how LReport helps you generate nicely formatted reports of database contents and differences.

We will start with an example.

EXAMPLE 1

We will continue with the example from the manual page for LRpt. Our goal is to generate an RTF document which looks like the one given at http:/xxxx.

As you remember, we finished with the output of lcsvdiff.pl.

We will create an XML document and then process it with an XSLT processor to get the final RTF document.

One of the possible inputs for lrptxml.pl is the output generated by lcsvdiff.pl. However, this is not exactly the format generated in the example shown in the LRpt manual page. The format used there contained only information about changes. In order to generate a full report we have to use the --all switch when calling lcsvdiff.pl. With this switch, lcsvdiff.pl generates the following output:

  lcsvdiff.pl before/customer.txt after/customer.txt 
  SCHEMA: customer_id name  last_name  address
  ROW( 1234 ): 1234  Jan Nowak   Warszawa
  lcsvdiff.pl before/service.txt after/service.txt 
  SCHEMA: customer_id service_type price status
  INS( 1234#GPRS ): 1234  GPRS 2.05  ACTIVE
  DEL( 1234#MAIL ): 1234  MAIL 1.30  ACTIVE
  ROW( 1234#VOICE ): 1234 VOICE   0.34  DEACTIVATED
  UPD( 1234#VOICE ): status: ACTIVE =#=> DEACTIVATED

As you can see, differences are reported together with information about all rows returned by selects.
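The line format shown above is regular enough to be split with a few regular expressions. The sketch below is only an illustration based on that sample output. It is not the code behind load_ROW_entry, load_INS_entry or load_DEL_entry, and the name parse_diff_line is made up for this example. It also assumes that fields are separated by whitespace, so values containing spaces would need a different approach.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of "lcsvdiff.pl --all" output into a hash.
# Assumes whitespace-separated fields, as in the sample output above.
sub parse_diff_line {
    my ($line) = @_;
    chomp $line;

    # Start of a new select: "lcsvdiff.pl before/x.txt after/x.txt"
    if ( $line =~ /^lcsvdiff\.pl\s+(\S+)\s+(\S+)\s*$/ ) {
        return { type => 'SELECT', before => $1, after => $2 };
    }

    # Column names: "SCHEMA: col1 col2 ..."
    if ( $line =~ /^SCHEMA:\s*(.*)$/ ) {
        return { type => 'SCHEMA', columns => [ split ' ', $1 ] };
    }

    # Changed column: "UPD( key ): column: old =#=> new"
    if ( $line =~ /^UPD\(\s*(.+?)\s*\):\s*(\w+):\s*(.*?)\s*=#=>\s*(.*?)\s*$/ ) {
        return { type => 'UPD', key => $1, column => $2, old => $3, new => $4 };
    }

    # Whole rows: "ROW( key ): v1 v2 ...", likewise INS and DEL
    if ( $line =~ /^(ROW|INS|DEL)\(\s*(.+?)\s*\):\s*(.*)$/ ) {
        return { type => $1, key => $2, values => [ split ' ', $3 ] };
    }

    return;    # unrecognized line
}

my $entry = parse_diff_line('UPD( 1234#VOICE ): status: ACTIVE =#=> DEACTIVATED');
print "$entry->{key} $entry->{column}: $entry->{old} -> $entry->{new}\n";
```

Each returned hash carries a type field, so a caller can dispatch on it in the same way the module dispatches to its ROW, INS and DEL loaders.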

You run lrptxml.pl in the following way:

  lrptxml.pl --diffs --selects=selects.txt --keys_file=rkeys.txt diffs.txt

Meaning of parameters:

--diffs

A switch indicating that an XML report should be generated from the output of the lcsvdiff.pl tool.

--selects

A file containing selects. This is the same file as used by lcsvdmp.pl. It is used here to include in the XML report the full text of the select which returned the presented rows.

--keys_file

File with a definition of row keys. This is exactly the same type of file as used by the lcsvdiff.pl tool. It is used here to order rows in the report properly.

diffs.txt

Output generated by lcsvdiff.pl run with the --all switch.

The output of lrptxml.pl should look like this:

  <report>
      <customer>
          <statement><![CDATA[select * from customer where customer_id = 1234]]>
          </statement>
          <header>
              <customer_id/>
              <name/>
              <last_name/>
              <address/>
          </header>
          <data>
              <equal>
                  <customer_id>1234</customer_id>
                  <name>Jan</name>
                  <last_name>Nowak</last_name>
                  <address>Warszawa</address>
              </equal>
          </data>
      </customer>
      <service>
          <statement><![CDATA[select * from service where customer_id = 1234]]>
          </statement>
          <header>
              <customer_id/>
              <service_type/>
              <price/>
              <status/>
          </header>
          <data>
              <additional>
                  <customer_id>1234</customer_id>
                  <service_type>GPRS</service_type>
                  <price>2.05</price>
                  <status>ACTIVE</status>
              </additional>
              <missing>
                  <customer_id>1234</customer_id>
                  <service_type>MAIL</service_type>
                  <price>1.30</price>
                  <status>ACTIVE</status>
              </missing>
              <different>
                  <customer_id>1234</customer_id>
                  <service_type>VOICE</service_type>
                  <price>0.34</price>
                  <status>
                      <old_value>ACTIVE</old_value>
                      <new_value>DEACTIVATED</new_value>
                  </status>
              </different>
          </data>
      </service>
  </report>

Let's have a closer look at it.

The report element is the root element. Each of its children is a select. The example above has 2 selects: customer and service. Each select element has 3 children:

statement

Contains the full text of the select which returned the given rows.

header

Contains the names of all columns returned by the select.

data

Contains all rows (with all columns) returned by the select.

Direct children of the data element are:

equal

No differences found.

additional

The row was not present in the before collection but is present in the after collection.

missing

The row was not present in the after collection but is present in the before collection.

different

The row is present in both the after and the before collection. Some differences were found in column values.
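Comparing the diff file with the generated report suggests how entry types relate to the children of data: a ROW entry alone becomes equal, INS becomes additional, DEL becomes missing, and a ROW entry accompanied by an UPD entry for the same key becomes different. The sketch below expresses that rule in Perl. The mapping is inferred from the example only, not taken from the module source, and classify_rows is a hypothetical name. It needs Perl 5.10 or later for the //= operator.

```perl
use strict;
use warnings;

# Mapping inferred from the example report; not taken from the module source.
my %element_for = ( ROW => 'equal', INS => 'additional', DEL => 'missing' );

# Hypothetical helper: given the diff entries of one select (hashes with
# 'type' and 'key'), return a hash mapping each row key to its element name.
sub classify_rows {
    my (@entries) = @_;
    my %element;
    for my $e (@entries) {
        if ( $e->{type} eq 'UPD' ) {
            # An UPD entry marks the row with this key as changed.
            $element{ $e->{key} } = 'different';
        }
        elsif ( exists $element_for{ $e->{type} } ) {
            # Keep 'different' if an UPD entry for this key was seen first.
            $element{ $e->{key} } //= $element_for{ $e->{type} };
        }
    }
    return %element;
}

my %rows = classify_rows(
    { type => 'INS', key => '1234#GPRS' },
    { type => 'DEL', key => '1234#MAIL' },
    { type => 'ROW', key => '1234#VOICE' },
    { type => 'UPD', key => '1234#VOICE' },
);
print "$_ => $rows{$_}\n" for sort keys %rows;
```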

You can read about the XML report format in more detail later in "XML REPORT FORMAT".

Now you can get the XSLT stylesheets for RTF from SourceForge and process the generated report. There is some fuzziness around the XSLT specification, so I can't guarantee that the stylesheets will work with every XSLT processor. They worked for me with Saxon.

After getting the RTF stylesheet and installing Saxon, you can run the conversion. On my Windows machine it looks like this:

  java net.sf.saxon.Transform report.xml rldf_rtf.xsl > report.rtf

WORKING MODES

lrptxml.pl can work in 2 modes: csv and diffs.

In csv mode (the default) lrptxml.pl picks up all files with the expected extension found in the given directory. Since there is no comparison information, all rows are reported as equal.

The second mode is diffs mode, in which lrptxml.pl accepts as input a file produced by the lcsvdiff.pl tool run with the --all switch. In this mode differences are reflected in the output report. To switch to diffs mode, use the --diffs switch.
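As an illustration of how the two modes could be told apart, here is a minimal sketch using the core Getopt::Long module. The switch names come from "COMMAND LINE SWITCHES". The function parse_switches and the shape of the hash it returns are assumptions made for this example and do not describe how LRpt::XMLReport handles its arguments internally.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: work out the working mode from command line arguments.
# Not the module's own code; switch names follow "COMMAND LINE SWITCHES".
sub parse_switches {
    my (@args) = @_;
    my %opt = ( diffs => 0 );
    GetOptionsFromArray( \@args, \%opt, 'diffs', 'selects=s', 'keys_file=s' )
        or die "Invalid command line switches\n";
    return {
        mode    => $opt{diffs} ? 'diffs' : 'csv',    # csv is the default
        selects => $opt{selects},
        keys    => $opt{keys_file},
        input   => $args[0],    # diff file in diffs mode, directory in csv mode
    };
}

my $cfg = parse_switches(
    qw(--diffs --selects=selects.txt --keys_file=rkeys.txt diffs.txt) );
print "mode=$cfg->{mode} input=$cfg->{input}\n";
```

GetOptionsFromArray removes the recognized switches from the array, so whatever remains is the positional argument: the diff file or the directory with csv files.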

XML REPORT FORMAT

The XML report has to be well formed. Its root element is report. Its direct children are elements dedicated to the results of single selects. The name of each of these elements is the name of a select.

The general structure of an XML report is given below:

  <report>
      <select_name1>
          <statement>select * from ..... </statement>
          <header>
              <col1/>
              <col2/>
              <col3/>
              <colN/>
          </header>
          <data>
              <equal>
                 <col1>value11</col1>
                 <col2>value12</col2>
                 <col3>value13</col3>
                 <colN>value14</colN>
              </equal>
              <different>
                 <col1>value21</col1>
                 <col2>value22</col2>
                 <col3>
                     <old_value>old_value23</old_value>
                     <new_value>new_value23</new_value>
                 </col3>
                 <colN>value24</colN>
              </different>
              <missing>
                 <col1>value31</col1>
                 <col2>value32</col2>
                 <col3>value33</col3>
                 <colN>value34</colN>
              </missing>
              <additional>
                 <col1>value41</col1>
                 <col2>value42</col2>
                 <col3>value43</col3>
                 <colN>value44</colN>
              </additional>
          </data>
      </select_name1>
      <select_name2>
      ........
      </select_name2>
      ........
      <select_nameN>
      ........
      </select_nameN>
  </report>

Each select is reported in an element whose name is the select name. It contains the following children: statement, header and data.

statement

Contains the full text of the select which produced the reported rows.

header

Contains elements for all columns in a row. The order of elements follows the order of columns in the result returned by the select.

data

Contains the actual rows. It may have 4 types of child elements:

equal

No differences reported for a given row. Child elements are columns; their values are the values from the columns.

different

The row existed in both the after and the before collection. Values in some columns were different. Child elements are columns; their values are the values from the columns, except for columns where differences were found. For those columns, the column element contains two child elements, old_value and new_value, which contain the values from the before and after rows.

additional

The before row does not exist, the after row does. Child elements are columns; their values are the values from the columns.

missing

The before row exists, the after row doesn't. Child elements are columns; their values are the values from the columns.
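To make the row layout concrete, the sketch below builds one child of the data element from a Perl hash. It only illustrates the format described above and is not the implementation of print_row or print_different_row. The helper names xml_escape and format_row, and the convention of passing [ old, new ] for a differing column, are assumptions made for this example.

```perl
use strict;
use warnings;

# Escape characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: format one row as a child of the data element.
# $kind is equal, different, missing or additional; $columns gives the
# header order; $row maps a column name to a value, or to [ old, new ]
# for a column that differs.
sub format_row {
    my ( $kind, $columns, $row ) = @_;
    my $xml = "<$kind>\n";
    for my $col (@$columns) {
        my $value = $row->{$col};
        if ( ref($value) eq 'ARRAY' ) {
            my ( $old, $new ) = map { xml_escape($_) } @$value;
            $xml .= "    <$col>\n"
                  . "        <old_value>$old</old_value>\n"
                  . "        <new_value>$new</new_value>\n"
                  . "    </$col>\n";
        }
        else {
            $xml .= "    <$col>" . xml_escape($value) . "</$col>\n";
        }
    }
    return $xml . "</$kind>\n";
}

print format_row(
    'different',
    [qw(customer_id service_type price status)],
    {   customer_id  => 1234,
        service_type => 'VOICE',
        price        => '0.34',
        status       => [ 'ACTIVE', 'DEACTIVATED' ],
    },
);
```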

XSLT STYLESHEET

Currently there is only one stylesheet available, for conversion to RTF documents.

RTF converter.

The RTF converter creates an RTF document. An example of such a document can be found at xxx. The section containing the results of one select consists of the select statement text followed by rows. Rows are not presented in tables. Instead each row is one paragraph, with fields separated by tabs. This is good for presenting multi-row selects. Using tables would make columns very narrow and hard to read. The header, containing all columns to be reported, is printed in bold on a gray background. Every second row is shaded, so it's easier to see which values belong to which row.

Additional rows (see before-after comparison) are printed using italics, bold and double underline.

Missing rows (see before-after comparison) are printed using strikethrough.

Different rows (see before-after comparison) are presented in the standard font. Only columns that differ are printed in bold, italics and underline.

We can choose to highlight some columns with some colors. It is helpful when viewing reports with many columns. Highlighting can be used to draw the reader's attention to the most important data.

FILES

The stylesheet is composed of 3 files:

rtf.xsl

Contains the main processing logic. Not to be modified (unless you know what you are doing).

rtf_header.xsl

Some headers and constants related to the RTF format. Not to be modified.

rldf_rtf.xsl

To be edited by you. Contains details of formatting in RTF documents. rldf stands for Report Layout Definition File.

An example of rldf is given below:

It contains some standard stuff related to XSLT. The only part that you should modify is the contents of the rld variable element. You can read details on rld in the next chapter.

RLD

rld is an XML tree used to specify how select results are shown in the RTF document. It lets you specify which columns should appear in the report and with which color each should be highlighted.

Its structure is as follows:

  <rld>
      <select_name1 all="1" >
          <col1 color="red" />
      </select_name1>
      <select_name2>
          <col1 color="blue" />
          <col2 color="yellow" />
          <col3 color="yellow" />
      </select_name2>
  </rld>

An element named after a select defines details for that select. This element has one attribute: all. If the value of this attribute is 1, then all columns from the select are shown in the RTF document.

Each select element has child elements for columns. A column element may have one attribute, color, which defines the color of the column's highlighting. Possible values are xxxx. If the all attribute is not set to 1, then only those columns for which column elements are present in rld appear in the RTF document. If the all attribute is set to 1, then all columns appear in the RTF document. If column child elements are present, their color attributes define the color of column highlighting.
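The selection rule can be summarized in a few lines of Perl. This is only a sketch of the rule as described above. The real logic lives in the XSLT stylesheet, and both the function visible_columns and the in-memory hash form of a select element are hypothetical, introduced only for this example. The color names are the ones used in the rld example.

```perl
use strict;
use warnings;

# Hypothetical in-memory form of one select element from rld: the value of
# the "all" attribute plus a color for each listed column (undef if the
# column element has no color attribute).
# Returns [ column, color ] pairs for the columns to show, in header order.
sub visible_columns {
    my ( $header, $rld_select ) = @_;
    my $colors   = $rld_select->{columns} || {};
    my $show_all = ( $rld_select->{all} // '' ) eq '1';
    return map  { [ $_, $colors->{$_} ] }
           grep { $show_all || exists $colors->{$_} } @$header;
}

my @header = qw(customer_id service_type price status);

# Corresponds to <service all="1"><status color="red" /></service>
my @all = visible_columns( \@header,
    { all => 1, columns => { status => 'red' } } );

# Corresponds to <service><price color="blue" /><status color="yellow" /></service>
my @some = visible_columns( \@header,
    { columns => { price => 'blue', status => 'yellow' } } );

for my $set ( \@all, \@some ) {
    print join( ', ',
        map { defined $_->[1] ? "$_->[0] ($_->[1])" : $_->[0] } @$set ), "\n";
}
```

With all set to 1 every header column is kept and only the listed ones carry a color. Without it, only the listed columns remain.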

TO DO

More stylesheets for other formats should be available. I am not planning to write any in the near future.

The current RTF stylesheet is poorly documented, and possibly poorly structured. I am not an XSLT champion.

SEE ALSO

The project is maintained on SourceForge at http://lreport.sourceforge.net. There you can find links to some helpful documentation, such as a tutorial.

AUTHORS

Piotr Kaluski <pkaluski@piotrkaluski.com>

COPYRIGHT

Copyright (c) 2004-2006 Piotr Kaluski. Poland. All rights reserved.

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.