
NAME

PDF::ReportWriter

DESCRIPTION

PDF::ReportWriter is designed to create high-quality business reports, for archiving or printing.

USAGE

The example below is purely a reference within this documentation, to give you an idea of what goes where. It is not intended as a working example. For a working example, see the demo application package, distributed separately at http://entropy.homelinux.org/axis_not_evil

First we set up the top-level report definition and create a new PDF::ReportWriter object ...

my $report = {

  destination        => "/home/dan/my_fantastic_report.pdf",
  paper              => "A4",
  orientation        => "portrait",
  template           => '/home/dan/my_page_template.pdf',
  font_list          => [ "Times" ],
  default_font       => "Times",
  default_font_size  => "10",
  x_margin           => 10 * mm,
  y_margin           => 10 * mm,
  info               => {
                            Author      => "Daniel Kasak",
                            Keywords    => "Fantastic, Amazing, Superb",
                            Subject     => "Stuff",
                            Title       => "My Fantastic Report"
                        }

};

my $pdf = PDF::ReportWriter->new( $report );

Next we define our page setup, with a page header ( we can also put a 'footer' object in here as well )

my $page = {

  header             => [
                                {
                                        percent        => 60,
                                        font_size      => 15,
                                        align          => "left",
                                        text           => "My Fantastic Report"
                                },
                                {
                                        percent        => 40,
                                        align          => "right",
                                        image          => {
                                                                  path          => "/home/dan/fantastic_stuff.png",
                                                                  scale_to_fit  => TRUE
                                                          }
                                }
                         ]

};

Define our fields - which will make up most of the report

my $fields = [

  {
     name               => "Date",                               # 'Date' will appear in field headers
     percent            => 35,                                   # The percentage of X-space the cell will occupy
     align              => "centre",                             # Content will be centred
     colour             => "blue",                               # Text will be blue
     font_size          => 12,                                   # Override the default_font_size with '12' for this cell
     header_colour      => "white"                               # Field headers will be rendered in white
  },
  {
     name               => "Item",
     percent            => 35,
     align              => "centre",
     header_colour      => "white",
  },
  {
     name               => "Appraisal",
     percent            => 30,
     align              => "centre",
     colour_func        => sub { red_if_fantastic(@_); },        # red_if_fantastic() will be called to calculate colour for this cell
     aggregate_function => "count"                               # Items will be counted, and the results stored against this cell
   }
   

];

I've defined a custom colour_func for the 'Appraisal' field, so here's the sub:

sub red_if_fantastic {

     my $data = shift;
     if ( $data eq "Fantastic" ) {
          return "red";
     } else {
          return "black";
     }

}

Define some groups ( or in this case, a single group )

my $groups = [

   {
      name           => "DateGroup",                             # Not particularly important - apart from the special group "GrandTotals"
      data_column    => 0,                                       # Which column to group on ( 'Date' in this case )
      header => [
         {
            percent           => 100,
            align             => "right",
            colour            => "white",
            background        => {                               # Draw a background for this cell ...
                                    shape     => "ellipse",      # ... a filled ellipse ...
                                    colour    => "blue"          # ... and make it blue
                                 },
            text              => "Entries for ?"                 # ? will be replaced by the current group value ( ie the date )
         }
      ],
      footer => [
         {
            percent           => 70,
            align             => "right",
            text              => "Total entries for ?"
         },
         {
            percent           => 30,
            align             => "centre",
            aggregate_source  => 2                               # Take figure from field 2 ( which has the aggregate_function on it )
         }
      ]
   }

];

We need a data array ...

my $data_array = $dbh->selectall_arrayref( "select Date, Item, Appraisal from Entries order by Date" );

Note that you MUST order the data array, as above, if you want to use grouping. PDF::ReportWriter doesn't do any ordering of data for you.
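If your rows do not come from a database query, you can sort them in Perl before handing them over. The following is a minimal sketch, assuming hypothetical rows in the same shape as the query above ( Date, Item, Appraisal ); the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical rows: [ Date, Item, Appraisal ]
my $data_array = [
    [ '2006-03-02', 'Widget', 'Fantastic' ],
    [ '2006-03-01', 'Gadget', 'Average'   ],
    [ '2006-03-02', 'Gizmo',  'Superb'    ],
    [ '2006-03-01', 'Doodad', 'Fantastic' ],
];

# Sort on the grouping column first ( column 0 ), then on Item ( column 1 )
my @sorted = sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } @$data_array;

# Print the rows in the order the report would receive them
print join( ", ", @$_ ), "\n" for @sorted;
```

A reference to the sorted list ( \@sorted ) can then be used as the data_array.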

Now we put everything together ...

my $data = {

   background              => {                                  # Set up a default background for all cells ...
                                  border      => "grey"          # ... a grey border
                              },
   fields                  => $fields,
   groups                  => $groups,
   page                    => $page,
   data_array              => $data_array,
   headings                => {                                  # This is where we set up field header properties ( not a perfect idea, I know )
                                  background  => {
                                                     shape     => "box",
                                                     colour    => "darkgrey"
                                                 }
                              }
   

};

... and finally pass this into PDF::ReportWriter

$pdf->render_data( $data );

At this point, we can do something like assemble a *completely* new $data object, and then run $pdf->render_data( $data ) again, or else we can just finish things off here:

$pdf->save;

CELL DEFINITIONS

PDF::ReportWriter renders all content the same way - in cells. Each cell is defined by a hash. A report definition is basically a collection of cells, arranged at various levels in the report.

Each 'level' to be rendered is defined by an array of cells. ie an array of cells for the data, an array of cells for the group header, and an array of cells for page footers.

Cell spacing is relative. You define a percentage for each cell, and the actual length of the cell is calculated based on the page dimensions ( in the top-level report definition ).

A cell can have the following attributes

name

    The 'name' is used when rendering data headers, which happens whenever a new group or page is started. It's not used for anything else - data must be arranged in the same order as the cells to 'line up' in the right place.

    You can disable rendering of field headers by setting no_field_headers in your data definition ( ie the hash that you pass to the render_data() method ).

percent

    The width of the cell, as a percentage of the total available width. The actual width will depend on the paper definition ( size and orientation ) and the x_margin in your report_definition.

    In most cases, a collection of cells should add up to 100%. For multi-line 'rows', you can continue defining cells beyond 100% width, and these will spill over onto the next line. See the section on MULTI-LINE ROWS, below.

x

    The x position of the cell, expressed in points, where 1 mm = 72/25.4 points.

y

    The y position of the cell, expressed in points, where 1 mm = 72/25.4 points.
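    The conversion above can be wrapped in a constant, which is one way to make expressions such as 10 * mm ( as used in the USAGE section ) work. This is a sketch only: the definition of mm shown here is an assumption, as is the idea that a cell's width is simply its percentage of the space between the x margins.

```perl
use strict;
use warnings;

use constant mm => 72 / 25.4;    # points per millimetre ( assumed definition )

my $x_margin     = 10 * mm;
my $page_width   = 210 * mm;     # A4 portrait is 210 mm wide
my $usable_width = $page_width - 2 * $x_margin;

# Assumption: a 35% cell takes 35% of the width between the margins
my $cell_width = $usable_width * 35 / 100;

printf "x_margin:     %.2f points\n", $x_margin;
printf "usable width: %.2f points\n", $usable_width;
printf "35%% cell:     %.2f points\n", $cell_width;
```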

font

    The font to use. In most cases, you would set up a report-wide default_font. Only use this setting to override the default.

font_size

    The font size. Nothing special here...

bold

    A boolean flag to indicate whether you want the text rendered in bold or not.

colour

    No surprises here either.

header_colour

    The colour to use for rendering data headers ( ie field names ).

header_align

    The alignment of the data headers ( ie field names ). Possible values are "left", "right" and "centre" ( or now "center", also ).

text

    The text to display in the cell ( ie if the cell is not rendering data, but static text ).

wrap_text

    Turns on wrapping of text that exceeds the width of the cell.

strip_breaks

    Strips line breaks out of text.

image

    A hash with details of the image to render. See below for details. If you try to use an image type that is not supported by your installed version of PDF::API2, your image is skipped, and a warning is printed out.

colour_func

    A user-defined sub that returns a colour. Your colour_func will be passed:

    value

        The current cell value

    row

        an array reference containing the current row

    options

        a hash containing the current rendering options:

         {
           current_row          - the current row of data
           row_type             - the current row type (data, group_header, ...)
           current_value        - the current value of this cell
           cell                 - the cell definition ( get x position and width from this )
           cell_counter         - position of the current cell in the row ( 0 .. n - 1 )
           cell_y_border        - the bottom of the cell
           cell_full_height     - the height of the cell
           page                 - the current page ( a PDF::API2 page )
           page_no              - the current page number
         }

Note that prior to version 1.4, we only passed the value.
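As a sketch of a colour_func that uses all three arguments: the sub below assumes it is called as colour_func( $value, $row, $options ), with $row an array reference and $options a hash reference. The sub name and the sample rows are hypothetical, and the sub is called by hand here purely to show its behaviour.

```perl
use strict;
use warnings;

sub appraisal_colour {
    my ( $value, $row, $options ) = @_;

    # Leave non-data rows ( group headers and so on ) in the default colour.
    # $options may be missing entirely on versions before 1.4.
    return "black" if $options && ( $options->{row_type} // "data" ) ne "data";

    return "red" if defined $value && $value eq "Fantastic";

    # Use another column of the same row: grey out entries with no date
    return "grey" if $row && !length( $row->[0] // "" );

    return "black";
}

my $data_row = { row_type => "data" };

print appraisal_colour( "Fantastic", [ "2006-03-01", "Widget", "Fantastic" ], $data_row ) . "\n";    # red
print appraisal_colour( "Average",   [ "",           "Gadget", "Average"   ], $data_row ) . "\n";    # grey
print appraisal_colour( "Fantastic", [], { row_type => "group_header" } ) . "\n";                    # black
print appraisal_colour( "Fantastic" ) . "\n";                                                        # red ( value-only call )
```

In a field definition this would be attached as colour_func => \&appraisal_colour.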

background_func

    A user-defined sub that returns a colour for the cell background. Your background_func will be passed:

    value

        The current cell value

    row

        an array reference containing the current row

    options

        a hash containing the current rendering options:

         {
           current_row          - the current row of data
           row_type             - the current row type (data, group_header, ...)
           current_value        - the current value of this cell
           cell                 - the cell definition ( get x position and width from this )
           cell_counter         - position of the current cell in the row ( 0 .. n - 1 )
           cell_y_border        - the bottom of the cell
           cell_full_height     - the height of the cell
           page                 - the current page ( a PDF::API2 page )
           page_no              - the current page number
         }

custom_render_func

    A user-defined sub to replace the built-in text / image rendering functions. The sub will receive a hash of options:

     {
       current_row          - the current row of data
       row_type             - the current row type (data, group_header, ...)
       current_value        - the current value of this cell
       cell                 - the cell definition ( get x position and width from this )
       cell_counter         - position of the current cell in the row ( 0 .. n - 1 )
       cell_y_border        - the bottom of the cell
       cell_full_height     - the height of the cell
       page                 - the current page ( a PDF::API2 page )
     }

align

    Possible values are "left", "right", "centre" ( or now "center", also ), and "justified"

aggregate_function

    Possible values are "sum" and "count". Setting this attribute will make PDF::ReportWriter carry out the selected function and store the results ( attached to the cell ) for later use in group footers.

type ( LEGACY )

    Please see the 'format' key, below, for improved numeric / currency formatting.

    This key turns on formatting of data. The possible values currently are 'currency', 'currency:no_fill' and 'thousands_separated'.

    There is also another special value that allows custom formatting of text cells: custom:{classname}. If you define the cell type as, for example, custom:my::formatter::class, the cell text that will be output is the return value of the following (pseudo) code:

            my $formatter_object = my::formatter::class->new();
            $formatter_object->format({
                    cell    => { ... },                 # Cell object "properties"
                    options => { ... },                 # Cell options
                    string  => 'Original cell text',    # Cell actual content to be formatted
            });

    An example of formatter class is the following:

            package formatter::greeter;
            use strict;
    
            sub new {
                    bless \my $self
            }
            sub format {
                    my $self = $_[0];
                    my $args = $_[1];
    
                    return 'Hello, ' . $args->{string};
            }

    This class will greet whatever text appears in its cell. Useful, eh?! :-)

format

    This key is a hash that controls numeric and currency formatting. Possible keys are:

     {
       currency             - a BOOLEAN that causes all values to have a dollar sign prepended to them
       decimal_places       - an INT that indicates how many decimal places to round values to
       decimal_fill         - a BOOLEAN that causes all decimal values to be filled to decimal_places places
       separate_thousands   - a BOOLEAN that turns on thousands separating ( ie with commas )
       null_if_zero         - a BOOLEAN that causes zero amounts to render nothing ( NULL )
     }
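    A field definition using these keys might look like the sketch below. The field name "Amount" and the chosen values are hypothetical; the keys themselves are the ones listed above.

```perl
use strict;
use warnings;

# Hypothetical field: right-aligned currency with 2 decimal places
my $amount_field = {
    name    => "Amount",
    percent => 30,
    align   => "right",
    format  => {
        currency           => 1,    # prepend a dollar sign
        decimal_places     => 2,    # round to 2 decimal places
        decimal_fill       => 1,    # pad out to 2 decimal places
        separate_thousands => 1,    # comma-separate thousands
        null_if_zero       => 1,    # render nothing for zero amounts
    },
};
```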

background

    A hash containing details on how to render the background of the cell. See below.

IMAGES

You can define images in any cell ( data, or group header / footer ). The default behaviour is to render the image at its original size. If the image won't fit horizontally, it is scaled down until it will. Images can be aligned in the same way as other fields, with the 'align' key.

The image hash has the following keys:

path

    The full path to the image to render ( currently only supports png and jpg ). You should either set the path, or set the 'dynamic' flag, below.

dynamic

    A boolean flag to indicate that the full path to the image to use will be in the data array. You should either set a hard-coded image path ( above ), or set this flag on.

scale_to_fit

    A boolean value, indicating whether the image should be scaled to fit the current cell or not. Whether this is set or not, scaling will still occur if the image is too wide for the cell.

height

    You can hard-code a height value if you like. The image will be scaled to the given height value, to the extent that it still fits length-wise in the cell.

buffer

    A *minimum* white-space buffer ( in points ) to wrap the image in. This defaults to 1, which ensures that the image doesn't render over part of the cell borders ( which looks bad ).
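The two ways of supplying an image might be sketched as follows. The field name "Photo" and the height and buffer values are hypothetical, and the unit of height is not stated above, so treat that number as a placeholder.

```perl
use strict;
use warnings;

# A fixed image: hard-coded path, scaled to a set height
my $logo_cell = {
    percent => 30,
    align   => "right",
    image   => {
        path   => "/home/dan/fantastic_stuff.png",
        height => 40,    # hypothetical value
        buffer => 2,     # at least 2 points of white-space around the image
    },
};

# A per-row image: the full path is taken from the data array instead
my $photo_field = {
    name    => "Photo",
    percent => 30,
    align   => "centre",
    image   => {
        dynamic      => 1,
        scale_to_fit => 1,
    },
};
```

Note that each image hash sets either path or dynamic, never both.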

BACKGROUNDS

You can define a background for any cell, including normal fields, group header & footers, etc. For data headers ONLY, you must ( currently ) set them up per data set, instead of per field. In this case, you add the background key to the 'headings' hash in the main data hash.

The background hash has the following keys:

shape

    Current options are 'box' or 'ellipse'. 'ellipse' is good for group headers. 'box' is good for data headers or 'normal' cell backgrounds. If you use an 'ellipse', it tends to look better if the text is centred. More shapes are needed. A 'round_box', with nice rounded edges, would be great. Send patches.

colour

    The colour to use to fill the background's shape. Keep in mind with data headers ( the automatic headers that appear at the top of each data set ), that you set the *foreground* colour via the field's 'header_colour' key, as there are ( currently ) no explicit definitions for data headers.

border

    The colour ( if any ) to use to render the cell's border. If this is set, the border will be a rectangle, around the very outside of the cell. You can have a shaped background and a border rendered in the same cell.

borders

If you have set the border key ( above ), you can also define which borders to render by setting the borders key with the 1st letter(s) of the border to render, from the possible list of:

 l   ( left border )
 r   ( right border )
 t   ( top border )
 b   ( bottom border )
 all ( all borders ) - this is also the default if no 'borders' key is encountered

eg you would set borders = "tlr" to have all borders except the bottom ( b ) border

Upper-case letters will also work.
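Putting the background keys together, a cell with a filled box and only top and bottom borders might be sketched like this. The cell text is hypothetical; the colours are ones used elsewhere in this document.

```perl
use strict;
use warnings;

my $totals_cell = {
    percent    => 100,
    align      => "right",
    text       => "Totals",
    background => {
        shape   => "box",         # filled rectangle behind the text
        colour  => "darkgrey",    # fill colour for the shape
        border  => "grey",        # border colour; required for 'borders' to apply
        borders => "tb",          # top and bottom borders only
    },
};
```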

BARCODES

You can define barcodes in any cell ( data, or group header / footer ). The default barcode type is code128. The available types are code128 and code39.

The barcode hash has the following keys:

type

Type of the barcode, either code128 or code39. Support for other barcode types should be fairly simple, but currently is not there. No default.

x, y

As in text cells.

scale

Defines a zoom scale for barcode, where 1.0 means scale 1:1.

align

Defines the alignment of the barcode object. Should be left (or l), center (or c), or right (or r). This should work as expected whether or not you specify absolute x,y coordinates.

font_size

Defines the font size of the clear text that appears below the bars. If not present, takes report default_font_size property.

font

Defines the font face of the clear text that appears below the bars. If not present, takes report default_font property.

zone

Regulates the height of the barcode lines.

upper_mending_zone, lower_mending_zone

Space below and above barcode bars? I tried experimenting a bit, but didn't properly understand what upper_mending_zone does. lower_mending_zone is the height of the barcode extensions toward the lower end, where clear text is printed. I don't know how to explain these better...

quiet_zone

Empty space around the barcode bars? Try to experiment yourself.

GROUP DEFINITIONS

Grouping is achieved by defining a column in the data array to use as a group value. When a new group value is encountered, a group footer ( if defined ) is rendered, and a new group header ( if defined ) is rendered. At present, the simple group aggregate functions 'count' and 'sum' are supported - see the cell definition section for details on how to choose a column to perform aggregate functions on, and below for how to retrieve the aggregate value in a footer. You can perform one aggregate function on each column in your data array.

As of version 0.9, support has been added for splitting data from a single field ( ie the group value from the data_column above ) into multiple cells. To do this, simply pack your data into the column identified by data_column, and separate the fields with a delimiter. Then in your group definition, set up the cells with the special keys 'delimiter' and 'index' ( see below ) to identify how to delimit the data, and which column to use for the cell once the data is split. Many thanks to Bill Hess for this patch :)

Groups have the following attributes:

name

    The name is used to identify which value to use in rendering aggregate functions ( see aggregate_source, below ). Also, a special name, "GrandTotals" will cause PDF::ReportWriter to fetch *Grand* totals instead of group totals.

page_break

    Set this to TRUE if you want to cause a page break when entering a new group value.

data_column

    The data_column refers to the column ( starting at 0 ) of the data_array that you want to group on.

reprinting_header

    If this is set, the group header will be reprinted on each new page

    These 4 keys set the respective buffers ( ie whitespace ) that separates the group headers / footers from things above ( upper ) and below ( lower ) them. If you don't specify any buffers, default values will be set to emulate legacy behaviour.

    Group headers and footers are defined in a similar way to field definitions ( and rendered by the same code ). The difference is that the cell definition is contained in the 'header' and 'footer' hashes, ie the header and footer hashes resemble a field hash. Consequently, most attributes that work for field cells also work for group cells. Additional attributes in the header and footer hashes are:

aggregate_source ( footers only )

    This is used to indicate which column to retrieve the results of an aggregate_function from ( see cell definition section ).

delimiter ( headers only )

    This optional key is used in conjunction with the 'index' key ( below ) and defines the delimiter character used to separate 'fields' in a single column of data.

index ( headers only )

    This optional key is used in conjunction with the 'delimiter' key ( above ), and defines the 'column' inside the delimited data column to use for the current cell.
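A sketch of the 'delimiter' and 'index' keys in use: the group column below packs a customer name and code into one value, and each header cell picks out one piece. The data, the group name and the "|" delimiter are hypothetical, and whether such cells also need a 'text' key is not stated above. The loop at the end only illustrates which piece each cell refers to; it is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical rows: column 0 holds "customer name|customer code"
my $data_array = [
    [ "Acme Pty Ltd|C001", "Widget", 3 ],
    [ "Acme Pty Ltd|C001", "Gadget", 1 ],
    [ "Bolt & Co|C002",    "Gizmo",  7 ],
];

my $groups = [
    {
        name        => "CustomerGroup",
        data_column => 0,
        header      => [
            { percent => 70, align => "left",  delimiter => "|", index => 0 },
            { percent => 30, align => "right", delimiter => "|", index => 1 },
        ],
    },
];

# Illustration only: split the first group value the way the keys describe
for my $cell ( @{ $groups->[0]{header} } ) {
    my @parts = split /\Q$cell->{delimiter}\E/, $data_array->[0][0];
    printf "index %d: %s\n", $cell->{index}, $parts[ $cell->{index} ];
}
```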

REPORT DEFINITION

Possible attributes for the report definition are:

destination

    The path to the destination ( the pdf that you want to create ).

paper

    Supported types are:

       - A4
       - Letter
       - bsize
       - legal

orientation

    portrait or landscape

template

    Path to a single-page PDF file to be used as a template for new pages of the report. If the PDF has multiple pages, only the first page will be extracted and used. All content in the PDF template will be included in every page of the final report, so be sure that template content and report content do not overlap.

font_list

    An array of font names ( from the corefonts supported by PDF::API2 ) to set up. When you include a font 'family', a range of fonts ( roman, italic, bold, etc ) are created.

default_font

    The name of the font type ( from the above list ) to use as a default ( ie if one isn't set up for a cell ).

default_font_size

    The default font size to use if one isn't set up for a cell. This is no longer required and defaults to 12 if one is not given.

x_margin

    The amount of space ( left and right ) to leave as a margin for the report.

y_margin

    The amount of space ( top and bottom ) to leave as a margin for the report.

DATA DEFINITION

The data definition wraps up most of the previous definitions, apart from the report definition. You can now safely replace the entire data definition after a render_data() operation, allowing you to define different 'sections' of a report. After replacing the data definition, you simply call render_data() again with a new data array.

Attributes for the data definition:

cell_borders

    Whether to render cell borders or not. This is a legacy option - not that there's any pressing need to remove it - but this is a precursor to background->{border} support, which can be defined per-cell. Setting cell_borders in the data definition will cause all data cells to be filled out with: background->{border} set to grey.

upper_buffer / lower_buffer

    These 2 keys set the respective buffers ( ie whitespace ) that separates each row of data from things above ( upper ) and below ( lower ) them. If you don't specify any buffers, default values of zero will be set to emulate legacy behaviour.

no_field_headers

    Set to disable rendering field headers when beginning a new page or group.

fields

    This is your field definition array, from above.

groups

    This is your group definition array, from above.

data_array

    This is the data to render. You *MUST* sort the data yourself. If you are grouping by A, then B and you want all data sorted by C, then make sure you sort by A, B, C. We currently don't do *any* sorting of data, as I only intended this module to be used in conjunction with a database server, and database servers are perfect for sorting data :)

page

    This is a hash describing page headers and footers - see below.

PAGE DEFINITION

The page definition is a hash describing page headers and footers. Possible keys are:

header / footer

Each of these keys is an array of cell definitions. Unique to the page *footer* is the ability to define the following special tags:

    %TIME%

    %PAGE%

    %PAGES%

These will be replaced with the relevant data when rendered.

If you don't specify a page footer, one will be supplied for you. This is to provide maximum compatibility with previous versions, which had page footers hard-coded. If you want to suppress this behaviour, then set a value for $self->{data}->{page}->{footerless}
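A page definition with a custom footer using the special tags might be sketched as follows. The wording of the footer text is hypothetical. The second hash assumes that 'footerless' is set as a key of the page hash in the data definition, which is how the path above reads.

```perl
use strict;
use warnings;

# Page with a custom footer built from the special tags
my $page = {
    header => [
        { percent => 100, font_size => 15, align => "left", text => "My Fantastic Report" },
    ],
    footer => [
        { percent => 50, align => "left",  text => "Rendered on %TIME%" },
        { percent => 50, align => "right", text => "Page %PAGE% of %PAGES%" },
    ],
};

# Page with no footer at all ( assumed placement of the 'footerless' key )
my $page_without_footer = {
    header     => $page->{header},
    footerless => 1,
};
```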

MULTI-LINE ROWS

    You can define 'multi-line' rows of cell definitions by simply appending all subsequent lines to the array of cell definitions. When PDF::ReportWriter sees a cell with a percentage that would push the combined percentage beyond 100%, a new-line is assumed.
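The line-breaking rule can be sketched in plain Perl. This is an illustration of the rule as described, not the module's own code, and the field names are hypothetical.

```perl
use strict;
use warnings;

my $fields = [
    { name => "Date",      percent => 30 },
    { name => "Item",      percent => 40 },
    { name => "Appraisal", percent => 30 },
    { name => "Comments",  percent => 100 },    # would pass 100%, so it starts line 2
];

# Walk the cells, starting a new line whenever the running total would pass 100%
my @lines   = ( [] );
my $running = 0;
for my $cell (@$fields) {
    if ( $running + $cell->{percent} > 100 ) {
        push @lines, [];
        $running = 0;
    }
    push @{ $lines[-1] }, $cell->{name};
    $running += $cell->{percent};
}

printf "line %d: %s\n", $_ + 1, join( ", ", @{ $lines[$_] } ) for 0 .. $#lines;
```

With these fields, the first three cells share one line and "Comments" gets a full-width line of its own.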

METHODS

new ( report_definition )

    Object constructor. Pass the report definition in.

render_data ( data_definition )

    Renders the data passed in.

    You can call 'render_data' as many times as you want, with different data and definitions. If you want to call render_data multiple times, though, be aware that you will have to destroy $report->{data}->{field_headers} if you expect new field headers to be automatically generated from your cells ( ie if you don't provide your own field_headers, which is probably normally the case ). Otherwise, if you don't destroy $report->{data}->{field_headers} and you don't provide your own, you will get the field headers from the last render_data() operation.

render_report ( xml [, data ] )

    Should be used when dealing with xml format reports. One call to rule them all. The first argument can be either an xml filename or a PDF::ReportWriter::Report object. The 2nd argument is the real data to be used in your report. Example of usage for first case (xml file):

            my $rw = PDF::ReportWriter->new();
            my @data = (
                    [2004, 'Income',               1000.000 ],
                    [2004, 'Expenses',              500.000 ],
                    [2005, 'Income',               5000.000 ],
                    [2005, 'Expenses',              600.000 ],
                    [2006, 'Income (projection)',  9999.000 ],
                    [2006, 'Expenses (projection)', 900.000 ],
            );
            $rw->render_report('./account.xml', \@data);
            
            # Save to disk
            $rw->save();
    
            # or get a scalar with all pdf document
            my $pdf_doc = $rw->stringify();

    For an example of an xml report file, take a look at the examples folder in the PDF::ReportWriter distribution, or at the PDF::ReportWriter::Examples documentation.

    The alternative form allows for more flexibility. You can pass a PDF::ReportWriter::Report basic object with a report profile already loaded. Example:

            my $rw = PDF::ReportWriter->new();
            my $rp = PDF::ReportWriter::Report->new('./account.xml');
            # ... Assume @data as before ...
            $rw->render_report($rp, \@data);
            $rw->save();

    If you desire the maximum flexibility, you can also pass any object in the world that supports load() and get_data() methods, where load() should return a complete report profile (TO BE CONTINUED), and get_data() should return an arrayref with all actual records that you want your report to include, as returned by DBI's selectall_arrayref() method.

    As with render_data, you can call render_report as many times as you want. The PDF file will grow as necessary. There is only one problem in rendering of header sections when re-calling render_report.

fetch_group_results( { cell => "cell_name", group => "group_name" } )

    This is a convenience function that allows you to retrieve current aggregate values. Pass a hash with the items 'cell' ( the name of the cell with the aggregate function ) and 'group' ( the group level you want results from ). A good place to use this function is in conjunction with a cell's custom_render_func(). For example, you might create a custom_render_func to do some calculations on running totals, and use fetch_group_results() to get access to those running totals.

new_page

    Creates a new page, which in turn calls ->page_template ( see below ).

page_template ( [ path_to_template ] )

    This function creates a new page ( and is in fact called by ->new_page ). If called with no arguments, it will either use the default template, or if there is none, it will simply create a blank page. Alternatively, you can pass it the path to a PDF to use as a template for the new page ( the 1st page of the PDF that you pass will be used ).

save

    Saves the pdf file ( in the location specified in the report definition ).

saveas ( newfile )

    Saves the pdf file in the location specified by newfile string and overrides default report destination property.

stringify

    Returns the pdf document as a scalar.

    Tries to print the report pdf file to a CUPS print queue. For now, it only works with CUPS, though you can supply several options to drive the print job as you like. Allowed options, to be specified as a hash reference, with their default values, are the following:

    command

    The command to be launched to spool the pdf report (/usr/bin/lpr.cups).

    printer

    Name of CUPS printer to print to (no default). If not specified, takes your system default printer.

    tempdir

    Temporary directory where to put the spool file (/tmp).

    If true, deletes the temporary spool file (true).

EXAMPLES

    Check out the examples folder in the main PDF::ReportWriter distribution that contains a simple demonstration of results that can be achieved.

AUTHORS

     Dan <dan@entropy.homelinux.org>
     Cosimo Streppone <cosimo@cpan.org>

BUGS

    I think you must be mistaken.

ISSUES

    In the last release of PDF::ReportWriter, I complained bitterly about printing PDFs from Linux. I am very happy to be able to say that this situation has improved significantly. Using the latest versions of evince and poppler ( v0.5.1 ), I am now getting *perfect* results when printing. If you are having issues printing, I suggest updating to the above.

Other cool things you should know about:

    This module is part of an umbrella project, 'Axis Not Evil', which aims to make Rapid Application Development of database apps using open-source tools a reality. The project includes:

     Gtk2::Ex::DBI                 - forms
     Gtk2::Ex::Datasheet::DBI      - datasheets
     PDF::ReportWriter             - reports

    All the above modules are available via cpan, or for more information, screenshots, etc, see: http://entropy.homelinux.org/axis

Crank ON!
