NAME

FastDB::Load - Load data into a FastDB database

SYNOPSIS

        use FastDB::Load;

        my $load = Load->new(
        'Name'                     => 'Database name',
        'Data store location'      => 'Directory to store the data',
        'Field separator string'   => '|'  ,
        'Original field names'     => [ Columns         ],
        'Indexed fields'           => [ Indexed columns ],
        'Extra virtual fields'     => [ Extra   columns ],
        'Transform'                => [
                                      'Some column 1' => 'my $a = lc DATA; $a = reverse $a; return $a;',
                                      'Some column 2' => 'lc DATA'                ,
                                      'Some column 3' => 'uc DATA;'               ,
                                      ]
        );

        $load->load( 'a1' , 'a2', 'a3', ...  );
        $load->load( 'b1' , 'b2', 'b3', ...  );
        ...
        $load->Write_statistics_at_the_end();

EXAMPLE

        #!/usr/bin/perl
        #
        # Example of FastDB::Load. Extra virtual fields you can optionally use:
        #
        #       EXTRA_DAY          01  - 31
        #       EXTRA_DAY_NAME     Sun - Sat
        #       EXTRA_MONTH_NAME   Jan - Dec
        #       EXTRA_MONTH        01  - 12
        #       EXTRA_YEAR         1453
        #       EXTRA_HOUR         00  - 23
        #       EXTRA_MINUTE       00  - 59
        #       EXTRA_SECOND       00  - 59
        #       EXTRA_TIMESTAMP    20101201123957  ( YYYYMMDDhhmmss )

        use FastDB::Load;
        my $load = Load->new(

        'Name'                     => 'Export cargo'                                           ,
        'Data store location'      => '/work/FastDB test/db'                                   ,
        'Field separator string'   => ','                                                      ,
        'Original field names'     => [ 'COLOR', 'HEIGHT', 'WEIGHT', 'TYPE', 'ID', 'COUNTRY' ] ,
        'Indexed fields'           => [ 'WEIGHT' , 'EXTRA_YEAR' ]                              ,
        'Extra virtual fields'     => [ 'EXTRA_TIMESTAMP', 'EXTRA_YEAR', 'EXTRA_DAY_NAME' ]    ,
        'Transform'                => [
                                      'COLOR'   => 'my $a = lc DATA; $a'  ,
                                      'TYPE'    => 'uc DATA;'             ,
                                      'ID'      =>  '"<id>DATA</id>"'     ,
                                      'COUNTRY' => 'uc DATA'              ,
                                      ]
        );

        $load->load( 'Green' , 10, 1500, 'mech22', 'A100', 'New Zealand'  );
        $load->load( 'Brown' , 10, 1500, 'mech22', 'A100', 'India'        );
        $load->load( 'Green' , 11, 3500, 'mech23', 'B100', 'Australia'    );
        $load->load( 'Yellow',  7, 2500, 'mech21', 'C100', 'South Africa' );
        $load->load( 'Red'   , 14, 2500, 'mech21', 'D001', 'U.S. Montana' );
        $load->load( 'Red'   , 17, 5500, 'mech32', 'D101', 'U.S. Montana' );
        $load->load( 'White' , 21,  700, 'snow02', 'E002', 'North Pole'   );
        $load->load( 'White' , 21,  700, 'snow02', 'E002', 'South Pole'   );

        # Optionally write some short information about your load
        #       $load->Write_statistics_at_the_end();
        #       $load->Write_statistics_at_the_end( $SomeFile );
        #       $load->Write_statistics_at_the_end( "$load->{'Data store location'}/$load->{'Name'}.log" );
        
        $load->Write_statistics_at_the_end();

DESCRIPTION

FastDB is a file-based database. It uses directories to store the indexed columns. It also deduplicates data to avoid storing the same value more than once where possible. Your database and its schema are created at the first data load. After the first data load it is not possible to add or remove columns.

It is written in pure Perl, so it can run on any operating system where Perl runs. It is designed to give answers as fast as your disk and operating system allow.

This module loads your data into a FastDB database. At loading time you can edit the data of every column using generic Perl code defined in the 'Transform' property. Each column should have its own code. You can define a transform for one or more columns. The special string DATA (or data) is replaced with the current column value at loading time. 'Transform' is optional; omit it if you do not need it.
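The DATA substitution described above can be pictured with a minimal sketch. It is not the module's actual implementation: the helper compile_transform is hypothetical, and mapping DATA to the first subroutine argument is an assumption made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of FastDB::Load: turn a 'Transform' snippet
# into a code reference. Here DATA (or data) is mapped to $_[0]; the module
# itself may substitute the column value in a different way.
sub compile_transform {
    my ($snippet) = @_;
    ( my $body = $snippet ) =~ s/\bDATA\b/\$_[0]/gi;
    my $code = eval "sub { $body }";
    die "Bad transform '$snippet': $@" unless $code;
    return $code;
}

# One snippet per column, as in the 'Transform' property.
my %transform = (
    COLOR   => compile_transform('my $a = lc DATA; $a'),
    COUNTRY => compile_transform('uc DATA'),
);

print $transform{COLOR}->('Green'),   "\n";    # prints "green"
print $transform{COUNTRY}->('India'), "\n";    # prints "INDIA"
```

Each snippet sees only the value of its own column, which matches the rule that a transform cannot refer to other columns.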

At the end of loading it is suggested that you call the optional method 'Write_statistics_at_the_end' to write some short information to a file.

Functions

my $load = Load->new( %hash );

Creates a new FastDB::Load object. %hash must have the keys

Data store location

The root directory that will hold your data

Name

The name of your database. This will also become a subdirectory of the Data store location

Field separator string

This is used internally to separate columns from each other. It can be more than one character long. You must select a string that cannot occur in your data.
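Since the separator must never appear inside a value, it can help to check rows before loading them. The following is a minimal sketch of such a pre-check; fields_containing_separator is a hypothetical helper and is not provided by FastDB::Load.

```perl
use strict;
use warnings;

# Hypothetical pre-check, NOT part of FastDB::Load: return every field
# that contains the chosen separator string.
sub fields_containing_separator {
    my ( $separator, @fields ) = @_;
    return grep { index( $_, $separator ) >= 0 } @fields;
}

my @bad = fields_containing_separator( '|', 'Green', 'a|b', 'India' );
print "Separator found in: @bad\n" if @bad;
```

Using index rather than a regular expression means multi-character separators and characters such as | need no escaping.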

Extra virtual fields

At loading time you can optionally load the following fields, which do not exist in your data. Their values are calculated at loading time, so they may change if your load runs for a long time. The names of these fields and some sample values are

        EXTRA_DAY          01  - 31
        EXTRA_DAY_NAME     Sun - Sat
        EXTRA_MONTH_NAME   Jan - Dec
        EXTRA_MONTH        01  - 12
        EXTRA_YEAR         2012
        EXTRA_HOUR         00  - 23
        EXTRA_MINUTE       00  - 59
        EXTRA_SECOND       00  - 59
        EXTRA_TIMESTAMP    20121201123957  ( YYYYMMDDhhmmss )
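To make the field formats concrete, here is a sketch that derives the same nine values from the current time with the core POSIX module. It only illustrates the formats listed above; the function extra_fields is hypothetical, and the module may compute its values differently.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical illustration, NOT the module's code: build the extra
# virtual fields for a given epoch time using local time.
# Note: %a and %b depend on the active locale.
sub extra_fields {
    my ($epoch) = @_;
    my @t = localtime $epoch;
    return (
        EXTRA_DAY        => strftime( '%d', @t ),
        EXTRA_DAY_NAME   => strftime( '%a', @t ),
        EXTRA_MONTH_NAME => strftime( '%b', @t ),
        EXTRA_MONTH      => strftime( '%m', @t ),
        EXTRA_YEAR       => strftime( '%Y', @t ),
        EXTRA_HOUR       => strftime( '%H', @t ),
        EXTRA_MINUTE     => strftime( '%M', @t ),
        EXTRA_SECOND     => strftime( '%S', @t ),
        EXTRA_TIMESTAMP  => strftime( '%Y%m%d%H%M%S', @t ),
    );
}

my %extra = extra_fields(time);
print "$_ = $extra{$_}\n" for sort keys %extra;
```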

Original field names

An array reference of your column names. Do not include the Extra virtual fields here again. Case is significant. Field names must not contain the character |

Indexed fields

An array reference of the columns you want to index. You can specify any original or extra field. Do not define more than you really need. These will become subdirectories.

Transform

An array reference with the data transformations. You can use this to transform your data at loading time. You define the column name and some Perl code. The Perl code is applied to the column data, and FastDB stores its returned value. The special string DATA is replaced at loading time with the current value. Every transformation is applied only to its own column. You cannot use column names inside the Perl code. The order of the 'Transform' entries is not important. Its syntax is

        'SOME COLUMN 1' => 'Perl code do something with the "DATA"',
        'SOME COLUMN 2' => 'ucfirst DATA',
        'SOME COLUMN 3' => 'my $var = DATA ; blah blah blah ; $var',
        and so on

$load->load( col1, col2, ... );

A list of data you want to store as a row. The order of the fields should be the same as the column names in Original field names.

Normally you will put this inside a loop that reads and splits lines from a file, a socket, or any other source.
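Such a loop might look like the sketch below. So that it runs on its own, it uses a small stand-in class (Demo::Loader, hypothetical) instead of a real FastDB::Load object and reads from an in-memory string instead of a file. In a real script $load would be the object returned by Load->new. The input delimiter here belongs to the sample input and is unrelated to the 'Field separator string' property.

```perl
use strict;
use warnings;

# Stand-in for a real FastDB::Load object (hypothetical, for illustration):
# it only collects the rows it is given.
{
    package Demo::Loader;
    sub new  { return bless { rows => [] }, shift }
    sub load { my $self = shift; push @{ $self->{rows} }, [@_]; return }
}

my $load      = Demo::Loader->new;
my $delimiter = ',';    # delimiter of the sample input, an assumption

my $input = "Green,10,1500,mech22,A100,India\n"
          . "Red,14,2500,mech21,D001,U.S. Montana\n";

open my $fh, '<', \$input or die "Cannot open input: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    next unless length $line;    # skip empty lines
    # A limit of -1 keeps trailing empty fields, so every row
    # has the same number of columns.
    my @fields = split /\Q$delimiter\E/, $line, -1;
    $load->load(@fields);
}
close $fh;

print scalar @{ $load->{rows} }, " rows passed to load()\n";
```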

$load->Write_statistics_at_the_end( [SomeFile] );

Optional method. Writes to a file how many rows were loaded and how long the load took. It takes as an optional argument the file to write this information to. If you do not specify a file it will use the path "$load->{'Data store location'}/$load->{'Name'}.log"

        $load->Write_statistics_at_the_end();
        $load->Write_statistics_at_the_end( $SomeFile );
        $load->Write_statistics_at_the_end( "$load->{'Data store location'}/$load->{'Name'}.log" );

NOTES

You may run into problems on Microsoft Windows when you have multiple indexes with long values, because of the 255-character NTFS maximum path limitation.

It is recommended to use a Linux partition (or a mounted file) formatted with the btrfs file system (ext4 is also good but not as fast as btrfs). ext3 and FAT16 are not recommended.

INSTALL

Because this module is implemented in pure Perl, it is enough to copy the FastDB directory somewhere in your @INC or to where your script is. For your convenience you can use the following commands to install/uninstall the module

        Install:     setup_module.pl --install   --module=FastDB

        Uninstall:   setup_module.pl --uninstall --module=FastDB

AUTHORS

Author: gravitalsun@hotmail.com (George Mpouras)

COPYRIGHT

Copyright (c) 2011, George Mpouras, gravitalsun@hotmail.com All rights reserved.

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
