NAME

FastDB::Question - Ask questions of a FastDB database and get back the answers

SYNOPSIS

        use FastDB::Question;
        
        my $qry    = Question->new( $SchemaFile );
        my $answer = $qry->question( %hash );

        foreach my $row (@{$answer}) { print "Columns : @{$row}\n" }

EXAMPLE

        use FastDB::Question;

        my $qry    = Question->new( '/work/FastDB test/db/Export cargo.schema' );
        my $answer = $qry->question(
        
        'Fields to return' => [ 'TYPE', 'WEIGHT', 'COUNTRY', 'COLOR' ],
        'Filters'          =>
                              [
                              'WEIGHT' => '( DATA > 800 ) and ( DATA < 8000 )'  ,
                              'COLOR'  => 'DATA =~/(green|brown|red)/i'         ,
                              ],
        'Conditions'       =>
                              [
                              '(WEIGHT >= 1500) and TYPE=~/2/',
                              '((COLOR eq "blue") or (COLOR eq "red")) and (COUNTRY eq "France")'
                              ],
        'Results'          =>
                              [
                              'Return an array of arrays'             => 'Yes'           ,
                              'Print to standard output'              => 'No '           ,
                              'Print to standard error'               => 'no'            ,
                              'Print to file'                         => 'Yes'           ,
                              'File name'                             => '/tmp/OUTPUT.TXT',
                              'Pass to external Perl module'          => 'Yes'           ,
                              'Perl module name'                      => 'MIME::Base64'  ,
                              'Function of the Perl module'           => 'encode_base64' ,
                              'Code of how to pass data at function'  => 'join ",", @_'  ,
                              ]
        );

        print "No data or 'Return an array of arrays' is set to No\n" if 0 == scalar @{$answer};
        foreach my $row (@{$answer}) {
            print "columns : @{$row}\n";
        }

DESCRIPTION

Ask questions of FastDB and do something with the found data. It is designed to provide unparalleled speed. With the right indexes and questions you can get your data back much faster than from the big database vendors. There are two methods to select your data: Filters and Conditions. Every Filter is applied only to the column it is assigned to, while Conditions can use all columns. Filters are always faster than Conditions. Combining Filters and Conditions is also fast. It is good practice to index the columns you usually apply Filters to.

Do not use Conditions if you can do the same selection with Filters only.

You have several options for what to do with the found data. For example, you can pass them to an external Perl module, or you can write them to a file. All options except 'Return an array of arrays' work in real time: they act on each row the moment it is found, before the question finishes. The answer can be several terabytes with billions of lines, and the Perl process will not consume more memory than a simple "hello world". This cannot be done if 'Return an array of arrays' is set to 'Yes', because in that case all data must be kept in memory, so do not set it for a huge amount of returned data.
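The difference between the real-time options and 'Return an array of arrays' can be pictured with the following minimal sketch. It does not use FastDB at all: make_row_source is a hypothetical stand-in for a running question, and the temporary file stands in for 'File name'. It only illustrates why handling each row as it arrives keeps memory flat, while collecting rows does not.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical row source (NOT part of FastDB): returns one row as an
# array reference per call, and undef when there are no more rows.
sub make_row_source {
    my ($count) = @_;
    my $i = 0;
    return sub {
        return undef if $i >= $count;
        $i++;
        return [ $i, "item$i" ];
    };
}

# Real-time style: every row is written the moment it is produced,
# so only one row is alive at any time.
my ($fh, $filename) = tempfile();
my $next    = make_row_source(1000);
my $written = 0;
while ( my $row = $next->() ) {
    print {$fh} join(',', @{$row}), "\n";
    $written++;
}
close $fh or die "Cannot close $filename: $!";
unlink $filename;

# 'Return an array of arrays' style: every row stays in memory until the end.
my @all;
$next = make_row_source(1000);
while ( my $row = $next->() ) {
    push @all, $row;
}

print "streamed $written rows, kept ", scalar(@all), " rows in memory\n";
```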

FUNCTIONS

my $qry = Question->new( $SchemaFile );

Creates a new FastDB::Question object. Its only argument is the database schema file. There is one schema file for every database; it is created on the first data load, inside the data directory. Its name is "$DatabaseName.schema".

$qry->question( %hash )

Queries the database and does something with the returned data. The %hash should have the following keys:

Fields to return

An array reference of the columns you want to select. Column names are case sensitive. You can specify as many columns as you want, for example:

[ 'TYPE', 'WEIGHT', 'COUNTRY', 'COLOR' ],

Filters

An array reference containing pairs of a column and the generic Perl code assigned to it. Filters are used to narrow the data selection.

You can have multiple Filters. Every Filter applies only to its own column, so its Perl code cannot reference column names. The order of the Filters is not important. Every Filter is simple Perl code and can contain multiple statements separated by ; The last value returned by your code is examined as TRUE or FALSE. The special string DATA is replaced with the column value. For example:

        'Filters' =>
        [
        'YEAR'   =>  'DATA == 2009'                       ,
        'MONTH'  =>  'DATA eq "12"'                       ,
        'DAY'    =>  'DATA eq "01"'                       ,
        'HOUR'   => '(DATA ge "07" ) and (DATA le "07")'  ,
        'MINUTE' => '(DATA ge "57" ) and (DATA le "57")'  ,
        'SECOND' => '(DATA ge "48" ) and (DATA le "48")'  ,
        'COL8'   =>  'DATA =~/^49/'
        ],

Filters work faster if their columns are indexed. Indexed columns are defined on the first data load, using the package FastDB::Load.
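To make the role of the DATA placeholder concrete, here is a minimal sketch of how a Filter string could be turned into a test on a single column value in plain Perl. It is not FastDB's actual implementation: the helper compile_filter and the substitution strategy are assumptions made only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of FastDB): replace the placeholder DATA
# with a lexical variable and compile the result into a code reference.
sub compile_filter {
    my ($code) = @_;
    (my $body = $code) =~ s/\bDATA\b/\$value/g;
    my $sub = eval "sub { my (\$value) = \@_; $body }";
    die "Bad filter '$code': $@" if $@;
    return $sub;
}

my %filters = (
    WEIGHT => compile_filter('( DATA > 800 ) and ( DATA < 8000 )'),
    COLOR  => compile_filter('DATA =~/(green|brown|red)/i'),
);

# Each Filter sees only the value of its own column.
my %row  = ( WEIGHT => 1500, COLOR => 'Red' );
my $keep = 1;
for my $column (keys %filters) {
    $keep &&= $filters{$column}->( $row{$column} ) ? 1 : 0;
}

print $keep ? "row kept\n" : "row rejected\n";
```

Because each compiled Filter receives only one value, it has no way to refer to other columns, which matches the restriction described above.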

Conditions

An array reference containing pieces of Perl code. Conditions are used to narrow the data selection. Inside the Perl code you can write any column name you want, and it is replaced by the column's value. Every Condition can contain multiple statements separated by ; The last value returned by your code is examined as TRUE or FALSE. For example:

        'Conditions' =>
                      [
                      '(WEIGHT >= 1500) and TYPE=~/2/'      ,
                      '((COLOR eq "green") or (COLOR eq "red")) and (COUNTRY eq "New Zealand")'
                      ], 

Conditions are slower than Filters, and you should avoid them if you can make the same selection using only Filters. However, because a Condition can use all columns, you can build more complex expressions than with Filters. You can have multiple Conditions.
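The replacement of column names by their values can be sketched as follows. This is only an illustration under assumptions, not FastDB's code: compile_condition is a hypothetical helper, and a row is modelled as a hash reference keyed by column name. The naive substitution would also rewrite a column name that appears inside a quoted string.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of FastDB): replace every known column
# name in a Condition with a lookup into the current row, then compile it.
sub compile_condition {
    my ($code, @columns) = @_;
    my $names = join '|', map { quotemeta } @columns;
    (my $body = $code) =~ s/\b($names)\b/'$row->{' . $1 . '}'/ge;
    my $sub = eval "sub { my (\$row) = \@_; $body }";
    die "Bad condition '$code': $@" if $@;
    return $sub;
}

my @columns    = qw(TYPE WEIGHT COUNTRY COLOR);
my @conditions = (
    '(WEIGHT >= 1500) and TYPE=~/2/',
    '((COLOR eq "blue") or (COLOR eq "red")) and (COUNTRY eq "France")',
);

# One sample row; a Condition may look at any of its columns.
my $row = { TYPE => 'A2', WEIGHT => 2000, COUNTRY => 'France', COLOR => 'red' };

my @results;
for my $code (@conditions) {
    my $test = compile_condition($code, @columns);
    push @results, $test->($row) ? 1 : 0;
    print "$code => ", ( $results[-1] ? 'TRUE' : 'FALSE' ), "\n";
}
```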

Results

An array reference containing, as key/value pairs, the settings that control what is done with the results of the question. The settings are:

        'Return an array of arrays'  =>  'Yes' or 'no'

                If it is yes, the found data are returned as an array of arrays.
                All found data are kept in system memory and returned together when the question
                finishes. Be careful with this option when the estimated size of the returned data is many gigabytes.
                Sample code to process the answer:

                foreach my $row (@{$answer}) { print "Columns: @{$row}\n" }
        
        'Print to standard output'  =>  'Yes' or 'no'

                If it is yes, each found row is printed to standard output (usually the screen)
                the moment it is found, while the question is still running. Memory use does not grow with the amount of found data.

        'Print to standard error'  =>  'Yes' or 'no'

                If it is yes, each found row is printed to standard error the moment it is found, while the question is still running. Memory use does not grow with the amount of found data.

        'Print to file'  =>  'Yes' or 'no'

                If it is yes, each found row is printed to the specified file the moment it is found, while the question is still running. Memory use does not grow with the amount of found data.

        'File name'  =>  $SomeFile

                The file to which the found data are printed. It has no effect if 'Print to file' is set to No.

        'Pass to external Perl module'  =>  'Yes' or 'no'

                If it is yes, each found row is passed to the external Perl module the moment it is found, while the question is still running. Memory use does not grow with the amount of found data.
                The module's function is called once for every found row. The passed data are the values of the columns defined in 'Fields to return'.
        
        'Perl module name'  =>  $Module

                The Perl module to which the found data are passed. It has no effect if 'Pass to external Perl module' is set to No.

        'Function of the Perl module'  =>  $Function

                The function of the external Perl module to which the found data are passed, for example 'encode_base64'. It has no effect if 'Pass to external Perl module' is set to No.
        
        'Code of how to pass data at function'  =>  Perl code using @_

                Controls how the column values of each row are passed to the function. Use a simple '@_' to pass them as a list, 'join ",", @_' to pass them as one string, or whatever else fits your needs.

Here is a sample 'Results' key:

        'Results' =>
                      [
                      'Return an array of arrays'             => 'No'            ,
                      'Print to standard output'              => 'No'            ,
                      'Print to standard error'               => 'no'            ,
                      'Print to file'                         => 'Yes'           ,
                      'File name'                             => './answer.txt'  ,
                      'Pass to external Perl module'          => 'No '           ,
                      'Perl module name'                      => 'MIME::Base64'  ,
                      'Function of the Perl module'           => 'encode_base64' ,
                      'Code of how to pass data at function'  => 'join ",", @_'
                      ]
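The three module-related settings can be read together as "for every row, build the arguments with the given code and call Module::function with them". The sketch below illustrates that reading with the core module MIME::Base64. It is an assumption about the behaviour, not FastDB's implementation, and make_dispatcher is a hypothetical helper.

```perl
use strict;
use warnings;
use MIME::Base64 ();

# Hypothetical dispatcher (NOT part of FastDB): compile the argument-building
# code once, look up the target function, and return a per-row callback.
sub make_dispatcher {
    my ($module, $function, $arg_code) = @_;
    my $build_args = eval "sub { $arg_code }";
    die "Bad argument code '$arg_code': $@" if $@;
    my $target = $module->can($function)
        or die "$module has no function $function";
    return sub { return $target->( $build_args->(@_) ) };
}

my $dispatch = make_dispatcher( 'MIME::Base64', 'encode_base64', 'join ",", @_' );

# The values of one found row, in the order given by 'Fields to return'.
my $encoded = $dispatch->( 'truck', 1500, 'France', 'red' );
print $encoded;
```

With 'join ",", @_' the function receives the whole row as one comma-separated string; with a plain '@_' it would receive one argument per column.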

NOTES

Some useful basic operators you can use in 'Filters' and 'Conditions' (there are many more):

        eq                string equal
        ne                string not equal
        ==                number equal
        !=                number not equal
        >                 number greater
        <                 number less
        >=                number greater or equal 
        <=                number less    or equal
        gt                string greater
        lt                string less
        ge                string greater or equal 
        le                string less    or equal 
        uc                upper case
        lc                lower case
        =~/something/     like, case sensitive
        =~/something/i    like, case insensitive
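The choice between the numeric and the string operators matters, because the same values can compare differently. The short, standalone Perl example below shows the difference; the variable names are chosen only for this illustration.

```perl
use strict;
use warnings;

my ($left, $right) = ('10', '9');

# Numeric comparison: 10 is not less than 9.
my $numeric_less = ( $left <  $right ) ? 1 : 0;

# String comparison: "10" sorts before "9" because "1" sorts before "9".
my $string_less  = ( $left lt $right ) ? 1 : 0;

print "10 <  9 : $numeric_less\n";   # prints "10 <  9 : 0"
print "10 lt 9 : $string_less\n";    # prints "10 lt 9 : 1"

# Zero-padded values of equal length order the same way under both kinds of
# operator, which is why string operators work for values such as "07" and "57".
my $padded_string  = ( '07' lt '12' ) ? 1 : 0;
my $padded_numeric = ( '07' <  '12' ) ? 1 : 0;
print "padded values agree\n" if $padded_string == $padded_numeric;
```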

INSTALL

Because this module is implemented in pure Perl, it is enough to copy the FastDB directory somewhere in your @INC or to where your script is. For your convenience you can use the following commands to install/uninstall the module:

        Install:     setup_module.pl u-install   --module=FastDB

        Uninstall:   setup_module.pl u-uninstall --module=FastDB

AUTHORS

Author: gravitalsun@hotmail.com (George Mpouras)

COPYRIGHT

Copyright (c) 2011, George Mpouras, gravitalsun@hotmail.com All rights reserved.

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
