NAME

Config::ReadAndCheck - Perl module for parsing generic config files that conform to a predefined line-by-line format.

Version 0.03

SYNOPSIS

  # This code could be used for parsing
  # the windows-style INI files.

  use strict;
  use Config::ReadAndCheck;

  my $FileName = shift or die "Usage: $0 <FileName>\n";

  my %ParsINI = ();

  # Lines starting with ';' or '#' and empty lines are ignored
  $ParsINI{'Comment'}   = {'Pattern' => '(?:\s*(?:(?:[\;\#]).*)*)',
                           'Type'    => 'ignore',
                          };

  # Sections have to have a '[SectionName]' form.
  # SectionName cannot be empty
  # SectionName has to be unique
  # The first line that is not a parameter definition ends the section
  # Comments are allowed inside the section (!)
  # At least one section has to be defined because there is no 'Default' definition
  $ParsINI{'Section'}   = {'Pattern' => '\s*\[(.+)\]'.$ParsINI{'Comment'}->{'Pattern'},
                           'Type'    => 'UniqList',
                           'SubSection' => {'Params'  => {}, # Defined later
                                            'Comment' => $ParsINI{'Comment'},
                                           },
                          };

  # Parameters have to have a 'ParamName=Value' form.
  # All leading or trailing spaces are ignored.
  # All spaces around the '=' sign are ignored
  # ParamName cannot contain the '=' sign and cannot be empty
  # ParamName has to be unique in the section
  # The default 'Process' function is used.
  # Empty (no parameters) sections are allowed by the 'Default' definition
  $ParsINI{'Section'}->{'SubSection'}->{'Params'} =
          {'Pattern' => '\s*([^\=]+)\s*=\s*([^\s](?:.*[^\s])?)'.$ParsINI{'Comment'}->{'Pattern'},
           'Type'    => 'UniqList',
           'Default' => {},
          };

  # Create the parser object.
  # '%ParsINI' will be automatically checked for consistency
  my $Parser = Config::ReadAndCheck->new('Params' => \%ParsINI)
        or die "Can not create the parser: $@\n";

  # Parse the INI file. Parsing is case-insensitive by default
  my $Result = $Parser->ParseFile($FileName)
        or die "Error parsing file \"$FileName\": $@";

  # $Result will be a reference to a hash with the following structure:
  # 
  #   {'SectionName1' => {'ParamName1' => 'Value1',
  #                       'ParamName2' => 'Value2',
  #                       ...
  #                      },
  #    'SectionName2' => {'ParamName1' => 'Value1',
  #                       'ParamName2' => 'Value2',
  #                       ...
  #                      },
  #    ...
  #   }
  
  print Config::ReadAndCheck::PrintList($Result, '', "\t");

DESCRIPTION

This module provides a way to easily create a parser for your own file format and check the parsed values on the fly.

The Config::ReadAndCheck methods

new(%Config)

Returns a reference to the Config::ReadAndCheck object. %Config is a hash containing configuration parameters.

Configuration parameters are:

CaseSens

Optional parameter. If the value is 'true', input lines are matched case-sensitively. By default, matching is case-insensitive.

Params

The value has to be a reference to a "section definition hash".

Section definition

The structure is:

  my $Params = {'ParamName1' => $ParamDefinition1,
                'ParamName2' => $ParamDefinition2,
                ...
                'EndOfSection' => $ParamDefinition3,
               };

'ParamName1' and 'ParamName2' are the names of parameters.

$ParamDefinition1 and $ParamDefinition2 are references to a "parameter definition hash".

'EndOfSection' is a reserved parameter name (see 'SubSection').

Each parameter is represented in the result hash by a value whose key is the parameter name. The type of the value depends on the 'Type' field in the parameter definition (see below).

Parameter definition hash

The structure is:

  my $ParamDefinition = {'Pattern' => 'The pattern string',
                         'Process' => $ProcessSubroutine,
                         'Default' => 'Value',
                         'Type'    => $ParamType,
                         'SubSection'   => $RefToSection,
                        };

'Pattern'

The Perl regexp used to decide whether an input line belongs to this parameter. The '\A' escape sequence is added to the beginning of the pattern and '\Z' to the end automatically. '\n' symbols are stripped from the line before evaluation. Matching is case-sensitive or case-insensitive according to the 'CaseSens' parameter of the new() method.
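The matching rules above can be approximated in plain Perl. This is a minimal sketch, not the module's actual code: the helper name line_matches() and its arguments are hypothetical, and it assumes only what the text states (newlines are removed, the pattern is wrapped in '\A' and '\Z', and case sensitivity is optional).

```perl
use strict;
use warnings;

# Hypothetical helper that emulates the documented matching rules.
sub line_matches {
    my ($pattern, $line, $case_sens) = @_;
    $line =~ s/\n//g;                               # '\n' symbols are stripped first
    my $re = $case_sens ? qr/\A$pattern\Z/          # anchors added automatically
                        : qr/\A$pattern\Z/i;
    return $line =~ $re ? 1 : 0;
}

print line_matches('\s*\[(.+)\]\s*', "[General]\n",   0), "\n";   # prints 1
print line_matches('\s*\[(.+)\]\s*', "x [General]\n", 0), "\n";   # prints 0
print line_matches('debug=on',       "DEBUG=ON\n",    1), "\n";   # prints 0
```

Because of the anchors, a pattern has to describe the whole line, including any trailing whitespace or comment it should tolerate.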

'Process'

A reference to your own parameter check and preparation subroutine. The subroutine is called without parameters; $1, $2 and so on are set according to your pattern. The Process subroutine has to return a list of one or two elements. The number and type of the elements depend on $ParamDefinition->{'Type'} and $ParamDefinition->{'SubSection'}.

An empty list means 'the line did not pass the check'. In this case the Process subroutine can pass an error message to the parser by setting the $@ variable.

If $ParamDefinition->{'Process'} is not defined, the simple subroutine sub{return ($1,$2);} is used.
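A Process subroutine along these lines might validate a value before it reaches the result hash. The sketch below is illustrative only: $CheckPort and try_line() are hypothetical names, and try_line() merely stands in for the parser by matching a line and then calling the subroutine without arguments.

```perl
use strict;
use warnings;

# Hypothetical check: accept "port = N" only when N is a valid TCP port.
my $CheckPort = sub {
    my ($name, $value) = ($1, $2);          # captures set by the pattern match
    unless ($value >= 1 && $value <= 65535) {
        $@ = "Port '$value' is out of range";
        return ();                          # empty list: line did not pass
    }
    return ($name, $value + 0);             # two elements, as for UNIQLIST
};

# Stand-in for the parser: match the line, then call the sub with no arguments.
sub try_line {
    my ($line) = @_;
    $@ = '';
    return () unless $line =~ /\A\s*(port)\s*=\s*(\d+)\s*\Z/i;
    return $CheckPort->();
}

my @ok  = try_line('port = 8080');          # ('port', 8080)
my @bad = try_line('port = 70000');         # empty list
my $err = $@;                               # "Port '70000' is out of range"
```

The subroutine reads $1 and $2 directly because Perl's capture variables remain visible in subroutines called from the scope where the match succeeded.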

'Default'

The default value for this parameter. The type of this property depends on 'Type' and 'SubSection'. If $ParamDefinition->{'Default'} does not exist the parameter is treated as 'required' (see CheckRequired()).

'Type'

The type of the parameter. Can be 'UNIQ', or 'UNIQLIST', or 'LIST', or 'IGNORE'.

UNIQ

Only one line matching the pattern may be present in the input.

The UNIQ value is represented as a single value in the result hash. This is the first value in the list returned by the process subroutine.

UNIQLIST

Multiple lines matching the pattern can be present in the input. The process subroutine for this type has to return a list of two values.

The UNIQLIST parameter is represented in the result hash as a reference to a hash. The first value returned by the process subroutine is used as the hash key and the second as the value. So the first value returned by the process subroutine has to be unique for each line matching the pattern.

LIST

Multiple lines matching the pattern can be present in the input.

The LIST parameter is represented in the result hash as a reference to an array. The first value returned by the process subroutine is pushed onto this array for each line matching the pattern. So nothing has to be unique.

IGNORE

Multiple lines matching the pattern can be present in the input. None of them appears in the result hash.

The type name is case-insensitive.
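The effect of the four types on the result hash can be illustrated with a small emulation. This is a sketch of the semantics described above, not the module's implementation: store() is a hypothetical helper, the parameter names are made up, and the uniqueness checks that the parser performs are left out.

```perl
use strict;
use warnings;

# Hypothetical emulation of how each 'Type' stores a Process result.
sub store {
    my ($result, $name, $type, @values) = @_;
    $type = uc $type;                               # type names are case-insensitive
    if    ($type eq 'UNIQ')     { $result->{$name} = $values[0] }
    elsif ($type eq 'UNIQLIST') { $result->{$name}{$values[0]} = $values[1] }
    elsif ($type eq 'LIST')     { push @{ $result->{$name} }, $values[0] }
    # 'IGNORE': nothing is stored
    return $result;
}

my %Result;
store(\%Result, 'Host',    'Uniq',     'example.org');
store(\%Result, 'Env',     'UniqList', 'PATH', '/bin');
store(\%Result, 'Env',     'UniqList', 'HOME', '/root');
store(\%Result, 'Include', 'List',     'a.conf');
store(\%Result, 'Include', 'List',     'b.conf');
store(\%Result, 'Comment', 'Ignore',   '# note');

# %Result now holds:
#   Host    => 'example.org'
#   Env     => {PATH => '/bin', HOME => '/root'}
#   Include => ['a.conf', 'b.conf']
```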

'SubSection'

The reference to the "section definition".

If 'SubSection' is defined, the 'Process' subroutine has to return a reference to a hash (possibly empty) as the first list element for types UNIQ and LIST, and as the second element for type UNIQLIST.

The parameter with 'SubSection' defined will be represented in the result hash as a reference to hash.

The parameters defined in the 'SubSection' will be represented in this hash with their own names.

The level of recursion is not limited but loops are prohibited.

If 'SubSection' is defined, the line matching 'Pattern' is treated as the first line of the enclosed section.

The line matching the 'EndOfSection' parameter of the enclosed section is treated as the last line of the subsection. The next line is verified against the parameters of the parent section.

If no 'EndOfSection' parameter is defined in the subsection, the first line that does not correspond to any of the subsection parameters is treated as the end of the subsection. This line is also passed to the parent section for verification.

Note: the root section can also have an 'EndOfSection' parameter. It is treated as an 'EOF'.
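A definition with an explicit 'EndOfSection' could look roughly like the following. This is a hedged sketch for a made-up block format (an opening '<Host NAME>' line, 'addr = VALUE' lines, a closing '</Host>' line). The names %HostBody, %Root, 'Host' and 'Addr' are hypothetical, and giving 'EndOfSection' the type 'Ignore' is an assumption that the text does not confirm.

```perl
use strict;
use warnings;

# Hypothetical subsection: the body of a <Host NAME> ... </Host> block.
my %HostBody = (
    'Addr'         => {'Pattern' => '\s*addr\s*=\s*(\S+)\s*',
                       'Type'    => 'Uniq',
                       'Process' => sub { return ($1); },
                      },
    # Reserved name: a line matching this pattern closes the subsection.
    # The 'Ignore' type here is an assumption.
    'EndOfSection' => {'Pattern' => '\s*</Host>\s*',
                       'Type'    => 'Ignore',
                      },
);

# Hypothetical root section: any number of uniquely named Host blocks.
my %Root = (
    'Host' => {'Pattern'    => '\s*<Host\s+(\S+)>\s*',
               'Type'       => 'UniqList',
               # With 'SubSection', UNIQLIST returns a hash reference second.
               'Process'    => sub { return ($1, {}); },
               'SubSection' => \%HostBody,
               'Default'    => {},
              },
);
```

A reference to %Root would then be passed as the 'Params' value of new().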

The new() method returns a reference to the Config::ReadAndCheck object or 'undef' value.

Result()

Returns a copy of the current parsing result as a hash, or as a reference to a hash in scalar context.

Reset()

Removes all data related to the previous parsing from memory and makes the parser ready for the next parsing. Returns 'undef'.

Params()

Returns a copy of the 'Params' hash currently in use, or a reference to a hash in scalar context.

Parse(ARRAYREF)

ARRAYREF is a reference to an array of strings to be parsed.

On reaching EOF or 'EndOfSection', the parser calls CheckRequired() to check that all required parameters were defined.

ParseFile($FileName)

$FileName is the name of the file to be parsed.

On reaching EOF or 'EndOfSection', the parser calls CheckRequired() to check that all required parameters were defined.

Parse(CODEREF)

CODEREF is a reference to a subroutine that returns the next string.

&{CODEREF}() is called without any parameters and has to return a string. It has to return undef as an 'EOF' indication.
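A line source of this kind can be written as a closure over a filehandle. The sketch below is self-contained and does not call the module: $NextLine is a hypothetical name, the input is an in-memory file, and the while loop only stands in for the parser pulling lines.

```perl
use strict;
use warnings;

# Hypothetical line source: a closure over an in-memory filehandle.
my $Text = "[General]\nname = demo\n";
open(my $In, '<', \$Text) or die "Cannot open in-memory file: $!\n";

my $NextLine = sub {
    my $line = <$In>;       # undef at end of input, which signals 'EOF'
    return $line;
};

# Stand-in for the parser loop: pull lines until undef is returned.
my @Seen;
while (defined(my $line = $NextLine->())) {
    push @Seen, $line;
}
```

With a parser object at hand, the same closure would be passed as Parse($NextLine).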

Parse($String)

$String is a plain string. The tokens (.*\n) are extracted from this string and parsed one by one.

ParseIncremental($Str)

$Str is a string to be parsed. ParseIncremental() returns the name of the parameter that $Str corresponds to, or undef if the string is unrecognised.

In case of an error, $@ will contain an error message.

CheckRequired()

CheckRequired() checks whether all the parameters that have no 'Default' value exist in the $Result hash. It stops at the first one that does not and returns a false value. The $@ variable then contains the string "Required parameter PARAMETER_NAME is not defined".

If no 'problematic' parameters are found CheckRequired() returns a true value.

In addition to this check, CheckRequired() sets all undefined parameters to their 'Default' value.
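The behaviour described above can be emulated for a single flat section as follows. This is a rough sketch under stated assumptions, not the module's code: check_required() is a hypothetical helper, the sorted iteration order is arbitrary, and special cases such as ignored parameters, 'EndOfSection' and subsections are not handled.

```perl
use strict;
use warnings;

# Hypothetical emulation of the documented CheckRequired() behaviour.
sub check_required {
    my ($params, $result) = @_;
    for my $name (sort keys %$params) {
        next if exists $result->{$name};
        if (exists $params->{$name}{'Default'}) {
            $result->{$name} = $params->{$name}{'Default'};   # fill in default
            next;
        }
        $@ = "Required parameter $name is not defined";
        return 0;                                             # stop at first gap
    }
    return 1;
}

my %Params = ('Host' => {'Pattern' => 'host=(.+)',  'Type' => 'Uniq'},
              'Port' => {'Pattern' => 'port=(\d+)', 'Type' => 'Uniq',
                         'Default' => 80},
             );

my %Good = ('Host' => 'example.org');
my $ok   = check_required(\%Params, \%Good);    # true; $Good{'Port'} becomes 80

my %Bad  = ();
my $fail = check_required(\%Params, \%Bad);     # false; $@ names 'Host'
```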

PrintList($List, $Prefix, $Shift)

$List is a hash or array reference. $Prefix is a prefix substring. $Shift is a 'shift' substring (see below).

PrintList() produces a string which contains a human readable representation of a hash or array.

It descends into any hash or array references in the list. Embedded records are indented by one or more $Shift substrings, according to their nesting level.

All records are preceded by the $Prefix substring.

For example

  my @Tst = ('p 0.0',
             'p 0.1',
             {'p 1.0' => 'here',
              'p 1.1' => 'here too',
              'p 1.2' => ['p 2.0',
                          'p 2.1']},
             'p 0.3');
  print PrintList(\@Tst, '>', "\t");

will print

  >[0]    =  "p 0.0"
  >[1]    =  "p 0.1"
  >[2] hash
  >       'p 1.1' => "here too"
  >       'p 1.2' array
  >               [0]     =  "p 2.0"
  >               [1]     =  "p 2.1"
  >       'p 1.0' => "here"
  >[3]    =  "p 0.3"

All methods, including new(), return an 'undef' value in case of error. The $@ variable will contain an error explanation.

EXPORT

None by default.

:print

PrintList()

AUTHOR

Daniel Podolsky, <tpaba@cpan.org>

SEE ALSO

Tie::IxHash.