
Parse::File::Metadata - For plain-text files that contain both metadata and data records, parse metadata first

    use Parse::File::Metadata;

    # Hash that the rules below inspect.
    $metaref = {};

    # Criteria the header must meet before the body is parsed.
    @rules = (
        {
            rule  => sub { exists $metaref->{d}; },
            label => q{'d' key must exist},
        },
        {
            rule  => sub { $metaref->{d} =~ /^\d+$/; },
            label => q{'d' key must be non-negative integer},
        },
        {
            rule  => sub { exists $metaref->{f}; },
            label => q{'f' key must exist},
        },
    );

    $self = Parse::File::Metadata->new( {
        file         => 'path/to/myfile',
        header_split => '\s*=\s*',
        metaref      => $metaref,
        rules        => \@rules,
    } );

    # Subroutine applied to each data record in the body.
    $dataprocess = sub { my @fields = split /,/, $_[0], -1; print "@fields\n"; };

    # Parse the metadata and, if it meets the criteria, the data records ...
    $self->process_metadata_and_proceed( $dataprocess );

    # ... or parse the metadata only.
    $self->process_metadata_only();

    $metadata_out = $self->get_metadata();
    $exception    = $self->get_exception();

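This module is useful when you have to parse a plain-text file that consists of a header holding metadata and a body holding data records, and the metadata must meet certain criteria before the body is parsed.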
Below is a plain-text file in which the header consists of key-value pairs delimited by = signs. The key is the text to the left of the first delimiter. Everything to the right is part of the value (including any additional delimiter characters).
The body consists of comma-delimited strings. Whether in the body or the header, comments begin with a # sign and are ignored.
    # comment
    a=alpha
    b=beta,charlie,delta
    c=epsilon zeta eta
    d=1234567890
    e=This is a string
    f=,
    some,body,loves,me
    I,wonder,wonder,who
    could,it,be,you
Suppose you are told that you should proceed to parse the body if and only if the following conditions are met in the header:
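There must be a metadata element keyed on d.
The value of d must be a non-negative integer.
There must be a metadata element keyed on f.
This file would meet all three criteria and the program would proceed to parse the three data records.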
If, however, metadata element f were commented out:
    #f=,
the file would no longer meet the criteria and the program would cease before parsing the data records.
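The header convention described above (comments ignored, key taken from the left of the first delimiter only) can be sketched in plain Perl. This is an illustration of the convention, not the module's actual implementation: the helper name parse_header_line and the extra record g=x=y are hypothetical, and the regex is the header_split value shown in the synopsis.

```perl
use strict;
use warnings;

# Hypothetical helper: illustrates the convention, not the module's internals.
sub parse_header_line {
    my ($line, $header_split) = @_;
    return if $line =~ /^\s*#/;     # comments are ignored
    return unless $line =~ /\S/;    # skip blank lines
    # A limit of 2 splits at the first delimiter only, so any further
    # delimiter characters remain part of the value.
    my ($key, $value) = split /$header_split/, $line, 2;
    return ($key, $value);
}

my %metadata;
# 'g=x=y' is a made-up record showing a delimiter inside the value.
for my $line ('# comment', 'a=alpha', 'd=1234567890', 'f=,', 'g=x=y') {
    my ($key, $value) = parse_header_line($line, '\s*=\s*');
    $metadata{$key} = $value if defined $key;
}

print "$_ => $metadata{$_}\n" for sort keys %metadata;
```

The limit argument to split is what keeps g=x=y from being broken into three pieces; without it the value would be truncated at the second = sign.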

new()
Parse::File::Metadata constructor. Validates input.
    $self = Parse::File::Metadata->new( {
        file         => 'path/to/myfile',
        header_split => '\s*=\s*',
        metaref      => $metaref,
        rules        => \@rules,
    } );
Takes a single hash reference with the following elements:
file
    Path, relative or absolute, to the file to be parsed.
header_split
    Single-quoted string holding a Perl 5 regex used to split each metadata record into key and value.
metaref
    Reference to an empty hash. The rule subroutines test the contents of this hash.
rules
    Reference to an array of hashrefs, applied in the order specified in the array. Each hashref has two elements:
    rule
        Reference to a subroutine describing a criterion which the header must pass before parsing of the body begins. The subroutine returns a true value when the criterion is met and a false value when it is not.
    label
        A human-friendly string used to populate exceptions when the criterion is not met.
Returns a Parse::File::Metadata object.
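How an ordered list of rules might be evaluated against the metadata hash can be sketched as follows. This is a stand-alone illustration, not the module's code: the metadata is filled in by hand to mimic the sample file with element f commented out, and it assumes that a rule fails whenever its subroutine returns a false value.

```perl
use strict;
use warnings;

# Metadata as it might look after parsing the sample header
# with element 'f' commented out (filled in by hand here).
my $metaref = { a => 'alpha', d => '1234567890' };

my @rules = (
    {
        rule  => sub { exists $metaref->{d} },
        label => q{'d' key must exist},
    },
    {
        rule  => sub { $metaref->{d} =~ /^\d+$/ },
        label => q{'d' key must be non-negative integer},
    },
    {
        rule  => sub { exists $metaref->{f} },
        label => q{'f' key must exist},
    },
);

# Apply the rules in array order and collect the labels of those that fail.
# Assumption: failure means the subroutine returned a false value.
my @failed = map { $_->{label} } grep { !$_->{rule}->() } @rules;

print "Failed: $_\n" for @failed;
```

Because each rule closes over $metaref, the rules can be declared before the hash is populated and still see its final contents when they are run.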
process_metadata_and_proceed()
Process metadata rows found in the file header and test the resulting hash against the criteria specified in the rules. If all criteria are met, proceed to parse the data rows with the subroutine passed as the argument to this method.
    $dataprocess = sub { my @fields = split /,/, $_[0], -1; print "@fields\n"; };
    $self->process_metadata_and_proceed( $dataprocess );
Returns nothing. Use the get_metadata() and get_exception() methods to obtain the metadata and any exceptions.
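The data-processing subroutine receives one data row in $_[0]. The sketch below calls such a subroutine directly on a few rows to show what the -1 limit in split does; the loop is only a stand-in for whatever the module does internally, the @rows accumulator is an addition for inspection, and the last row is hypothetical.

```perl
use strict;
use warnings;

my @rows;    # collects the parsed records so they can be inspected afterwards

# Same shape as the callback above: one data row arrives in $_[0].
my $dataprocess = sub {
    my @fields = split /,/, $_[0], -1;    # -1 preserves trailing empty fields
    push @rows, \@fields;
};

# Stand-in for the module's loop over data rows; the last row is hypothetical.
$dataprocess->($_) for ('some,body,loves,me', 'I,wonder,wonder,who', 'trailing,,');

printf "%d fields: %s\n", scalar @$_, join('|', @$_) for @rows;
```

Without the -1 limit, split would drop the two trailing empty fields of the last row and report a single field.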
process_metadata_only()
Same as process_metadata_and_proceed(), except that it returns before beginning any processing of the data records.
    $self->process_metadata_only();
Takes no arguments.
get_metadata()
Access the metadata in the file's header section.
    $metadata_out = $self->get_metadata();
Takes no arguments.
Returns the hash of metadata found in the file's header.
get_exception()
Access the reasons, if any, why the file failed to meet the specified criteria.
    $exception = $self->get_exception();
Takes no arguments.
Returns a reference to an array holding the labels of the rules that the metadata failed.
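One way to act on that array reference is sketched below. The helper summarize_exception is hypothetical, and the two array references are hand-written stand-ins for what get_exception() might return. The sketch assumes the array is empty when every rule passes.

```perl
use strict;
use warnings;

# Hypothetical helper: turns the array reference into a one-line report.
sub summarize_exception {
    my ($exception) = @_;
    return 'metadata passed all rules' unless $exception && @$exception;
    return 'metadata failed: ' . join('; ', @$exception);
}

# Stand-in values; in real use these would come from $self->get_exception().
my $passed = summarize_exception( [] );
my $failed = summarize_exception( [ q{'f' key must exist} ] );

print "$passed\n$failed\n";
```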


James E Keenan
CPAN ID: jkeenan
Perl Seminar NY
jkeenan@cpan.org
http://thenceforward.net/perl/modules/Parse-File-Metadata

Copyright 2010 James E Keenan
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.

See also: perl(1).