Tie::FieldVals - an array tie for a file of enhanced Field:Value data
This describes version 0.6203 of Tie::FieldVals.
use Tie::FieldVals;
use Tie::FieldVals::Row;

# tie the array
my @records;
my $recs_obj = tie @records, 'Tie::FieldVals', datafile=>$datafile;

# object methods
my @field_names = $recs_obj->field_names();
This is a Tie object that maps the records in an enhanced Field:Value data file into an array. Each file has multiple records, and each record has its values defined by Field:Value pairs, with two enhancements: (a) the Value part can extend over more than one line (because the Field names are predefined), and (b) a Field can have multiple values, by repeating the Field:Value part for that field.
Because of its use of the Tie::File module, access to each record is reasonably fast. The Tie::File module also ensures that (a) the whole file doesn't have to be read into memory, (b) record changes are written to the file straight away, and (c) record changes don't require the whole file to be rewritten, just the part of the file after the change.
The advantage of this setup is that one can have useful data files which are plain text, human-readable and human-editable, and which can at the same time be accessed faster than XML (I know, I wrote a version of my reporting software using XML data, and even the fastest XML parsers weren't as fast as this setup once there were a reasonable number of records). This also has advantages over a simpler setup where values are given one per line with no indication of which value belongs to which field; the problems with that setup are that it is harder to fix corrupted data by hand, it is harder to add new fields, and one can't have multi-line data.
It is likewise better than a CSV (Comma-Separated Values) file, because again, with a CSV file, the data is positional and therefore harder to fix and harder to change, and again one can't have multi-line data.
This module is both better and worse than file-oriented databases like DB_File and its variants and extensions (such as MLDBM). This module does not require that each record have a unique key, and the fact that a DBM file is binary makes it not only less correctable, but also less portable. On the downside, this module isn't as fast.
Naturally, if one's data needs are more complex, it is probably better to use a fully-fledged database; this is oriented towards those who don't wish to have the overhead of setting up and maintaining a relational database server, and wish to use something more straightforward.
This comes bundled with other support modules, such as the Tie::FieldVals::Row module. The Tie::FieldVals::Select module is for selecting and sorting a subset of a Tie::FieldVals array, and the Tie::FieldVals::Join module provides a very simple method of joining two files on a common field.
This distribution includes the fv2xml script, which converts a Tie::FieldVals data file into an XML file, and xml2fv which converts an XML file into a Tie::FieldVals data file.
The data file is in the form of Field:Value pairs, with each record separated by a line with '=' on it. The first record is an "empty" record, which just contains the field names; this lets us know what the legal fields are. A line which doesn't start with a recognised field is considered to be part of the value of the most recent Field.
Name:
Entry:
=
Name:fanzine
Entry:Fanzines are amateur magazines produced by fans.
=
Name:fan fiction (fanfic)
Entry:Original fiction written by fans of a particular TV Show/Movie
set in the universe depicted by that work.
=
The first record just contains Name: and Entry: fields to show that those are the legal fields for this file. The third record gives an example of a value that goes over more than one line.
Author:
AuthorEmail:
AuthorURL:
AuthorURLName:
=
Author:Adele
AuthorEmail:firstname.lastname@example.org
AuthorEmail:email@example.com
AuthorURL:
AuthorURLName:
=
Author:Danzer,Brenda
AuthorEmail:
AuthorURL:http://www.example.com/~danzer
AuthorURLName:Danzer Dancing
AuthorURL:http://www.brendance.com/
AuthorURLName:BrenDance
=
This one gives examples of multi-valued fields.
Field names cannot have spaces in them; indeed, they must consist only of plain alphanumeric characters or underscores. They are case-sensitive.
The record separator (=) must be on a line by itself, and the last record in the file must also have a record-separator after it.
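The format rules above can be illustrated with a small stand-alone parser. This is only a sketch of the rules as described here, not the code Tie::FieldVals itself uses: the function name parse_fieldvals, the choice to store every field as an array reference of values, and the decision to keep empty values as empty strings are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Tie::FieldVals): parse Field:Value text
# into (\@field_names, \@records). Each record maps a field name to an
# array ref of its values, so repeated fields keep every value.
sub parse_fieldvals {
    my ($text) = @_;
    my (@fields, %legal, @records);
    my $in_header = 1;     # the first record only declares field names
    my $rec       = {};    # record currently being built
    my $cur;               # name of the field whose value is being read

    for my $line (split /\n/, $text) {
        if ($line eq '=') {                    # record separator
            push @records, $rec unless $in_header;
            $in_header = 0;
            ($rec, $cur) = ({}, undef);
        }
        elsif ($in_header) {
            # Field names: plain alphanumerics or underscores only.
            if ($line =~ /\A([A-Za-z0-9_]+):/) {
                push @fields, $1;
                $legal{$1} = 1;
            }
        }
        elsif ($line =~ /\A([A-Za-z0-9_]+):(.*)\z/ && $legal{$1}) {
            $cur = $1;                         # start a new value
            push @{ $rec->{$cur} }, $2;
        }
        elsif (defined $cur) {
            # Not a recognised field: continue the most recent value.
            $rec->{$cur}[-1] .= "\n$line";
        }
    }
    return (\@fields, \@records);
}

my $data = <<'END';
Name:
Entry:
=
Name:fanzine
Entry:Fanzines are amateur magazines produced by fans.
=
Name:fan fiction (fanfic)
Entry:Original fiction written by fans of a particular TV Show/Movie
set in the universe depicted by that work.
=
END

my ($fields, $records) = parse_fieldvals($data);
print "fields: @$fields\n";
print "records: ", scalar(@$records), "\n";
print "last entry:\n$records->[-1]{Entry}[0]\n";
```

Because a record is only stored when its closing separator is seen, the sketch also shows why the last record in the file needs a record separator after it.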
my @field_names = Tie::FieldVals::find_field_names($datafile);
Read the field-name information from the file, if the file exists and is readable.
Get the field names of this data.
my @field_names = $recs_obj->field_names();
Locks the data file. "MODE" has the same meaning as the second argument to the Perl built-in "flock" function; for example "LOCK_SH" or "LOCK_EX | LOCK_NB". (These constants are provided by the "use Fcntl ':flock';" declaration.)
"MODE" is optional; the default is "LOCK_EX".
When you use "flock" to lock the file, "Tie::FieldVals" assumes that the record cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to "flock" discards the contents of the record cache.
The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.
See "flock" in Tie::File for more information (this calls the flock method of that module).
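As a rough illustration of the locking pattern, the sketch below uses the flock method of Tie::File directly, since that is the method this module delegates to. It assumes that a Tie::FieldVals object is locked the same way (for example $recs_obj->flock(LOCK_EX | LOCK_NB)); the scratch file and its contents are made up for the example.

```perl
use strict;
use warnings;
use Fcntl ':flock';          # provides LOCK_SH, LOCK_EX, LOCK_NB
use File::Temp qw(tempfile);
use Tie::File;

# Scratch file so the sketch is self-contained.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "first line\n";
close $fh;

my @lines;
my $obj = tie @lines, 'Tie::File', $filename
    or die "Cannot tie $filename: $!";

# Non-blocking exclusive lock; flock returns false if the lock is not granted.
my $locked = $obj->flock(LOCK_EX | LOCK_NB);
die "Could not lock $filename: $!" unless $locked;

$lines[0] = 'changed while locked';

# Release the lock by discarding the object and untying the array.
undef $obj;
untie @lines;
```

Unlocking by untying follows the advice above: it avoids leaving changes unwritten inside the object while the file is no longer locked.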
Create a new instance of the object as tied to an array.
tie @people, 'Tie::FieldVals', datafile=>$datafile;

tie @people, 'Tie::FieldVals', datafile=>$datafile,
    mode=>O_RDONLY, cache_size=>1000, memory=>0;

tie @people, 'Tie::FieldVals', datafile=>$datafile,
    fields=>[qw(Name Email)], mode=>(O_RDWR|O_CREAT);

tie @people, 'Tie::FieldVals', datafile=>$datafile,
    mode=>O_RDWR, cache_all=>1;
The file with the data in it. (required)
Field definitions for creating a new file. This is ignored if the file already exists.
The mode to open the file with. O_RDONLY means that the file is read-only. O_RDWR means that the file is read-write. (default: O_RDONLY)
If true, cache all the records in the file. This will speed things up, but consume more memory. (default: false)
Note that this merely sets the cache_size to the size of the file when the tie is initially made: if you add more records to the file, the cache size will not be increased.
The size of the cache (if we aren't caching all the records). (default: 100) As ever, there is a trade-off between space and time.
The upper limit on the memory consumed by Tie::File. (See Tie::File). (default: 10,000,000)
Note that there are two caches: the cache of unparsed records maintained by Tie::File, and the cache of parsed records maintained by Tie::FieldVals. The memory option affects the Tie::File cache, and the cache_* options affect the Tie::FieldVals cache.
Get a row from the array.
$val = $array[$ind];
Returns a reference to a Tie::FieldVals::Row hash, or undef.
Add a value to the array. Value must be a Tie::FieldVals::Row hash.
$array[$ind] = $val;
If $ind is beyond the end of the array, the value is simply pushed onto the end; the array is not extended to reach $ind.
Get the size of the array.
Set the size of the array, if the file is writeable.
Delete the value at $ind if the file is writeable.
@array = ();
Clear the array if the file is writeable.
Untie the array.
This documentation is for developer reference only.
Set debugging on.
For debugging: say who called this.
Set the field names in the data-file to be the given field names. (Assumes the file didn't exist before).
Test::More
Carp
Tie::Array
Tie::File
Fcntl
Data::Dumper
Getopt::Long
Pod::Usage
Getopt::ArgvFile
File::Basename
To install this module, run the following commands:
perl Build.PL
./Build
./Build test
./Build install
Or, if you're on a platform (like DOS or Windows) that doesn't like the "./" notation, you can do this:
perl Build.PL
perl Build
perl Build test
perl Build install
To install somewhere other than the default, such as a directory under your home directory like "/home/fred/perl", run
perl Build.PL --install_base /home/fred/perl
as the first step instead.
This will install the files underneath /home/fred/perl.
You will then need to alter the PERL5LIB variable so that Perl can find the modules, and the PATH variable so that the shell can find the scripts. That is, you will need to change:
your PATH, to include /home/fred/perl/script (where the scripts will be)
the PERL5LIB variable, to add /home/fred/perl/lib
Please report any bugs or feature requests to the author.
Kathryn Andersen (RUBYKAT)
perlkat AT katspace dot com
http://www.katspace.com
Copyright (c) 2004-2008 by Kathryn Andersen
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.