
    use IO::File;
    use Text::TEI::Collate;

    my $aligner = Text::TEI::Collate->new();

    # Read from strings.
    my @collated_texts = $aligner->align( $string1, $string2 );  # ... up to $stringN

    # Read from filehandles.
    my $fh1 = IO::File->new();
    $fh1->open( $first_file, "<:utf8" );
    my $fh2 = IO::File->new();
    $fh2->open( $second_file, "<:utf8" );
    # ...
    my @collated_from_fh = $aligner->align( $fh1, $fh2 );  # ... up to $fhN

Text::TEI::Collate is an early-stage collation program for multiple (transcribed) manuscript copies of a known text. Its interface is object-oriented, mostly for the convenience of the author and to allow global settings.
The object is the alignment engine, or "aligner". The method that a user will care about is "align"; the other methods in this file are public in case a user needs a subset of this package's functionality.
An aligner takes two or more texts; the texts can either be strings or IO::File objects. It returns two or more arrays -- one for each text input -- in which identical and similar words are lined up with each other, via empty-string padding.
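As a rough illustration of the padded output described above, the sketch below prints a set of aligned arrays as a column-aligned table. It does not call Text::TEI::Collate at all: the contents of @aligned are made-up sample data standing in for what "align" might return, the helper name format_alignment is hypothetical, and each text is assumed to be an array reference.

```perl
use strict;
use warnings;

# Made-up sample data (NOT real output from the module): one array per
# input text, all the same length, with '' marking a padded gap.
my @aligned = (
    [ 'in', 'the', 'beginning', 'was', 'the', 'word' ],
    [ 'in', '',    'beginning', 'was', 'the', 'word' ],
    [ 'in', 'the', 'begynning', 'was', '',    'word' ],
);

# Hypothetical helper: render each row as one line of text, padding every
# column to the width of its longest word and showing gaps as '-'.
sub format_alignment {
    my @rows     = @_;
    my $last_col = $#{ $rows[0] };

    # Column width is the longest word in that column (minimum 1 for '-').
    my @width;
    for my $col ( 0 .. $last_col ) {
        my $max = 1;
        for my $row (@rows) {
            my $len = length( $row->[$col] );
            $max = $len if $len > $max;
        }
        push @width, $max;
    }

    my @lines;
    for my $row (@rows) {
        my @cells;
        for my $col ( 0 .. $last_col ) {
            my $word = length( $row->[$col] ) ? $row->[$col] : '-';
            push @cells, sprintf( '%-*s', $width[$col], $word );
        }
        push @lines, join( ' ', @cells );
    }
    return @lines;
}

print "$_\n" for format_alignment(@aligned);
```

Reading down a column shows which words the aligner has lined up with each other; a '-' marks a position where that text has no corresponding word.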
* TODO: describe word objects

"new" creates a new aligner object. It takes a hash of options; available options are listed.
"align" is the meat of the program. It takes a list of strings or a list of IO::File objects. (The latter is useful if the text you are collating is particularly long.) It returns a list of collated texts. Currently each "text" is simply a list of words, padded for collation with empty strings; soon it will be a list of word objects, which I have yet to describe.
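One thing a caller might do with the collated texts is locate the positions where the witnesses disagree. The following is a minimal sketch under the same assumptions as the description above (each text is an array reference of equal length, padded with empty strings). The sample data is invented for illustration and the helper variant_columns is hypothetical; it is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: return the indices of columns in which the texts
# do not all carry exactly the same string (a gap '' counts as a reading).
sub variant_columns {
    my @rows = @_;
    my @variants;
    for my $col ( 0 .. $#{ $rows[0] } ) {
        my %seen;
        $seen{ $_->[$col] } = 1 for @rows;
        push @variants, $col if scalar( keys %seen ) > 1;
    }
    return @variants;
}

# Made-up sample data standing in for the result of $aligner->align(...).
my @aligned = (
    [ 'in', 'the', 'beginning', 'was', 'the', 'word' ],
    [ 'in', '',    'beginning', 'was', 'the', 'word' ],
    [ 'in', 'the', 'begynning', 'was', '',    'word' ],
);

for my $col ( variant_columns(@aligned) ) {
    my @readings = map { length( $_->[$col] ) ? $_->[$col] : '(gap)' } @aligned;
    print "column $col: ", join( ' | ', @readings ), "\n";
}
```

Because the aligner lines up similar words as well as identical ones, an exact string comparison like this reports spelling variants (here "beginning" against "begynning") alongside omissions.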


Tara L Andrews <aurum@cpan.org>