Treex::Block::Read::BaseAlignedReader - abstract ancestor for parallel-corpora document readers
version 0.08170
# in scenarios
Read::MyAlignedFormat en=english.txt de=german.txt

# Zones can differ also in selectors, any number of zones can be read
Read::MyAlignedFormat en_ref=ref1,ref2 en_moses=mos1,mos2 en_tectomt=tmt1,tmt2
This class serves as a common ancestor for document readers that read several zones at once, usually parallel sentences in two (or more) languages. The readers take parameters named after the zones; the value of each parameter is a space- or comma-separated list of filenames to be loaded into the given zone. The class is designed to implement the Treex::Core::DocumentReader interface.
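The parameter convention described above can be illustrated with a minimal, Treex-independent sketch. The helper split_filenames and the hash %files_for are hypothetical names chosen for this example and are not part of the Treex API; the actual parsing inside the class may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Treex): split one parameter value
# into filenames on commas and/or whitespace, dropping empty pieces.
sub split_filenames {
    my ($value) = @_;
    return grep { length } split /[\s,]+/, $value;
}

# Zone labels mapped to raw parameter values, as they might appear in a scenario.
my %params = (
    en_ref   => 'ref1,ref2',
    en_moses => 'mos1 mos2',
);

# Zone label => array reference of filenames.
my %files_for = map { $_ => [ split_filenames( $params{$_} ) ] } keys %params;

for my $zone ( sort keys %files_for ) {
    print "$zone: @{ $files_for{$zone} }\n";
}
```

Each zone ends up with its own ordered list of filenames, so the n-th file of every zone can later be loaded into the n-th document.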
In derived classes you need to define the next_document method, and you can use the next_filenames and new_document methods.
A space- or comma-separated list of filenames, or - for STDIN.
How to name the loaded documents. This attribute will be saved to the same-named attribute in documents and it will be used in document writers to decide where to save the files.
next_document must be overridden in derived classes. (The implementation in this class just issues a fatal error.)
next_filenames returns a hashref of filenames (full paths) to be loaded. The keys of the hash are zone labels, the values are the filenames.
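To make the shape of that return value concrete, here is a rough standalone sketch of a per-document iterator. It is not the Treex implementation: next_filenames_sketch, %files_for and $file_number are illustrative stand-ins, the filenames are invented, and no path resolution is performed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the reader's state: zone label => filenames.
my %files_for = (
    en => [ 'a.en.txt', 'b.en.txt' ],
    de => [ 'a.de.txt', 'b.de.txt' ],
);
my $file_number = 0;

# Illustrative counterpart of next_filenames: returns { zone => filename }
# for the next document, or undef once any zone has no more files.
sub next_filenames_sketch {
    return if !%files_for;
    my %current;
    for my $zone ( keys %files_for ) {
        my $file = $files_for{$zone}[$file_number];
        return if !defined $file;
        $current{$zone} = $file;
    }
    $file_number++;
    return \%current;
}

# One hashref per document, keyed by zone label.
while ( my $filenames = next_filenames_sketch() ) {
    print join( ' ', map {"$_=$filenames->{$_}"} sort keys %$filenames ), "\n";
}
```

A derived reader's next_document would typically consume one such hashref per call and load each file into the zone named by its key.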
new_document returns a new empty document with the attributes loaded_from, file_stem, file_number and path pre-filled; their values are guessed from current_filenames.
current_filenames returns the filenames most recently returned by next_filenames.
Returns the number of documents that will be read by this reader.
Treex::Block::Read::BaseReader, Treex::Block::Read::BaseAlignedTextReader
Martin Popel
Copyright © 2011 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.