
NAME

Treex::Core::DocumentReader - interface for all document readers

VERSION

version 0.08663

DESCRIPTION

Document readers are the Treex mechanism for loading documents to be processed. Documents can be stored in files (in various formats), read from STDIN, retrieved from a socket, and so on.

METHODS

To be implemented

These methods must be implemented in classes that consume this role.

next_document

Returns the next document (a Treex::Core::Document).

number_of_documents

Total number of documents that will be produced by this reader. If the number is unknown in advance, undef should be returned.

Already implemented

is_current_document_for_this_job

Is the document most recently returned by $self->next_document() supposed to be processed by this job? Job indices and document numbers are 1-based. For example, with jobs = 5 and jobindex = 3 we want to load documents 3, 8, 13, 18, ...; with jobs = 5 and jobindex = 5 we want to load documents 5, 10, 15, 20, ... In general, a document belongs to this job when (doc_number-1) % jobs == (jobindex-1).
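The selection rule above can be checked with a few lines of plain Perl. This is only an illustrative sketch: is_document_for_job is a hypothetical helper name, not a function provided by Treex, and it takes the values of doc_number, jobs and jobindex as plain arguments.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Treex): true if the document with the
# given 1-based number belongs to the job with the given 1-based index.
sub is_document_for_job {
    my ($doc_number, $jobs, $jobindex) = @_;
    return ( ( $doc_number - 1 ) % $jobs ) == ( $jobindex - 1 );
}

# Document numbers among the first 20 that job 3 of 5 would process.
my @selected = grep { is_document_for_job( $_, 5, 3 ) } 1 .. 20;
print "@selected\n";    # 3 8 13 18
```

Subtracting 1 on both sides is what keeps the rule correct for the last job: with jobindex equal to jobs, a plain doc_number % jobs would give 0 and never match.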

next_document_for_this_job

Returns the next document that should be processed by this job. If jobindex is set, documents belonging to other jobs are skipped, so only every jobs-th document is returned. See is_current_document_for_this_job.
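How these methods fit together can be sketched with a plain-Perl stand-in. Everything below is an assumption made for illustration: My::ListReader is a hypothetical class, it does not consume the real Moose role, plain strings stand in for Treex::Core::Document objects, and returning undef once the documents are exhausted is a choice of this sketch rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical in-memory reader; it mimics the method names described above.
package My::ListReader {
    sub new {
        my ( $class, %args ) = @_;
        my $self = {
            docs       => $args{docs},
            jobs       => $args{jobs},
            jobindex   => $args{jobindex},
            doc_number => 0,    # 1-based number of the last returned document
        };
        return bless $self, $class;
    }

    # The two methods a concrete reader has to supply.
    sub next_document {
        my ($self) = @_;
        return if $self->{doc_number} >= @{ $self->{docs} };
        return $self->{docs}[ $self->{doc_number}++ ];
    }

    sub number_of_documents {
        my ($self) = @_;
        return scalar @{ $self->{docs} };
    }

    # Stand-ins for the methods described as already implemented.
    sub is_current_document_for_this_job {
        my ($self) = @_;
        return 1 if !defined $self->{jobindex};    # no job splitting requested
        return ( ( $self->{doc_number} - 1 ) % $self->{jobs} )
            == ( $self->{jobindex} - 1 );
    }

    sub next_document_for_this_job {
        my ($self) = @_;
        while ( defined( my $doc = $self->next_document ) ) {
            return $doc if $self->is_current_document_for_this_job;
        }
        return;
    }

    sub restart {
        my ($self) = @_;
        $self->{doc_number} = 0;
        return;
    }
}

my @docs;
push @docs, "doc$_" for 1 .. 12;

my $reader = My::ListReader->new( docs => \@docs, jobs => 5, jobindex => 3 );
my @seen;
while ( defined( my $doc = $reader->next_document_for_this_job ) ) {
    push @seen, $doc;
}
print "@seen\n";    # doc3 doc8
```

Skipping is done by reading and discarding documents that belong to other jobs, which keeps doc_number in step with the underlying source.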

number_of_documents_per_this_job

Total number of documents that will be produced by this reader for this job. It is computed from number_of_documents, jobindex and jobs.
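One way to arrive at that count follows directly from the modulo rule; the sketch below is a derivation, not the module's actual code. documents_for_job is a hypothetical name, and returning undef for an unknown total and the full total when no jobindex is set are assumptions of this example.

```perl
use strict;
use warnings;

# Hypothetical helper: how many of the documents 1..$total satisfy
# (doc_number-1) % jobs == (jobindex-1).
sub documents_for_job {
    my ( $total, $jobs, $jobindex ) = @_;
    return undef  if !defined $total;       # unknown total stays unknown (assumption)
    return $total if !defined $jobindex;    # no job splitting (assumption)
    return 0      if $jobindex > $total;    # this job's first document does not exist
    # The job's documents are jobindex, jobindex+jobs, jobindex+2*jobs, ...
    return int( ( $total - $jobindex ) / $jobs ) + 1;
}

printf "%d\n", documents_for_job( 20, 5, 3 );    # 4
printf "%d\n", documents_for_job( 12, 5, 3 );    # 2
```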

restart

Start reading again from the first document. This implementation only resets the doc_number attribute to zero. To add further behavior, use a Moose method modifier such as after 'restart'.

SEE ALSO

Treex::Block::Read::Sentences, Treex::Block::Read::Text, Treex::Block::Read::Treex

AUTHOR

Martin Popel <popel@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

Copyright © 2011 by Institute of Formal and Applied Linguistics, Charles University in Prague

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
