Lucy::Index::DataWriter - Write data to an index.
# Abstract base class.
DataWriter is an abstract base class for writing index data, generally in segment-sized chunks. Each component of an index – e.g. stored fields, lexicon, postings, deletions – is represented by a DataWriter/DataReader pair.
Components may be specified per index by subclassing Architecture.
    my $writer = MyDataWriter->new(
        snapshot   => $snapshot,      # required
        segment    => $segment,       # required
        polyreader => $polyreader,    # required
    );
Abstract constructor.
snapshot - The Snapshot that will be committed at the end of the indexing session.
segment - The Segment in progress.
polyreader - A PolyReader representing all existing data in the index. (If the index is brand new, the PolyReader will have no sub-readers.)
    $data_writer->add_segment(
        reader  => $reader,     # required
        doc_map => $doc_map,    # default: undef
    );
Add content from an existing segment into the one currently being written.
reader - The SegReader containing content to add.
doc_map - An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
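To illustrate the doc_map convention described above, here is a minimal sketch in plain Perl. It models the map as an ordinary array indexed by old document id, which is an assumption made for illustration and not necessarily the type Lucy passes in. The helpers build_doc_map and remap_docs are hypothetical names, and the sketch assumes document ids start at 1.

```perl
use strict;
use warnings;

# Hypothetical helper: build a map from old doc ids to new doc ids.
# Assumes doc ids start at 1, so index 0 of the map is unused.
#   $doc_max - highest doc id in the old segment
#   $deleted - hashref whose keys are deleted doc ids
#   $offset  - number of docs already present in the new segment
sub build_doc_map {
    my ( $doc_max, $deleted, $offset ) = @_;
    my @doc_map = (0);
    my $next_id = $offset;
    for my $old_id ( 1 .. $doc_max ) {
        # Deleted documents map to 0 so that callers skip them.
        $doc_map[$old_id] = $deleted->{$old_id} ? 0 : ++$next_id;
    }
    return \@doc_map;
}

# Hypothetical helper: copy surviving docs, keyed by their new ids.
sub remap_docs {
    my ( $docs, $doc_map ) = @_;
    my %remapped;
    for my $old_id ( sort { $a <=> $b } keys %$docs ) {
        my $new_id = $doc_map->[$old_id] or next;    # 0 means deleted
        $remapped{$new_id} = $docs->{$old_id};
    }
    return \%remapped;
}

# Four old docs, doc 2 deleted, ten docs already in the new segment.
my $doc_map = build_doc_map( 4, { 2 => 1 }, 10 );
my $merged  = remap_docs( { 1 => 'a', 2 => 'b', 3 => 'c', 4 => 'd' }, $doc_map );
```

In this sketch, old doc 2 maps to 0 and is dropped, while the surviving documents receive consecutive new ids after the existing ten.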
$data_writer->finish();
Complete the segment: close all streams, store metadata, etc.
my $int = $data_writer->format();
Every writer must specify a file format revision number, which should increment each time the format changes. Responsibility for revision checking is left to the companion DataReader.
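As a rough sketch of the revision check that is left to the companion reader, the following plain-Perl example rejects metadata whose format revision is newer than the reader understands. The function check_format and the variable $MAX_SUPPORTED_FORMAT are hypothetical and not part of Lucy. The sketch also assumes the revision is available under a "format" key in a metadata hash.

```perl
use strict;
use warnings;

# Hypothetical: highest format revision this reader understands.
my $MAX_SUPPORTED_FORMAT = 2;

# Hypothetical check a companion reader might run against stored
# metadata before reading any segment files.
sub check_format {
    my ($metadata) = @_;
    my $format = $metadata->{format};
    die "Missing 'format' in metadata\n" unless defined $format;
    die "Unsupported format revision $format (max $MAX_SUPPORTED_FORMAT)\n"
        if $format > $MAX_SUPPORTED_FORMAT;
    return $format;
}

# A supported revision passes; a newer one raises an exception.
my $ok    = check_format( { format => 1 } );
my $error = eval { check_format( { format => 3 } ); 1 } ? '' : $@;
```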
$data_writer->delete_segment($reader);
Remove a segment’s data. The default implementation is a no-op, as all files within the segment directory will be automatically deleted. Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.
reader - The SegReader containing content to merge, which must represent a segment that is part of the current snapshot.
    $data_writer->merge_segment(
        reader  => $reader,     # required
        doc_map => $doc_map,    # default: undef
    );
Move content from an existing segment into the one currently being written.
The default implementation calls add_segment() then delete_segment().
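The call order described above can be sketched with a stand-in class. SketchWriter is a hypothetical name and is not Lucy's implementation. It uses a plain blessed hash and string placeholders for readers, purely to record which methods run and in what order.

```perl
use strict;
use warnings;

# Stand-in class for illustration only; it is not Lucy's implementation
# and simply records which methods were called, in order.
package SketchWriter;

sub new { return bless { calls => [] }, shift }

sub add_segment {
    my ( $self, %args ) = @_;
    push @{ $self->{calls} }, "add_segment($args{reader})";
}

sub delete_segment {
    my ( $self, $reader ) = @_;
    push @{ $self->{calls} }, "delete_segment($reader)";
}

# Mirrors the documented default: add the content, then delete the source.
sub merge_segment {
    my ( $self, %args ) = @_;
    $self->add_segment(%args);
    $self->delete_segment( $args{reader} );
}

package main;

my $sketch = SketchWriter->new;
$sketch->merge_segment( reader => 'seg_1', doc_map => undef );
my @calls = @{ $sketch->{calls} };
```

A subclass that can move data more cheaply than copying it might override merge_segment() directly instead of relying on this two-step default.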
my $hashref = $data_writer->metadata();
Arbitrary metadata to be serialized and stored by the Segment. The default implementation supplies a hash with a single key-value pair for “format”.
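One way a subclass might extend the default metadata is sketched below with stand-in classes. SketchBase and SketchCustom are hypothetical and only imitate the documented behaviour: a hash with a single "format" entry. The doc_count key is likewise an invented example, not a key Lucy defines.

```perl
use strict;
use warnings;

# Stand-in base class, for illustration only (not Lucy's code).
package SketchBase;

sub new    { return bless {}, shift }
sub format { return 1 }

# Imitates the documented default: a single "format" entry.
sub metadata {
    my $self = shift;
    return { format => $self->format };
}

# Hypothetical subclass that bumps the revision and adds its own entry.
package SketchCustom;
our @ISA = ('SketchBase');

sub format { return 2 }

sub metadata {
    my $self = shift;
    my $meta = $self->SUPER::metadata();
    $meta->{doc_count} = 42;    # arbitrary example value
    return $meta;
}

package main;

my $base_meta   = SketchBase->new->metadata;
my $custom_meta = SketchCustom->new->metadata;
```

Because the base method calls format() on the object, the subclass's revision number ends up in the hash without the subclass having to set it explicitly.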
my $snapshot = $data_writer->get_snapshot();
Accessor for “snapshot” member var.
my $segment = $data_writer->get_segment();
Accessor for “segment” member var.
my $poly_reader = $data_writer->get_polyreader();
Accessor for “polyreader” member var.
my $schema = $data_writer->get_schema();
Accessor for “schema” member var.
my $folder = $data_writer->get_folder();
Accessor for “folder” member var.
Lucy::Index::DataWriter isa Clownfish::Obj.