NAME

Plucene::Index::Writer - write an index.

SYNOPSIS

        my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

        $writer->add_document($doc);
        $writer->add_indexes(@dirs);

        $writer->optimize; # called before close
        
        my $doc_count = $writer->doc_count;

        my $mergefactor = $writer->mergefactor;

        $writer->set_mergefactor($value);

DESCRIPTION

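This is the writer class. It adds documents to an index, merges index segments, and merges other indexes into the current one.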

If an index will not have more documents added for a while and optimal search performance is desired, then the optimize method should be called before the index is closed.

METHODS

new

        my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

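This will create a new Plucene::Index::Writer object.

$path is the directory that holds the index, and $analyser is the analyser object used to tokenize the text of added documents. The third argument determines whether a new index is created (true value) or an existing index is opened so that new documents can be added (false value).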
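
As a minimal sketch of both modes of the constructor, the following creates an index and then reopens it. It assumes Plucene::Analysis::SimpleAnalyzer (part of the Plucene distribution, not described on this page) as the analyser and a temporary directory as a stand-in for a real index path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Index::Writer;

# Hypothetical index location: a temporary directory removed at exit.
my $path     = tempdir(CLEANUP => 1);
my $analyser = Plucene::Analysis::SimpleAnalyzer->new;

# True third argument: create a new index at $path.
my $writer = Plucene::Index::Writer->new($path, $analyser, 1);
undef $writer;    # let go of the writer before reopening the index

# False third argument: open the existing index to add more documents.
$writer = Plucene::Index::Writer->new($path, $analyser, 0);
```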

mergefactor / set_mergefactor

        my $mergefactor = $writer->mergefactor;

        $writer->set_mergefactor($value);

Get / set the mergefactor. It defaults to 5.
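
A short sketch of the accessor pair, under the same assumptions as elsewhere (a temporary index directory and Plucene::Analysis::SimpleAnalyzer, neither of which is prescribed by this class). Reading the value from a fresh writer shows the default used by the installed version; the value 20 is an arbitrary illustration, not a recommendation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Index::Writer;

# Hypothetical index location for the example.
my $path   = tempdir(CLEANUP => 1);
my $writer = Plucene::Index::Writer->new(
	$path, Plucene::Analysis::SimpleAnalyzer->new, 1);

# Read the default before changing anything.
print "default mergefactor: ", $writer->mergefactor, "\n";

# Arbitrary example value.
$writer->set_mergefactor(20);
print "mergefactor is now: ", $writer->mergefactor, "\n";
```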

doc_count

        my $doc_count = $writer->doc_count;

Returns the number of documents currently in the index.

add_document

        $writer->add_document($doc);

Adds a document to the index. After the document has been added, a merge takes place if there are more than $Plucene::Index::Writer::mergefactor segments in the index. This defaults to 10, but can be set to whatever value is optimal for your application.
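
The sketch below builds one document and adds it. Plucene::Document and Plucene::Document::Field come from the Plucene distribution but are not described on this page; the field names (content, id), the sample text and the temporary index directory are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Document;
use Plucene::Document::Field;
use Plucene::Index::Writer;

# Hypothetical index location for the example.
my $path   = tempdir(CLEANUP => 1);
my $writer = Plucene::Index::Writer->new(
	$path, Plucene::Analysis::SimpleAnalyzer->new, 1);

# Build a document with an analysed text field and an identifier field.
my $doc = Plucene::Document->new;
$doc->add(Plucene::Document::Field->Text(content => "The quick brown fox"));
$doc->add(Plucene::Document::Field->Keyword(id => "doc-1"));

$writer->add_document($doc);
print "documents in index: ", $writer->doc_count, "\n";
```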

optimize

        $writer->optimize;

Merges all segments together into a single segment, optimizing an index for search. This should be the last method called on an indexer, as it invalidates the writer object.
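
A minimal sketch of the intended order of calls: add every document first, optimize once, then discard the writer without calling anything else on it. The sample texts, the content field name, the temporary directory and the choice of Plucene::Analysis::SimpleAnalyzer are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Document;
use Plucene::Document::Field;
use Plucene::Index::Writer;

# Hypothetical index location for the example.
my $path   = tempdir(CLEANUP => 1);
my $writer = Plucene::Index::Writer->new(
	$path, Plucene::Analysis::SimpleAnalyzer->new, 1);

# Add all documents before optimizing.
for my $text ("first document", "second document", "third document") {
	my $doc = Plucene::Document->new;
	$doc->add(Plucene::Document::Field->Text(content => $text));
	$writer->add_document($doc);
}

# Last call on this writer: merge everything into a single segment.
$writer->optimize;
undef $writer;    # the writer is not used again after optimize
```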

add_indexes

        $writer->add_indexes(@dirs);

Merges all segments from an array of indexes into this index.

This may be used to parallelize batch indexing. A large document collection can be broken into sub-collections. Each sub-collection can be indexed in parallel, on a different thread, process or machine. The complete index can then be created by merging sub-collection indexes with this method.

After this completes, the index is optimized.
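
The batch workflow described above might look like the following sketch. It assumes that @dirs holds the directory paths of the sub-indexes and that each sub-index writer has been released before the merge. build_sub_index is a hypothetical helper, and the sub-indexes are built one after another here for brevity rather than in parallel.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Document;
use Plucene::Document::Field;
use Plucene::Index::Writer;

my $analyser = Plucene::Analysis::SimpleAnalyzer->new;

# Hypothetical helper: index a list of texts into its own directory
# and return that directory's path.
sub build_sub_index {
	my @texts  = @_;
	my $dir    = tempdir(CLEANUP => 1);
	my $writer = Plucene::Index::Writer->new($dir, $analyser, 1);
	for my $text (@texts) {
		my $doc = Plucene::Document->new;
		$doc->add(Plucene::Document::Field->Text(content => $text));
		$writer->add_document($doc);
	}
	undef $writer;    # finish this sub-index before it is merged
	return $dir;
}

# In a real batch job each call could run in a separate process or machine.
my @dirs = (
	build_sub_index("first collection"),
	build_sub_index("second collection"),
);

# Merge the sub-indexes into one complete index.
my $main_dir = tempdir(CLEANUP => 1);
my $writer   = Plucene::Index::Writer->new($main_dir, $analyser, 1);
$writer->add_indexes(@dirs);
undef $writer;    # add_indexes optimizes, so the writer is dropped here
```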
