Plucene::SearchEngine::Query - A higher level abstraction for Plucene
    use Plucene::SearchEngine::Query;

    my $query = Plucene::SearchEngine::Query->new(
        dir => "/var/plucene/foo"
    );
    my @docs = $query->search("some stuff");
    for my $id (@docs) {
        my $snippeter = $query->snippeter( retrieve_text_for_doc($id) );
        print "<H1>Doc $id </H1>\n";
        print "<BLOCKQUOTE>" . $snippeter->as_html . "</BLOCKQUOTE>";
    }
Plucene is an extremely powerful library for building search engines, but each time I build a search engine with it, I always find myself doing the same things. This module provides an abstraction layer around Plucene - not quite as abstracted as Plucene::Simple, but more abstracted than Plucene itself.
    Plucene::SearchEngine::Query->new(
        dir         => "/var/plucene/foo",
        analyzer    => "Plucene::Analysis::SimpleAnalyzer",
        default     => "text",
        expand_docs => sub { shift; @_ },
        snippeter   => "Text::Context",
    )
This prepares for searching the index. The only mandatory argument is dir, which tells Plucene where the index is to be found. The expand_docs and snippeter arguments are explained below; analyzer specifies which Plucene analysis class to use when tokenising the search terms, and the default argument denotes the default field for unqualified query terms.
@docs = $query->search("foo bar");
Returns a set of documents matching the search query. The default way of "expanding" these search results is to sort them by score, and then return the value of the id field from the Plucene index.
Those more familiar with Plucene can have alternative data structures returned by providing a different expand_docs parameter to the constructor. For instance, the default doesn't actually return the score, so if you want to get at it, you can say:
expand_docs => sub { my ($self, @docs) = @_; return @docs }
This will return a list of array references; the first element in each array ref will be the Plucene::Document object, and the second will be the score.
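As a rough illustration of working with those pairs, here is a minimal sketch of a custom callback that keeps both the id and the score. The Mock::Field and Mock::Document packages are hypothetical stand-ins that only imitate the get("id")->string call, so the sketch runs without a real index. It also assumes the callback should do its own sorting by score.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Plucene::Document and its fields, so that
# this sketch runs without an index. Only get($name)->string is imitated.
{
    package Mock::Field;
    sub new    { my ($class, $s) = @_; return bless { s => $s }, $class }
    sub string { return $_[0]{s} }
}
{
    package Mock::Document;
    sub new { my ($class, %f) = @_; return bless {%f}, $class }
    sub get { my ($self, $name) = @_; return Mock::Field->new($self->{$name}) }
}

# An expand_docs-style callback: takes [document, score] pairs and
# returns hash references holding the id and the score, best score first.
my $expand_docs = sub {
    my ($self, @docs) = @_;
    return map  { +{ id => $_->[0]->get("id")->string, score => $_->[1] } }
           sort { $b->[1] <=> $a->[1] } @docs;
};

# Made-up sample data standing in for real search hits.
my @pairs = (
    [ Mock::Document->new(id => "doc-a"), 0.25 ],
    [ Mock::Document->new(id => "doc-b"), 0.75 ],
);

my @results = $expand_docs->(undef, @pairs);
printf "%s (%.2f)\n", $_->{id}, $_->{score} for @results;
```

With a real index, the same anonymous sub could be passed as the expand_docs argument to the constructor.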
Or, if you're dealing with Class::DBI-derived classes, you might like to try:
    expand_docs => sub {
        my ($self, @docs) = @_;
        sort { $b->date <=> $a->date }    # Sort by date descending
        map  { My::Class->retrieve($_->[0]->get("id")->string) }
        @docs;
    }
The choice is yours.
$self->snippeter($doc_text)
Given the searchable text of a document, returns a snippeter class (Text::Context, Text::Context::Porter, etc.) object primed with the positive parts of the query.
When you call the rendering method (say, as_html) on this object, you'll get the text snippet highlighting where the search terms appear in the document.
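To give a feel for what the returned object does, the following sketch uses Text::Context directly, outside of this module. The sample text and the keyword list are made up for illustration. In real use the keywords would come from the positive terms of the query rather than being written by hand.

```perl
use strict;
use warnings;
use Text::Context;

# Made-up document text standing in for retrieve_text_for_doc($id).
my $text = join " ",
    "Plucene is a Perl port of the Lucene search engine library.",
    "An index is built from documents, each of which has several fields.",
    "A query is then run against the index to find matching documents.";

# Hand-written stand-ins for the positive terms of a search query.
my @terms = ("index", "query");

# Build the snippet object and render it as HTML.
my $snippet = Text::Context->new($text, @terms);
my $html    = $snippet->as_html;

print "<BLOCKQUOTE>$html</BLOCKQUOTE>\n";
```

Text::Context also offers as_text for plain-text output, which may suit non-HTML front ends better.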
Simon Cozens, simon@cpan.org
Plucene::SearchEngine::Index, Plucene, Plucene::Simple.