NAME

KinoSearch::Searcher - Execute searches.

SYNOPSIS

    my $searcher = KinoSearch::Searcher->new(
        invindex => MySchema->read('/path/to/invindex'),
    );
    my $hits = $searcher->search( 
        query      => 'foo bar',
        offset     => 0,
        num_wanted => 100,
    );

DESCRIPTION

Use the Searcher class to perform search queries against an invindex.

Searcher's behavior is closely tied to that of KinoSearch::Index::IndexReader. If any of these criteria apply to your application, please consult IndexReader's documentation:

  • Persistent environment (e.g. mod_perl, FastCGI).

  • Index located on shared filesystem, such as NFS.

  • Incremental updates.

METHODS

new

    my $searcher = KinoSearch::Searcher->new(
        invindex => MySchema->read('/path/to/invindex'),
    );
    # or...
    my $searcher = KinoSearch::Searcher->new( reader => $reader );

Constructor. Takes labeled parameters. Either invindex or reader is required.

search

    my $hits = $searcher->search( 
        query      => $query,     # required
        offset     => 20,         # default: 0
        num_wanted => 10,         # default: 10
        filter     => $filter,    # default: undef (no filtering)
        sort_spec  => $sort_spec, # default: undef (sort by relevance)
    );

Process a search and return a Hits object. search() expects labeled hash-style parameters.

  • query - Can be either an object which subclasses KinoSearch::Search::Query or a query string. If it's a query string, it will be parsed using a QueryParser and a search will be performed against all indexed fields in the InvIndex. For more sophisticated searching, supply Query objects, such as TermQuery and BooleanQuery.

  • offset - The number of most-relevant hits to discard, typically used when "paging" through hits N at a time. Setting offset to 20 and num_wanted to 10 retrieves hits 21-30, assuming that 30 hits can be found.

  • num_wanted - The number of hits you would like to see after offset is taken into account.

  • filter - An object which isa KinoSearch::Search::Filter, such as a QueryFilter, RangeFilter, or PolyFilter. Search results will be limited to only those documents which pass through the filter.

  • sort_spec - Must be a KinoSearch::Search::SortSpec, which will affect how results are ranked and returned.
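
The relationship between offset and num_wanted described above can be sketched as a small paging helper. This is only an illustration: paging_params() is a hypothetical function, not part of KinoSearch, and it assumes 1-based page numbers.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of KinoSearch): translate a 1-based page
# number and a page size into the offset/num_wanted pair that search()
# accepts as labeled parameters.
sub paging_params {
    my ( $page, $per_page ) = @_;
    die "page must be >= 1\n"     if $page < 1;
    die "per_page must be >= 1\n" if $per_page < 1;
    return (
        offset     => ( $page - 1 ) * $per_page,    # hits to skip
        num_wanted => $per_page,                    # hits to return
    );
}

# Page 3 at 10 hits per page corresponds to hits 21-30.
my %params = paging_params( 3, 10 );
print "offset=$params{offset} num_wanted=$params{num_wanted}\n";
# prints "offset=20 num_wanted=10"
```

Because the helper returns a flat list of key/value pairs, it could be passed straight into the labeled parameter list, e.g. $searcher->search( query => $query, paging_params( 3, 10 ) ).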

get_reader

    my $reader = $searcher->get_reader;

Return the Searcher's inner IndexReader.

set_prune_factor

    $searcher->set_prune_factor(10);

Experimental, expert API.

set_prune_factor() enables a lossy, heuristic optimization which can yield significantly improved performance at the price of a small penalty in relevance. It is only useful when 1) you have a way of establishing an absolute rank for all documents -- e.g. page score, date of publication, price; and 2) that primary ranking heavily influences which documents you want returned. Schema->pre_sort is used to control this sort order.

prune_factor is a multiplier which affects how early the search of a given segment terminates. 10 is a decent default.

COPYRIGHT

Copyright 2005-2007 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

See KinoSearch version 0.20.