NAME

Lucy::Search::Searcher - Base class for searching collections of documents.

SYNOPSIS

    # Abstract base class.

DESCRIPTION

Abstract base class for objects that search collections of documents. Core subclasses include IndexSearcher and PolySearcher.

CONSTRUCTORS

new

    package MySearcher;
    use base qw( Lucy::Search::Searcher );
    sub new {
        my $self = shift->SUPER::new;
        ...
        return $self;
    }

Abstract constructor.

  • schema - A Schema.

ABSTRACT METHODS

doc_max

    my $int = $searcher->doc_max();

Return the maximum number of docs in the collection represented by the Searcher, which is also the highest possible internal doc id. Documents which have been marked as deleted but not yet purged are included in this count.

doc_freq

    my $int = $searcher->doc_freq(
        field => $field,  # required
        term  => $term,   # required
    );

Return the number of documents which contain the term in the given field.

  • field - Field name.

  • term - The term to look up.

collect

    $searcher->collect(
        query     => $query,      # required
        collector => $collector,  # required
    );

Iterate over hits, feeding them into a Collector.

  • query - A Query.

  • collector - A Collector.
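As a rough illustration of collect(), the sketch below builds a small in-memory index and feeds the hits for one query into a BitCollector through IndexSearcher, a concrete subclass. The field name `content`, the three sample documents, and the choice of a RAMFolder are assumptions made for the example; it requires an installed Lucy distribution.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::FullTextType;
use Lucy::Analysis::EasyAnalyzer;
use Lucy::Store::RAMFolder;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::TermQuery;
use Lucy::Search::Collector::BitCollector;
use Lucy::Object::BitVector;

# Assumed schema: a single full-text field named 'content'.
my $schema = Lucy::Plan::Schema->new;
my $type   = Lucy::Plan::FullTextType->new(
    analyzer => Lucy::Analysis::EasyAnalyzer->new( language => 'en' ),
);
$schema->spec_field( name => 'content', type => $type );

# Build a throwaway index in memory (sample documents are made up).
my $folder  = Lucy::Store::RAMFolder->new;
my $indexer = Lucy::Index::Indexer->new(
    index  => $folder,
    schema => $schema,
    create => 1,
);
$indexer->add_doc( { content => $_ } )
    for ( 'perl search', 'perl toolchain', 'other text' );
$indexer->commit;

# IndexSearcher is a concrete Searcher subclass.
my $searcher = Lucy::Search::IndexSearcher->new( index => $folder );

# One bit per possible doc id; doc_max is the highest possible id.
my $bit_vec = Lucy::Object::BitVector->new(
    capacity => $searcher->doc_max + 1,
);
my $collector = Lucy::Search::Collector::BitCollector->new(
    bit_vector => $bit_vec,
);

# Every matching doc id gets its bit set in $bit_vec.
$searcher->collect(
    query     => Lucy::Search::TermQuery->new( field => 'content', term => 'perl' ),
    collector => $collector,
);

my $match_count = $bit_vec->count;
print "matching docs: $match_count\n";
```

A BitCollector records only which documents matched, without scores, which suits tasks such as counting or filtering where ranking is not needed.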

fetch_doc

    my $hit_doc = $searcher->fetch_doc($doc_id);

Retrieve a document. Throws an error if the doc id is out of range.

  • doc_id - A document id.

METHODS

glean_query

    my $query = $searcher->glean_query($query);
    my $query = $searcher->glean_query();  # default: undef

If the supplied object is a Query, return it; if it’s a query string, create a QueryParser and parse it to produce a query against all indexed fields.

hits

    my $hits = $searcher->hits(
        query      => $query,       # required
        offset     => $offset,      # default: 0
        num_wanted => $num_wanted,  # default: 10
        sort_spec  => $sort_spec,   # default: undef
    );

Return a Hits object containing the top results.

  • query - Either a Query object or a query string.

  • offset - The number of most-relevant hits to discard, typically used when “paging” through hits N at a time. Setting offset to 20 and num_wanted to 10 retrieves hits 21-30, assuming that 30 hits can be found.

  • num_wanted - The number of hits you would like to see after offset is taken into account.

  • sort_spec - A SortSpec, which will affect how results are ranked and returned.
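The relationship between offset and num_wanted when paging can be sketched in plain Perl. The helper `page_params` is hypothetical (it is not part of Lucy) and assumes 1-based page numbers; it only computes the arguments that would be passed to hits().

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lucy: convert a 1-based page number
# into the offset/num_wanted pair that hits() accepts.
sub page_params {
    my ( $page, $per_page ) = @_;
    die "page must be >= 1\n" if $page < 1;
    return (
        offset     => ( $page - 1 ) * $per_page,
        num_wanted => $per_page,
    );
}

# Page 3 at 10 hits per page skips the 20 most relevant hits.
my %params = page_params( 3, 10 );
printf "offset=%d num_wanted=%d\n", $params{offset}, $params{num_wanted};

# The hits covered run from offset + 1 through offset + num_wanted.
printf "hits %d-%d\n",
    $params{offset} + 1,
    $params{offset} + $params{num_wanted};

# With a real searcher (assumed to exist as $searcher), the call would be:
#   my $hits = $searcher->hits( query => 'foo', %params );
```

For page 3 with 10 hits per page this yields offset 20 and num_wanted 10, matching the "hits 21-30" case described above.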

get_schema

    my $schema = $searcher->get_schema();

Accessor for the object’s schema member.

INHERITANCE

Lucy::Search::Searcher isa Clownfish::Obj.