NAME

Search::Glimpse::Index - Interface to glimpseindex

SYNOPSIS

  use Search::Glimpse::Index;

  my %opt = (
      timeindex   => 1,
      dryrun      => 0,
      indexall    => 0,
      indexnum    => 0,
      incremental => 0,
      structural  => 0,
      destdir     => "$ENV{HOME}/myindexes",
      stopword    => 90,     # must appear in 90% of files
  );
  my $indexer = Search::Glimpse::Index->new( %opt );

  $indexer->index("/path/to/folder/to/index");

DESCRIPTION

This module is a Perl interface to the glimpseindex binary. It (hopefully) makes it easier to use the application from within Perl scripts or modules.

Available Methods

new

The constructor receives a hash with the indexing options to use. Note that all these values have sensible defaults (mostly the glimpseindex defaults). Although I briefly describe what each option represents, I suggest reading the complete glimpseindex manpage.

Known options are:

destdir (glimpseindex -H option)

This is the folder where glimpseindex will store its index files. This is also the path where you should put your exclude/include files. Future versions of this module might include an interface for those files.
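Since the module does not yet manage these files, here is a minimal sketch of writing an exclude file by hand. The helper name write_exclude_file is hypothetical and not part of Search::Glimpse::Index; the patterns you pass are up to you, and the exact matching rules are described in the glimpseindex manpage.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Search::Glimpse::Index): writes one
# pattern per line to the .glimpse_exclude file inside the index folder.
sub write_exclude_file {
    my ($destdir, @patterns) = @_;

    my $file = File::Spec->catfile($destdir, '.glimpse_exclude');
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "$_\n" for @patterns;
    close $fh or die "Cannot close $file: $!";

    return $file;    # path of the file that was written
}
```

The folder passed as the first argument should be the same one given as destdir to the constructor, so that glimpseindex finds the file next to its index.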

dryrun (glimpseindex -I option)

This boolean option sets whether glimpseindex should really index the files or just print the list of files that would be indexed in a real run.

bigindex (glimpseindex -b option)

glimpseindex has three different index sizes. By default the medium index is used (glimpseindex -o). Use this option for bigger indexes and (hopefully) faster results.

smallindex

glimpseindex has three different index sizes. By default the medium index is used (glimpseindex -o). Use this option for smaller indexes (not using any glimpseindex switch).

indexnum (glimpseindex -n option)

By default, tokens with digits are not indexed. Therefore, things like abc123 or a date will not be indexed. Use this option to force tokens with digits to be indexed.

indexall (glimpseindex -E option)

Makes glimpseindex index all files, regardless of their file type. Note that glimpseindex will still honor .glimpse_exclude files.

timeindex (glimpseindex -t option)

This option is only available for glimpse version 3.5 or newer. It changes the order in which files are indexed. By default, files are indexed in a mostly arbitrary order. With this option (which doesn't work in smallindex mode), the index stores files in reverse order of modification time (most recent files first). Query results are therefore returned in that order, and glimpse is able to filter results by age.
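The restriction on smallindex mode can be checked before the options reach the constructor. This is only a sketch: check_timeindex is a hypothetical helper, and I make no claim that the module performs (or skips) such a check itself.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check (not part of the module): returns a
# message when timeindex is combined with smallindex, undef otherwise.
sub check_timeindex {
    my (%opt) = @_;

    if ($opt{timeindex} && $opt{smallindex}) {
        return "timeindex does not work in smallindex mode";
    }
    return;
}
```

A caller could warn or die with the returned message, depending on how strict it wants to be.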

incremental (glimpseindex -f option)

Useful if you have run glimpseindex earlier and need to reindex. This option performs an incremental indexing. If there is no current index, or if this procedure fails, glimpseindex automatically reverts to the default mode (which is to index everything from scratch).

structural (glimpseindex -s option)

Use this option if you want to support structured queries.

swsize (glimpseindex -S option)

This option controls how many stop words are considered. For further details on how the values of this option behave, please check the glimpseindex manpage.
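To summarize the options above, the following sketch shows how they might translate into a glimpseindex command line. It is not the module's actual code: build_command and @BOOLEAN_FLAGS are hypothetical names, and the switches are simply the ones listed next to each option in this document.

```perl
use strict;
use warnings;

# Boolean options and the glimpseindex switch documented for each one.
my @BOOLEAN_FLAGS = (
    [ dryrun      => '-I' ],
    [ indexnum    => '-n' ],
    [ indexall    => '-E' ],
    [ timeindex   => '-t' ],
    [ incremental => '-f' ],
    [ structural  => '-s' ],
);

# Hypothetical helper (an assumption, not the module's implementation):
# turns a path and an option hash into an argument list for system().
sub build_command {
    my ($path, %opt) = @_;
    my @cmd = ('glimpseindex');

    # Index size: -b for bigindex, no switch for smallindex, -o otherwise.
    if ($opt{bigindex}) {
        push @cmd, '-b';
    }
    elsif ($opt{smallindex}) {
        # smallindex uses no glimpseindex switch at all
    }
    else {
        push @cmd, '-o';
    }

    # Add the switch of every boolean option that is set.
    for my $pair (@BOOLEAN_FLAGS) {
        my ($name, $flag) = @$pair;
        push @cmd, $flag if $opt{$name};
    }

    # Options that carry a value.
    push @cmd, '-S', $opt{swsize}  if defined $opt{swsize};
    push @cmd, '-H', $opt{destdir} if defined $opt{destdir};

    return (@cmd, $path);
}

my @cmd = build_command('/path/to/folder/to/index',
    timeindex => 1, destdir => '/tmp/myindexes');
print "@cmd\n";
# prints: glimpseindex -o -t -H /tmp/myindexes /path/to/folder/to/index
```

Returning a list rather than a single string means the result can be passed to system() in list form, which avoids shell quoting problems with paths that contain spaces.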

index

Receives the path of the folder to be indexed and indexes it using the options given to the constructor.

SEE ALSO

perl(1)

AUTHOR

Alberto Manuel Brandão Simões, <ambs@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2011 by Alberto Manuel Brandão Simões