
cpanest - generate a Hyper Estraier index for CPAN

cpanest [-clean] [-noclean] [-cpan url or directory] [-node node_uri] [-force] [-noforce] [-keep directory] [-match regexp] [-test level] [-trust_mtime] [-notrust_mtime]

This is a port of cpanwait, from the WAIT Perl search engine, to the node API of Hyper Estraier.
All the hard work was done by Ulrich Pfeifer, who wrote all the parsers and formatters. I only added support for the Hyper Estraier back-end afterwards.
This documentation is somewhat incomplete and out of sync with the code.

-clean / -noclean
    Clean the table before indexing. Default is off.
-cpan url or directory
    Directory or URL to index. If a URL is given, there currently must be a file indices/find-ls.gz relative to it, containing the output of find . -ls | gzip. Default is ftp://ftp.rz.ruhr-uni-bochum.de/pub/CPAN.
-node node_uri
    Specify the node URI.
-force / -noforce
    Force reindexing, even if cpan thinks the files are up to date. Default is off.
-keep directory
    If fetching from a remote server, keep the files in directory. Default is /app/unido-i06/src/share/lang/perl/96a/CPAN/sources.
-match regexp
    Limit indexing to paths matching regexp. Default is authors/id/.
-test level
    Set the test level, where 0 means normal operation, 1 means don't really index, and 2 means don't even get the archives and examine them.
-trust_mtime / -notrust_mtime
    If on, the files' mtimes are used to decide which version of an archive is the newest. If off, the version extracted is used (beware, there are far more version numbering schemes than cpan can parse).
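
The effect of -match and -trust_mtime can be pictured with a small sketch. It is not taken from cpanest itself: the sample paths, the %mtime data and the helper newest_by_mtime are hypothetical, and the real script reads its file list and parses versions in its own way.

```perl
use strict;
use warnings;

# Hypothetical sample data: archive path => modification time (epoch seconds).
my %mtime = (
    'authors/id/EXAMPLE/Foo-0.01.tar.gz' => 850_000_000,
    'authors/id/EXAMPLE/Foo-0.02.tar.gz' => 900_000_000,
    'modules/by-module/Foo/README'       => 910_000_000,
);

# Same pattern as the documented default of -match.
my $match = qr{authors/id/};

# Keep only the paths that match the regexp.
my @candidates = grep { $_ =~ $match } sort keys %mtime;

# With mtimes trusted, the archive with the largest mtime counts as newest.
sub newest_by_mtime {
    my ($times, @paths) = @_;
    my ($newest) = sort { $times->{$b} <=> $times->{$a} } @paths;
    return $newest;
}

my $newest = newest_by_mtime(\%mtime, @candidates);
print "candidates: @candidates\n";
print "newest:     $newest\n";
```

Comparing mtimes sidesteps version parsing entirely, which is why it can be the safer choice when archives use unusual version numbering.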

Ulrich Pfeifer <pfeifer@ls6.informatik.uni-dortmund.de>
Dobrica Pavlinusic <dpavlin@rot13.org>

Copyright (c) 1996-1997, Ulrich Pfeifer
Copyright (c) 2005, Dobrica Pavlinusic

HyperEstraier::WAIT::Table

This is a module that emulates some of the WAIT::Table functionality.
There are some limitations: only one key attribute is supported, and it is used as the @uri attribute, so use the first parameter of keyset as the key.
The full-text index is specified as invindex, but only the field names are needed.
You will probably need to add
    use WAIT::Parse::Base;
to your code after you remove WAIT::Config and WAIT::Database.

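    # Connect to a Hyper Estraier node and describe the table layout.
    # attr and invindex are passed as array references here so that the
    # field lists do not flatten into the argument list.
    my $tb = HyperEstraier::WAIT::Table->new(
        uri      => 'http://localhost:1978/node/cpan',
        attr     => [ qw/docid headline source size parent/ ],
        key      => 'docid',
        invindex => [ qw/name synopsis bugs description text environment example author/ ],
    );

    # Check whether a record with this key already exists.
    if ( $tb->have(docid => $something) ) ...

    # Insert a record and keep the returned key.
    my $key = $tb->insert(
        docid    => $base,
        headline => 'Something',
        ...
    );

    # Delete by key, or by attribute values.
    $tb->delete_by_key($key);
    $tb->delete( docid => $did, ... );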
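
A note on the multi-valued arguments (attr, invindex) in the constructor call above: in Perl, a bare qw// list flattens into the surrounding key => value list, so such arguments are usually passed as array references. The following sketch shows the difference in plain Perl. It does not use HyperEstraier::WAIT::Table, and how that module actually handles its arguments is an assumption not verified here.

```perl
use strict;
use warnings;

# A bare qw// flattens into the surrounding list ...
my @flat = ( attr => qw/docid headline source/, key => 'docid' );

# ... while an array reference keeps the field names together as one value.
my %args = ( attr => [ qw/docid headline source/ ], key => 'docid' );

print scalar(@flat), " elements in the flattened list\n";
print scalar( @{ $args{attr} } ), " fields stored under 'attr'\n";
print "key is $args{key}\n";
```

In the flattened form, 'docid' would end up as the value of attr and the remaining field names would be read as further key/value pairs.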