NAME

makedb - generate, update or remove wais databases

SYNOPSIS

makedb [[-clean] -tidy] [-update] [-config config_file] [-test] [-debug] [-verbose] [-copy tmpdir] ([-all] | database ...)

DESCRIPTION

makedb creates, updates or removes the databases specified in a makedb config file (./makedb.conf unless overridden by the -config option).

OPTIONS

Note that all options may be abbreviated to a uniquely identifying prefix.
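
As an illustration of what "uniquely identifying prefix" means, here is a minimal Python sketch of prefix resolution over the option names listed in the SYNOPSIS. The function name resolve_option is hypothetical, and this is not makedb's actual option parser, only a model of the rule:

```python
# Option names taken from the SYNOPSIS above.
OPTIONS = ["clean", "tidy", "config", "update", "all",
           "test", "debug", "verbose", "copy"]

def resolve_option(arg, options=OPTIONS):
    """Return the full option name for an argument such as '-up'."""
    name = arg.lstrip("-")
    if name in options:          # an exact match always wins
        return name
    matches = [opt for opt in options if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {arg}")
    raise ValueError(f"ambiguous option {arg}: could be {', '.join(matches)}")
```

Under this rule -u resolves to -update, while -c is ambiguous between -clean, -config and -copy, and -t is ambiguous between -tidy and -test.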

-clean -tidy

Delete databases. This option can be combined with the -update option. Deletion is done before the update regardless of the order of the options on the command line :-). Files with the extensions src, fmt, fde, syn, stop, and cat are not removed unless -tidy is given too.
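
The effect of -tidy on the protected extensions can be sketched as follows. This is a rough Python model, assuming that a database's files are named database.ext inside dbdir; the helper name files_to_delete is hypothetical and the code only lists candidates, it deletes nothing:

```python
from pathlib import Path

# Extensions that -clean alone leaves in place (from the text above).
KEPT_EXTENSIONS = {"src", "fmt", "fde", "syn", "stop", "cat"}

def files_to_delete(dbdir, database, tidy=False):
    """List the files of `database` in `dbdir` that a clean run would remove."""
    doomed = []
    # Assumption: every file of the database is named "<database>.<ext>".
    for path in sorted(Path(dbdir).glob(database + ".*")):
        ext = path.name[len(database) + 1:]
        if tidy or ext not in KEPT_EXTENSIONS:
            doomed.append(path)
    return doomed
```

With tidy=False the src, fmt, fde, syn, stop and cat files survive; with tidy=True every file of the database is listed.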

-config config_file

Read an alternate config file. Default is ./makedb.conf.

-update

Update the databases.

-all

Clean/update all databases specified in the config file. If not given, only the databases named on the command line are cleaned/updated.

-test

Do nothing. Just print actions.

-copy tmpdir

Do the actual indexing in tmpdir. Copy the database to tmpdir, run the index commands and copy the result back.

-debug

Not implemented yet.

-verbose

Additional messages to stderr.

Config File

The config file should be made up of lines assigning values to variables as in:

    waisindex = /usr/local/ls6/wais/bin/waisindex

Each assignment must start in column 1. Shell comments are allowed. Some of the variables have predefined meaning. There are global and local variables. Local variables are instantiated for each database. Each database = assignment introduces a new local block. Use the -verbose option if you are unsure about the scoping. Assignments may have the form variable += value in which case the value is appended to variable.

The following variables are global. The last occurrence in the file counts.

waisindex

Path to the waisindex program. See example above.

wais_opt

Options for all waisindex runs. For example:

    wais_opt  = -nocat

fmtdir

Directory in which to look for database.fmt if it does not exist in dbdir. database.src, database.fde, database.syn, database.stop and database.cat are also copied from there unless they exist in dbdir.

The following variables are local to a database block. The last occurrence up to the end of the block counts. For limit, dbdir and options there can be global defaults (given before the current block). When leaving a block these values are restored.

database

The name of the database.

files

A list of shell fileglob expressions as in:

    files  = /usr/local/doc/*.html
    files += /usr/local/doc/*.doc

You may also use backticks (`) but no double quotes ("):

    files = `find $dbdir -name make\* -print`

options

Additional waisindex options. For example:

    options = -t fields

dbdir

The directory in which the wais database lives.

limit

The number of dead files tolerated in the index. A dead file is a file that was in the index, then changed and was re-indexed. Since the index does not support deletions, the old entry is only removed from the filename table. All of its postings remain in the index, occupying disk space and slowing down searches. The global occurrence counters for the terms in the file also end up too high, which distorts the final weights of hits. When more than limit files have been killed this way, makedb regenerates the whole index. This takes more time than a simple update, but the index becomes smaller and searches faster, so choose limit according to the tradeoff you want. limit defaults to 100.
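
The threshold rule can be stated in a few lines of Python. The function name plan_index_run and the way the dead-file count is obtained are assumptions for illustration; only the comparison itself ("more than limit") comes from the description above:

```python
def plan_index_run(dead_files, limit=100):
    """Return 'rebuild' once more than `limit` dead files exist, else 'update'."""
    # "More than limit" is a strict comparison: exactly `limit` dead files
    # are still tolerated.
    return "rebuild" if dead_files > limit else "update"
```

With limit = 0, as in the bibdb-html block of the EXAMPLE, a single dead file already triggers a full rebuild.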

All other variables do not have any meaning to makedb unless you use them in the value part of an assignment as in:

        docdir    = /home/robots/wais/wais-docs
        database  = test
        files     = $docdir/TEST
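
The substitution of user-defined variables in a value can be sketched like this. It is a minimal Python model, assuming that variable names consist of word characters and that unknown names are left untouched; the name expand is hypothetical and makedb's real behaviour for undefined variables is not documented here:

```python
import re

def expand(value, variables):
    """Replace each $name in `value` with variables[name]."""
    # Assumption: references to unknown variables are left as they are.
    return re.sub(r"\$(\w+)",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  value)
```

For the assignments above, expand("$docdir/TEST", {"docdir": "/home/robots/wais/wais-docs"}) yields /home/robots/wais/wais-docs/TEST.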

EXAMPLE

        # makedb.conf -- makedb configuration file

        # Global options
        dbdir     = /home/robots/wais/wais-sources
        waisindex = /usr/local/ls6/wais/bin/waisindex
        wais_opt  = -nocat                 # don't create catalog files
        limit     = 10                     # 10 dead files maximum

        # User defined variables
        docdir    = /home/robots/wais/wais-docs

        # the databases
        database  = bibdb-html
        files     = $docdir/bibdb.html     # use of variables in the value
        limit     = 0                      # no dead files
        options   = -T HTML -t  fields

        database  = journals
        files     = $docdir/journals/*
        limit     = 3
        options   = -t  fields

        database  = www-pages
        wwwroot   = /home/robots/www/pages # new global variable
        files     = `find $wwwroot -name \*.html -print`
        options   = -t URL $wwwroot http:

        database  = test
        dbdir     = /home/crew/pfeifer/tmp/wittenberg
        files     = $dbdir/ma*
        files    += $dbdir/te*             # append
        options   = -t text

AUTHOR

Ulrich Pfeifer <pfeifer@ls6.informatik.uni-dortmund.de>