NAME

Net::Dict - client API for accessing dictionary servers (RFC 2229)

SYNOPSIS

    use Net::Dict;
    
    $dict = Net::Dict->new('dict.server.host');
    $h = $dict->define("word");
    foreach $i (@{$h}) {
        ($db, $def) = @{$i};
        . . .
    }

DESCRIPTION

Net::Dict is a Perl class for looking up words and their definitions on network dictionary servers. Net::Dict provides a simple DICT client API for the network protocol described in RFC 2229. Quoting from that RFC:

  • The Dictionary Server Protocol (DICT) is a TCP transaction based query/response protocol that allows a client to access dictionary definitions from a set of natural language dictionary databases.

An instance of Net::Dict represents a connection to a single DICT server. For example, to connect to the dictionary server at dict.org, you would write:

    $dict = Net::Dict->new('dict.org');

A DICT server can provide any number of dictionaries, which are referred to as databases. Each database has a name and a title. The name is a short identifier, typically just one word, used to refer to that database. The title is a brief one-line description of the database. For example, at the time of writing, the dict.org server has 11 databases, including a version of Webster's dictionary from 1913. The name of the database is web1913, and the title is Webster's Revised Unabridged Dictionary (1913).

To look up definitions for a word, you use the define method:

    $dref = $dict->define('banana');

This returns a reference to a list; each entry in the list is a reference to a two-item list:

    [ $dbname, $definition ]

The first entry is a database name as introduced above. The second entry is the text of a definition from the specified dictionary.
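As a minimal sketch of working with that structure, the following groups definitions by database name. The helper name group_by_db is hypothetical (it is not part of Net::Dict), and the placeholder data stands in for the result of a live define() call so that the sketch runs without a network connection:

```perl
use strict;
use warnings;

# Group definitions by database name.
# $dref is assumed to have the shape returned by define():
# a reference to a list of [ $dbname, $definition ] pairs.
# group_by_db is a hypothetical helper, not a Net::Dict method.
sub group_by_db {
    my ($dref) = @_;
    my %by_db;
    foreach my $entry (@{ $dref || [] }) {
        my ($dbname, $definition) = @{$entry};
        push @{ $by_db{$dbname} }, $definition;
    }
    return \%by_db;
}

# Placeholder data standing in for: my $dref = $dict->define('banana');
my $dref = [
    [ 'wn',      'placeholder definition text 1' ],
    [ 'web1913', 'placeholder definition text 2' ],
    [ 'wn',      'placeholder definition text 3' ],
];

my $by_db = group_by_db($dref);
foreach my $dbname (sort keys %{$by_db}) {
    printf "%s: %d definition(s)\n", $dbname, scalar @{ $by_db->{$dbname} };
}
```

Passing undef (for example, after a failed lookup) yields an empty hash rather than an error.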

MATCHING WORDS

In addition to looking up word definitions, you can look up a list of words which match a given pattern, using the match() method. Each DICT server typically supports a number of strategies which can be used to match words against a pattern. For example, using the prefix strategy with a pattern "anti" would find all words in databases which start with "anti":

    $mref = $dict->match('anti', 'prefix');
    foreach my $match (@{ $mref })
    {
        ($db, $word) = @{ $match };
    }

Similarly, the suffix strategy is used to search for words which end in a given pattern. The strategies() method returns the list of strategies supported by the server; see "METHODS" for more details.
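The two methods can be combined so that a strategy is checked before it is used. The following is a sketch only: matching_words is a hypothetical helper, it assumes that the result of strategies() can be assigned to a hash of IDs and descriptions, and the live part runs only when a host name is given on the command line:

```perl
use strict;
use warnings;

# Return the sorted, de-duplicated words matching $pattern under $strategy.
# $dict is assumed to be a connected Net::Dict object (or anything that
# provides the same strategies() and match() methods).
# matching_words is a hypothetical helper, not a Net::Dict method.
sub matching_words {
    my ($dict, $pattern, $strategy) = @_;

    # strategies() yields strategy IDs paired with descriptions.
    my %strats = $dict->strategies();
    die "strategy '$strategy' is not offered by this server\n"
        unless exists $strats{$strategy};

    my $mref = $dict->match($pattern, $strategy) || [];
    my %seen;
    foreach my $match (@{$mref}) {
        my ($db, $word) = @{$match};
        $seen{$word} = 1;    # the same word may appear in several databases
    }
    return sort keys %seen;
}

# Live demo, run only when a host name is given on the command line.
if (my $host = shift @ARGV) {
    require Net::Dict;
    my $dict = Net::Dict->new($host)
        or die "could not connect to $host\n";
    print "$_\n" for matching_words($dict, 'anti', 'prefix');
}
```

The helper dies on an unsupported strategy instead of sending the request, which keeps typos in strategy names from looking like empty results.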

SELECTING DATABASES

By default Net::Dict will look in all databases on the DICT server. This is specified with a special database name of *. You can specify the database(s) to search explicitly, as additional arguments to the define and match methods:

    $dref = $dict->define('banana', 'wn', 'web1913');

Rather than specify the databases to use every time, you can change the default from '*' using the setDicts method:

    $dict->setDicts('wn', 'web1913');

Any subsequent calls to define or match will refer to these databases, unless overridden with additional arguments to the method. You can find out what databases are available on a server using the dbs method:

    %dbhash = $dict->dbs();

Each entry in the returned hash has the name of a database as the key, and the corresponding title as the value.

There is another special database name - ! - which says that all databases should be searched, but as soon as a definition is found, no further databases should be searched.
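For instance, ! can be passed to define like any other database name to fetch just the first definition available. This is a sketch under that assumption; first_definition is a hypothetical helper, and the live part runs only when a host name is given on the command line:

```perl
use strict;
use warnings;

# Return the first [ $dbname, $definition ] pair found for $word,
# or undef if there is none. The '!' database name asks the server to
# stop at the first database that has a definition.
# $dict is assumed to be a connected Net::Dict object;
# first_definition is a hypothetical helper, not a Net::Dict method.
sub first_definition {
    my ($dict, $word) = @_;
    my $dref = $dict->define($word, '!');
    return unless $dref && @{$dref};
    return $dref->[0];
}

# Live demo, run only when a host name is given on the command line.
if (my $host = shift @ARGV) {
    require Net::Dict;
    my $dict = Net::Dict->new($host)
        or die "could not connect to $host\n";
    if (my $entry = first_definition($dict, 'banana')) {
        my ($dbname, $definition) = @{$entry};
        print "From $dbname:\n$definition\n";
    }
    else {
        print "No definition found\n";
    }
}
```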

CONSTRUCTOR

    $dict = Net::Dict->new (HOST [,OPTIONS]);

This is the constructor for a new Net::Dict object. HOST is the name of the remote host on which a DICT server is running. This is required, and must be an explicit host name.

The constructor makes a connection to the remote DICT server, and sends the CLIENT command, to identify the client to the server.

Note: previous versions let you give an empty string for the hostname, resulting in selection of default hosts. This behaviour is no longer supported.

OPTIONS are passed in a hash-like fashion, using key and value pairs. Possible options are:

Port

The port number to connect to on the remote machine for the DICT connection (the default port number is 2628, according to RFC 2229).

Client

The string to send as the CLIENT identifier. If not set, then a default identifier for Net::Dict is sent.

Timeout

Sets the timeout for the connection, in seconds. Defaults to 120.

Debug

The debug level: a non-zero value will result in debugging information being generated, particularly when errors occur. Can be changed later using the debug method, which is inherited from Net::Cmd. More on the debug method can be found in Net::Cmd.

Making everything explicit, here's how you might call the constructor in your client:

    $dict = Net::Dict->new($HOST,
                           Port    => 2628,
                           Client  => "myclient v$VERSION",
                           Timeout => 120,
                           Debug   => 0);

This will return undef if we failed to make the connection. It will die if bad arguments are passed: no hostname, unknown argument, etc.
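Since the constructor can fail in two different ways, a caller may want to handle both. The following sketch wraps the call in eval and checks for undef; connect_dict is a hypothetical helper, and the class name is a parameter only so that the helper can be exercised without a network connection:

```perl
use strict;
use warnings;

# Try to connect, separating the two failure modes described above.
# Returns ($dict, undef) on success or (undef, $error_message) on failure.
# $class is normally 'Net::Dict'.
# connect_dict is a hypothetical helper, not part of Net::Dict.
sub connect_dict {
    my ($class, $host, %options) = @_;
    my $dict;
    # new() dies on bad arguments, so trap that with eval.
    my $ok = eval { $dict = $class->new($host, %options); 1 };
    return (undef, "bad arguments: $@")          unless $ok;
    # new() returns undef when the connection could not be made.
    return (undef, "could not connect to $host") unless defined $dict;
    return ($dict, undef);
}

# Live demo, run only when a host name is given on the command line.
if (my $host = shift @ARGV) {
    require Net::Dict;
    my ($dict, $error) = connect_dict('Net::Dict', $host, Timeout => 30);
    die "$error\n" if defined $error;
    print "Connected to $host\n";
}
```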

METHODS

Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.

define ( $word [, @dbs] )

Returns a reference to an array, each entry of which is a reference to a two-element array: the database name and the definition. If no databases are specified, those set by setDicts() are used.

match ( $pattern, $strategy [, @dbs] )

Looks for words which match $pattern according to the specified matching $strategy. Returns a reference to an array, each entry of which is a reference to a two-element array: database name, matching word.

dbs

Returns a hash with information on the databases available on the DICT server. The keys are the short names, or identifiers, of the databases; the value is the title of the database:

    %dbhash = $dict->dbs();
    print "Available dictionaries:\n";
    while (($db, $title) = each %dbhash)
    {
        print "$db : $title\n";
    }

This is the SHOW DATABASES command from RFC 2229.

dbInfo ( $dbname )

Returns a string containing a description of the dictionary $dbname.

setDicts ( @dicts )

Specify the dictionaries that will be searched during subsequent define() or match() calls. Defaults to '*'. No existence checks are performed by this interface, so you should make sure the dictionaries you specify are on the server (e.g. by calling dbs()).
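One way to do that check is sketched below. set_known_dicts is a hypothetical helper: it keeps only the requested names that appear in the hash returned by dbs(), and leaves the default unchanged if none of them do. The live part runs only when a host name is given on the command line:

```perl
use strict;
use warnings;

# Restrict the default search to those of @wanted that the server offers.
# Returns the list of names actually selected; the default is left
# unchanged if none of them exist.
# $dict is assumed to be a connected Net::Dict object;
# set_known_dicts is a hypothetical helper, not a Net::Dict method.
sub set_known_dicts {
    my ($dict, @wanted) = @_;
    my %dbhash = $dict->dbs();
    my @known  = grep { exists $dbhash{$_} } @wanted;
    $dict->setDicts(@known) if @known;
    return @known;
}

# Live demo, run only when a host name is given on the command line.
if (my $host = shift @ARGV) {
    require Net::Dict;
    my $dict = Net::Dict->new($host)
        or die "could not connect to $host\n";
    my @selected = set_known_dicts($dict, 'wn', 'web1913');
    print "Searching: @selected\n";
}
```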

strategies

Returns a hash, with the ID of a matching strategy as the key and a verbose description as the value.

This method was previously called strats(); that name for the method is also currently supported, for backwards compatibility.

auth ( $USER, $PASSPHRASE )

Attempt to authenticate the specified user, using the scheme described on page 18 of RFC 2229. The user should be known to the server, and $PASSPHRASE is a shared secret known only to the server and the user.

For example, if you were using dictd from dict.org, your configuration file might include the following:

    database private {
        data "/usr/local/dictd/db/private.dict.dz"
        index "/usr/local/dictd/db/private.index"
        access { user connor }
    }

    user connor "there can be only one"

To be able to access this database, you'd write something like the following:

    $dict = Net::Dict->new('dict.foobar.com');
    $dict->auth('connor', 'there can be only one');

A subsequent call to the dbs method would reveal the private database now accessible. Not all servers support the AUTH extension; you can check this with the has_capability() method, described below.

serverInfo

Returns a string, containing the information about the server, provided by the server:

    print "Server Info:\n";
    print $dict->serverInfo(), "\n";

This is the SHOW SERVER command from RFC 2229.

dbTitle ( $DBNAME )

Returns the title string for the specified database. This is the same string returned by the dbs() method for all databases.

capabilities

Returns a list of the capabilities supported by the DICT server, as described on pages 7 and 8 of RFC 2229.

has_capability ( $cap_name )

Returns true (non-zero) if the DICT server supports the specified capability; false (zero) otherwise. For example:

    if ($dict->has_capability('auth')) {
        $dict->auth('genie', 'open sesame');
    }

status

Send the STATUS command to the DICT server, which will return some server-specific timing or debugging information. This may be useful when debugging or tuning a DICT server, but probably won't be of interest to most users.

KNOWN BUGS AND LIMITATIONS

  • Need to add methods for getting lists of databases and strategies in the order they're returned by the remote server. Suggested by Aleksey Cheusov.

  • The following DICT commands are not currently supported:

        OPTION MIME

  • No support for firewalls at the moment.

  • Site-wide configuration isn't supported. Previous documentation suggested that it was.

  • There is currently no way to specify that results of define and match should be in HTML. This was also previously a config option for the constructor, but it didn't do anything.

EXAMPLES

The distribution includes two example DICT clients: dict is a basic command-line client, and tkdict is a GUI-based client, created using Perl/Tk.

The examples directory of the Net-Dict distribution includes two basic examples. simple.pl illustrates basic use of the module, and portuguese.pl demos use of an English to Portuguese dictionary. Thanks to Jose Joao Dias de Almeida for the examples.

SEE ALSO

RFC 2229

The internet document which defines the DICT protocol.

http://www.cis.ohio-state.edu/htbin/rfc/rfc2229.html

Net::Cmd

A module which provides methods for a network command class, such as Net::FTP, Net::SMTP, as well as Net::Dict. Part of the libnet distribution, available from CPAN.

Digest::MD5

You'll need this module if you want to use the auth method.

dictd

The reference DICT server, available from dict.org. Also includes a sample dict client, in C.

http://www.dict.org/

The home page for the DICT effort; has links to other resources, including other libraries and clients.

AUTHOR

The first version of Net::Dict was written by Dmitry Rubinstein <dimrub@wisdom.weizmann.ac.il>, using Net::FTP and Net::SMTP as a pattern and a model for imitation.

The module was extended, and is now maintained, by Neil Bowers <neil@bowers.com>.

COPYRIGHT

Copyright (C) 2002-2003 Neil Bowers. All rights reserved.

Copyright (C) 2001 Canon Research Centre Europe, Ltd.

Copyright (c) 1998 Dmitry Rubinstein. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.