
NAME

Lingua::NATools::NATDict - Perl extension to encapsulate a NATools Dictionary

SYNOPSIS

  use Lingua::NATools::NATDict;

  my $dictionary = Lingua::NATools::NATDict->open("dict.ntd");

  my ($src_lng, $tgt_lng) = $dictionary->languages;

  my $word = $dictionary->word_from_id($src_lng, 2);

  my $wid = $dictionary->id_from_word($src_lng, $word);

  my $count = $dictionary->word_count_by_id($src_lng, $wid);

  my $data = $dictionary->get_vals_by_id($src_lng, $wid);

  $dictionary->close;

DESCRIPTION

This module encapsulates a NATools Dictionary.

open

The open method is the basic Lingua::NATools::NATDict constructor. Call it with the name of the dictionary file to open. It returns the NATools Dictionary object.

close

Closes the NATools Dictionary. The current version of the C/Perl interface can only keep a limited number of NATools Dictionaries open at the same time, so it is good practice to close a dictionary as soon as it is no longer needed.
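One way to make sure a dictionary is always closed is to wrap the open/close pair around a callback. The following is only a sketch: with_dictionary is a hypothetical helper, not part of the module, and the check for a false return value from open is a defensive assumption, since the behaviour of open on failure is not documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: open a dictionary, run $code with it, and close it
# afterwards, even if $code dies. $class is expected to be
# 'Lingua::NATools::NATDict' (load the module first); it is a parameter only
# so the sketch does not hard-code the dependency.
sub with_dictionary {
    my ($class, $file, $code) = @_;

    my $dict = $class->open($file);
    # Assumption: a false value means the dictionary could not be opened.
    die "Could not open $file\n" unless $dict;

    my @result;
    my $ok    = eval { @result = $code->($dict); 1 };
    my $error = $@;

    $dict->close;               # release the slot whatever happened
    die $error unless $ok;      # re-throw the callback's exception
    return @result;
}

# Intended use:
#   my ($src, $tgt) = with_dictionary('Lingua::NATools::NATDict', 'dict.ntd',
#                                     sub { $_[0]->languages });
```

Keeping the close call in a single place makes it harder to leak open dictionaries when several files are processed in a loop.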

languages

Returns a pair (a list with two values) with the names of the languages in the corpus. Use these strings in calls to Lingua::NATools::NATDict methods that require a language identifier.

word_from_id

This method is used to retrieve the word identified by some integer. The method is called with the language being queried and the integer identifier. It returns the word string.

id_from_word

This method is used to retrieve a word identifier. The method is called with the language being queried and the word to search for. It returns the word's integer identifier.

word_count_by_id

This method retrieves the occurrence count for a word in the specified language. Note that the method expects a word identifier, not the word itself.
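Because the count is looked up by identifier, getting the count for a word string takes two calls. A minimal sketch, in which count_for_word is a hypothetical helper and the treatment of an undefined identifier as "unknown word" is an assumption (how id_from_word reports unknown words is not stated here):

```perl
use strict;
use warnings;

# Hypothetical helper: occurrence count for a word given as a string.
# $dict is any object offering the id_from_word and word_count_by_id
# methods described above.
sub count_for_word {
    my ($dict, $lang, $word) = @_;

    my $id = $dict->id_from_word($lang, $word);
    # Assumption: an undefined identifier signals an unknown word.
    return 0 unless defined $id;

    return $dict->word_count_by_id($lang, $id);
}

# Intended use, with $dictionary and $src_lng as in the SYNOPSIS:
#   my $count = count_for_word($dictionary, $src_lng, 'casa');
```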

get_vals_by_id

This method retrieves the probable translations for a word in the specified language. Note that the method expects a word identifier, not the word itself.

The returned value is a reference to an array of the form (wid, prob, wid, prob, ...), where wid is the identifier of a probable translation in the other language and prob is its probability, between 0 and 1.
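The flat layout is easier to work with once it is split into pairs. The sketch below uses a hypothetical helper, pairs_by_probability, and made-up sample data standing in for a real get_vals_by_id result; it only illustrates how the array described above can be decoded.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the flat (wid, prob, wid, prob, ...) array
# reference into a list of [wid, prob] pairs, most probable first.
sub pairs_by_probability {
    my ($vals) = @_;

    my @pairs;
    for (my $i = 0; $i + 1 < @$vals; $i += 2) {
        push @pairs, [ $vals->[$i], $vals->[$i + 1] ];
    }

    my @sorted = sort { $b->[1] <=> $a->[1] } @pairs;
    return @sorted;
}

# Made-up data in the documented layout, not taken from a real dictionary.
my $data = [ 17, 0.25, 42, 0.6, 8, 0.15 ];

for my $pair (pairs_by_probability($data)) {
    my ($wid, $prob) = @$pair;
    printf "%d\t%.2f\n", $wid, $prob;
}
```

Each identifier in the result belongs to the other language of the dictionary, so it can be turned back into a word by passing it to word_from_id together with that language.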

SEE ALSO

See perl(1) and NATools documentation.

AUTHOR

Alberto Manuel Brandao Simoes, <ambs@cpan.org>

COPYRIGHT AND LICENSE

Copyright 2002-2012 by NATURA Project http://natura.di.uminho.pt

This library is free software; you can redistribute it and/or modify it under the GNU General Public License 2, which you should find in the parent directory. This module should be distributed together with the whole NATools package, including the respective copyright notice.