Module Version: 0.11

NAME

Text::Metaphone::Amharic - The Metaphone Algorithm for Amharic.

SYNOPSIS

  use utf8;
  require Text::Metaphone::Amharic;

  my $mphone = new Text::Metaphone::Amharic;

  my @keys  = $mphone->metaphone ( "ሥላሴ" );

  foreach (@keys) {
      print "$_\n";
  }

  my $key = $mphone->metaphone ( "ፀሐይ" );
  print "key => $key\n";

  $mphone->style ( "ipa" );

  @keys  = $mphone->metaphone ( "ሥላሴ" );

  foreach (@keys) {
      print "$_\n";
  }

  $mphone->style ( "ethiopic" );
    :
    :

  
  The key "style" and Metaphone "granularity" can be set at import time:

    use Text::Metaphone::Amharic ( style => "ipa", granularity => "high" );

  at instantiation time:

    my $mphone = new Text::Metaphone::Amharic ( style => "ipa", granularity => "high" );

  or any time thereafter:

    $mphone->style ( "ethiopic" );
    $mphone->granularity ( "low" );

DESCRIPTION

The Text::Metaphone::Amharic module is a reimplementation of the Amharic Metaphone algorithm of the Text::TransMetaphone package. This implementation uses an object-oriented interface and generates keys in Ethiopic script by default (see the STYLES section for other encoding options).

By default, the keys are generated in "low" granularity mode, which finds the most matches. The GRANULARITY section discusses the effects of the different levels.

Like Text::TransMetaphone::am, the terminal key returned in list context is a regular expression. Amharic character classes are applied in the regular expression key as per the conventions of Regexp::Ethiopic::Amharic.
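The difference between scalar and list context can be sketched as follows. This is a minimal illustration rather than an excerpt from the module: the variable names are hypothetical, the binmode call assumes the keys come back as character strings, and the regular expression key is only printed, since interpreting its character classes depends on the Regexp::Ethiopic::Amharic conventions.

  use strict;
  use warnings;
  use utf8;
  require Text::Metaphone::Amharic;

  # Assumption: keys are character strings, so STDOUT needs an encoding layer.
  binmode( STDOUT, ":encoding(UTF-8)" );

  my $mphone = Text::Metaphone::Amharic->new;

  # Scalar context: a single key is returned.
  my $scalar_key = $mphone->metaphone( "ፀሐይ" );
  print "scalar key => $scalar_key\n";

  # List context: the last element is the regular expression key.
  my @keys   = $mphone->metaphone( "ፀሐይ" );
  my $re_key = pop @keys;

  print "plain key  => $_\n" foreach @keys;
  print "regex key  => $re_key\n";

Popping the terminal element keeps the plain keys and the regular expression key apart, which is convenient when the two are stored or compared separately.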

GRANULARITY

The granularity parameter refers to the degree of reduction that occurs during key generation. The granularity modes were created for investigative purposes. The "low" level, which is the most effective for matching, is the default.

"high"

The least coarse grain. "ወ" and "የ" are treated under consonant rules. The default IM correction (shift-slip condition) folds keys both upward and downward. The high granularity level generates the greatest number of keys: each substitution causes a new key to be generated, so the set of keys returned represents all possible permutations. The "high" level is the least aggressive in terms of text simplification and leads to the fewest matches. It is more useful for other types of analysis, such as distance comparison to the canonical word. Since both the canonical and error words have keys folded downward at all granularity levels during IM correction, the "high" level offers no particular advantage for the purpose of matching.

"medium"

An in-between grain. "ወ" and "የ" are treated under consonant rules. The default IM correction folds keys downward only. The keys generated represent a "lowest common denominator" that would be reducible from the "high" mode keys. More matches will be found at the "low" granularity level, but the risk of false matches becomes higher.

"low"

The default and most coarse, or aggressive, grain. "ወ" and "የ" are treated under vowel rules; that is, they are stripped out of the string except when they are the first character. Like the medium level, the default IM correction folds keys downward only, and the keys are again lowest common denominators of "high" mode keys. More matches are found at this level than at any other, but the risk of false matches is higher.
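One way to see the effect of the three levels is to count the keys generated for the same word at each setting. The following is a small sketch that uses only the granularity and metaphone methods shown in the SYNOPSIS; the %key_count hash and the choice of test word are illustrative, and no particular counts are implied.

  use strict;
  use warnings;
  use utf8;
  require Text::Metaphone::Amharic;

  my $mphone = Text::Metaphone::Amharic->new;

  my %key_count;
  foreach my $level ( qw( high medium low ) ) {
      $mphone->granularity ( $level );

      # List context returns every key, including the terminal regex key.
      my @keys = $mphone->metaphone ( "ሥላሴ" );

      $key_count{$level} = scalar @keys;
      print "$level: $key_count{$level} key(s)\n";
  }

According to the descriptions above, the "high" level should report the largest count. The bundled examples/granularity.pl script is the place to look for the author's own comparison.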

STYLES

By default keys are returned with Ethiopic characters (UTF-8 encoding). If this is not your text "style" of choice, IPA symbols and SERA transliteration are also available. The text style can be set and reset at any time:

At Import Time:

  use Text::Metaphone::Amharic ( style => "ipa" );

At Instantiation Time:

  my $mphone = new Text::Metaphone::Amharic ( style => "sera" );

After Instantiation:

  $mphone->style ( "ethiopic" );

A reverse method is also provided to convert an IPA or SERA symbol key into an equivalent Ethiopic sequence.
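A round trip through the IPA style might look like the sketch below. The text above does not give the method's exact name or signature, so the call is an assumption: the sketch assumes a method literally named reverse that takes one key and returns the Ethiopic form, and it checks for that method with can before calling it.

  use strict;
  use warnings;
  use utf8;
  require Text::Metaphone::Amharic;

  # Assumption: keys are character strings, so STDOUT needs an encoding layer.
  binmode( STDOUT, ":encoding(UTF-8)" );

  my $mphone = Text::Metaphone::Amharic->new ( style => "ipa" );

  my $ipa_key = $mphone->metaphone ( "ፀሐይ" );
  print "IPA key  => $ipa_key\n";

  # Assumption: the method is named "reverse" and accepts a single key.
  if ( my $reverse = $mphone->can ( "reverse" ) ) {
      my $ethiopic = $mphone->$reverse ( $ipa_key );
      print "Ethiopic => $ethiopic\n" if defined $ethiopic;
  }
  else {
      print "No method named 'reverse' was found; check the module source.\n";
  }

If the method turns out to have a different name or calling convention, only the guarded branch needs to change.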

REQUIRES

Regexp::Ethiopic.

COPYRIGHT

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

BUGS

None presently known.

AUTHOR

Daniel Yacob, dyacob@cpan.org

SEE ALSO

http://daniel.yacob.name/papers/DanielYacob-ICESXV.pdf
Text::TransMetaphone
Included with this package:
  examples/amphone.pl         examples/ipa-phone.pl
  examples/amphone-high.pl    examples/ipa-phone-high.pl
  examples/granularity.pl     examples/matchtest.pl