NAME

String::Equivalence::Amharic - Normalization Utilities for Amharic.

SYNOPSIS

  #
  #  OO Style:
  #
  use utf8;
  require String::Equivalence::Amharic;

  binmode(STDOUT, ":utf8");  # emit the Ethiopic text as UTF-8

  my $string = String::Equivalence::Amharic->new;

  my @list = $string->downgrade ( "እግዚአብሔር" );

  my $count = 0;
  foreach (@list) {
      $count++;
      print "$count: $_\n";
  }


  #
  #  Functional Style:
  #
  use utf8;
  use String::Equivalence::Amharic;

  my @list = downgrade ( "እግዚአብሔር" );

  :
  :
  :

DESCRIPTION

Under the "three levels of Amharic spelling" theory, the String::Equivalence::Amharic package will take a canonical word (level one) and generate level two words (the level of popular use). The first member of the returned array is the original string. The last member of the returned array is a regular expression that will match all renderings of the list.
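The return shape described above (original first, regular expression last) can be taken apart as in the following minimal sketch. The three-element list is a hand-written, hypothetical stand-in for what downgrade might return; the actual variants and the exact pattern come from the module and may differ. Interpolating the last element into a match works whether it is a plain string or a compiled regex, which is also an assumption here.

```perl
use strict;
use warnings;
use utf8;

binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical stand-in for:  my @list = downgrade("እግዚአብሔር");
# The real list contents and pattern are produced by the module.
my @list = (
    "እግዚአብሔር",       # first member: the original (canonical) string
    "እግዚአብሄር",       # an illustrative level two spelling
    "እግዚአብ[ሔሄ]ር",    # last member: a pattern covering the whole list
);

my $canonical = shift @list;   # documented: first member is the original
my $pattern   = pop @list;     # documented: last member is a regular expression
my @variants  = @list;         # what remains are the downgraded spellings

# Use the pattern to recognise any rendering of the word in running text.
my $text  = "ስለ እግዚአብሄር";
my $found = $text =~ /$pattern/ ? 1 : 0;

print "canonical: $canonical\n";
print "variant:   $_\n" for @variants;
print "pattern matched the sample text\n" if $found;
```

Peeling off the first and last members before iterating keeps the regular expression from being treated as one more spelling.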

The doc/index.html file presents a development of the downgrade rules applied.

The package is useful for some problems: it produces orthographically "legal" simplifications and avoids improbable naive ones. Text::Metaphone::Amharic, of course, oversimplifies because it addresses a different problem. So while the intent is not to promote level 2 orthographies, in some instances it is useful to generate level 2 renderings from a canonical form.

You must start with the canonical spelling of a word, because only downgrades can occur. Starting with a near-canonical form and downgrading will generate a shorter word list than you would get starting from the fully canonical form.

Equivalence Utilities

downgrade, isReducible, hasEquivalence, isEquivalentTo, inflate

A utility function to query the "form" of an Ethiopic syllable. It will return an integer between 1 and 12 corresponding to the [#\d+#] classes.

  print getForm ( "አ" ), "\n";  # prints 1

REQUIRES

Regexp::Ethiopic (which rules btw).

COPYRIGHT

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

BUGS

None presently known.

AUTHOR

Daniel Yacob, dyacob@cpan.org

SEE ALSO

Text::Metaphone::Amharic
