Module Version: 0.03

NAME

Regexp::Cherokee - Regular Expressions Support for Cherokee Script.

SYNOPSIS

 #
 #  Overloading Perl REs:
 #
 use utf8;
 use Regexp::Cherokee qw(overload setForm subForm);

 :

 s/([#2#])/setForm($1,6)/eg;
 s/([ᎠᎦᎧᎭ]%2)/setForm($1,6)/eg;
 s/([ᎠᎦᎧᎭ]%{1,3})/setForm($1,6)/eg;
 s/([ᎠᎦᎧᎭ]%{1-3,7})/setForm($1,6)/eg;
 s/([#Ꮎ#])/subForm('Ꮬ',$1)/eg;  # substitute a 'Ꮬ' for a 'Ꮎ', keeping the form found for the 'Ꮎ'

 if ( /[#Ꮜ#]/ ) {
   #
   # do something
   #
   :
 }

 :
 :

 #
 #  Without overloading:
 #
 use utf8;
 require Regexp::Cherokee;

 my $string = "[ᎠᎦᎧᎭ]%{1-3,7}";
 my $re = Regexp::Cherokee::getRe ( $string );

 s/abc($re)xyz/"abc".Regexp::Cherokee::setForm($1,6)."xyz"/eg;

DESCRIPTION

The Regexp::Cherokee module provides POSIX-style character class definitions for working with the Cherokee syllabary. The character classes provided by the Regexp::Cherokee package correspond to innate properties of the script and are language independent.

The Regexp::Cherokee package is NOT derived from the Regexp class and may not be instantiated into an object. Regexp::Cherokee can optionally export the utility functions getForm, setForm, subForm and formatForms (or all of them with the :utils tag) to query or set the form of a Cherokee character. Variables named after the forms and set to the corresponding form values may be exported with the :forms tag.

See the files in the doc/ and examples/ directories that are included with this package.

Substitution Utilities

getForm

A utility function to query the "form" of a Cherokee syllable. It returns an integer between 1 and 12 corresponding to the [#\d+#] classes.

  print getForm ( "Ꮿ" ), "\n";  # prints 1

setForm

A utility function to set the form number of a syllable. The form number must be an integer between 1 and 12 corresponding to the [#\d+#] classes.

  s/(.)/setForm($1, 1)/eg;

subForm

A utility function to set the form number of a syllable based on the form of another syllable.

  s/(\w+)([#Ꮎ#])/$1.subForm('Ꮬ', $2)/eg;

formatForms

A utility function somewhat analogous to sprintf for a sequence of syllables:

  print formatForms ( "%1%2%3%4", "ᎠᎦᎧᎭ" ), "\n";  # prints ᎠᎨᎯᎶ

LIMITATIONS

The overloading mechanism applies only to the constant part of the RE. The following would not be handled by the Regexp::Cherokee package as expected:

  use Regexp::Cherokee 'overload';

  my $x = "Ꭷ";
        :
        :
  if ( /[#$x#]/ ) {
        :
        :
  }

The package never gets to see the variable $x, so it cannot perform the RE expansion. The workaround is to call getRe explicitly:

  use Regexp::Cherokee 'overload';

  my $x = "Ꭷ";
        :
        :
  my $re = Regexp::Cherokee::getRe ( "[#$x#]" );

  if ( /$re/ ) {
        :
        :
  }

This works as expected at the cost of one extra step. The overloading and functional modes of the Regexp::Cherokee package may be used together without conflict.
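
The limitation can be reproduced with core Perl alone. The sketch below assumes that the 'overload' option is built on overload::constant with a qr handler, which is consistent with the behaviour described above but has not been checked against the module source. The names run_demo and @seen are hypothetical and exist only for this illustration. The handler records every constant piece of the pattern that Perl hands to it; the interpolated value of $x is never among them.

```perl
use strict;
use warnings;

our @seen;    # constant regex pieces passed to the handler (illustrative name)

sub run_demo {
    # overload::constant is lexically scoped. Calling it from BEGIN makes
    # it take effect while the rest of this sub is being compiled.
    BEGIN {
        require overload;
        overload::constant( qr => sub {
            my ( $source, $value ) = @_;
            push @seen, $source;    # record the literal piece
            return $value;          # leave the piece unchanged
        } );
    }

    my $x = "b";

    # Only the literal text around ${x} reaches the handler; the variable
    # is interpolated later, at run time.
    return "abc" =~ /\Aa${x}c\z/ ? 1 : 0;
}

my $matched = run_demo();
print "matched: $matched\n";
print "pieces seen: ", join( " | ", @seen ), "\n";
```

A handler of this kind can rewrite the literal pieces it receives, but it has no opportunity to rewrite text that arrives through a variable, which is why the explicit getRe call is needed in that case.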

REQUIRES

Works with Perl 5.8.0. It may also work with Perl 5.6.x, but that has not yet been tested.

BUGS

None presently known.

AUTHOR

Daniel Yacob, dyacob@cpan.org

SEE ALSO

Included with this package:

  examples/overload.pl    examples/utils.p