Regexp::Cherokee - Regular Expressions Support for Cherokee Script.
#
#  Overloading Perl REs:
#
use utf8;
use Regexp::Cherokee qw(overload setForm);
  :
s/([#2#])/setForm($1,6)/eg;
s/([ᎠᎦᎧᎭ]%2)/setForm($1,6)/eg;
s/([ᎠᎦᎧᎭ]%{1,3})/setForm($1,6)/eg;
s/([ᎠᎦᎧᎭ]%{1-3,7})/setForm($1,6)/eg;

s/([#Ꮎ#])/subForm('Ꮬ',$1)/eg;  # substitute, a 'Ꮬ' for a 'Ꮎ' in the form found for the 'Ꮎ'

if ( /[#Ꮜ#]/ ) {
    #
    # do something
    #
      :
}
  :
  :
#
#  Without overloading:
#
use utf8;
require Regexp::Cherokee;

my $string = "[ᎠᎦᎧᎭ]%{1-3,7}";
my $re = Regexp::Cherokee::getRe ( $string );

s/abc($re)xyz/"abc".Regexp::Cherokee::setForm($1,6)."xyz"/eg;
The Regexp::Cherokee module provides POSIX-style character class definitions for working with the Cherokee syllabary. The character classes provided by the Regexp::Cherokee package correspond to innate properties of the script and are language independent.
The Regexp::Cherokee package is NOT derived from the Regexp class and may not be instantiated into an object. It can optionally export the utility functions getForm, setForm, subForm and formatForms (or all of them with the :utils tag) to query or set the form of a Cherokee character. Variables named after the forms and set to the corresponding form values may be exported with the :forms tag.
See the files in the doc/ and examples/ directories that are included with this package.
getForm is a utility function that queries the "form" of a Cherokee syllable. It returns an integer between 1 and 12 corresponding to the [#\d+#] classes.
print getForm ( "�" ), "\n"; # prints 1
setForm is a utility function that sets the form number of a syllable. The form number must be an integer between 1 and 12 corresponding to the [#\d+#] classes.
s/(.)/setForm($1, 1)/eg;
subForm is a utility function that sets the form number of a syllable based on the form of another syllable: it returns its first argument in the form found for its second.
s/(\w+)([#Ꮎ#])/$1.subForm('�', $2)/eg;
formatForms is a utility function, somewhat analogous to sprintf, that operates on a sequence of syllables:
print formatForms ( "%1%2%3%4", "ᎠᎦᎧᎭ" ), "\n"; # prints ᎠᎨᎯᎶ
The overloading mechanism applies only to the constant part of the RE. The following would not be handled by the Regexp::Cherokee package as expected:
use Regexp::Cherokee 'overload';

my $x = "Ꭷ";
  :
  :
if ( /[#$x#]/ ) {
  :
  :
}
The package never gets to see the variable $x and so cannot perform the RE expansion. The workaround is to build the RE explicitly with getRe:
use Regexp::Cherokee 'overload';

my $x = "Ꭷ";
  :
  :
my $re = Regexp::Cherokee::getRe ( "[#$x#]" );
if ( /$re/ ) {
  :
  :
}
This works as expected at the cost of one extra step. The overloading and functional modes of the Regexp::Cherokee package may be used together without conflict.
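The limitation described above can be reproduced without Regexp::Cherokee. The following is a minimal sketch, assuming the overloading is built on Perl's overload::constant hook for regular expressions (the source does not confirm this). Demo::ReSpy is a hypothetical stand-in package that only records what the hook is handed; it performs no Cherokee expansion.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a module that rewrites regex constants.
# It is NOT Regexp::Cherokee; it only records what the handler receives.
package Demo::ReSpy;
use overload ();

our @seen;

sub import {
    # Register a handler for the constant pieces of regular expressions.
    overload::constant( qr => sub {
        my ( $source, $perl_value, $context ) = @_;
        push @seen, $source;    # remember each constant piece as written
        return $perl_value;     # leave the regex itself unchanged
    } );
}

package main;

# Enable the handler for the rest of this file, as "use" would.
BEGIN { Demo::ReSpy->import }

my $x   = "b";
my $hit = "abc" =~ /a[$x]c/ ? 1 : 0;

# The handler ran at compile time on the literal text around $x;
# the run-time value of $x was never passed to it.
print "matched: $hit\n";
print "pieces seen: @Demo::ReSpy::seen\n";
```

Because the handler runs while the pattern is compiled, it can rewrite only the text that is literally present in the source. Calling getRe on an already interpolated string, as in the workaround, hands the complete class to the package at run time.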
Works with Perl 5.8.0. It may also work with Perl 5.6.x, but this has not yet been tested.
None presently known.
Daniel Yacob, dyacob@cpan.org
Included with this package:
examples/overload.pl examples/utils.p