# NAME

Unicode::Digits - Convert UNICODE digits to integers you can do math with

Version 20090607

# SYNOPSIS

So, you have matched a string with `\d` and now want to do some math. What is that you say? The number you captured plus 5 is 5? Oh, that is right: `\d` now matches UNICODE digits, not just `[0-9]`. What to do? Well, you can just call `digits_to_int` and all of your troubles* are over!

```
use Unicode::Digits qw/digits_to_int/;

my $string = "forty-two in Mongolian is \x{1814}\x{1812}";
# The match runs in list context, so digits_to_int receives the captured digits.
my $num = digits_to_int $string =~ /(\d+)/;
print $num + 5, "\n";    # 42 + 5
```
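To see the problem the synopsis describes without installing anything, here is a minimal sketch in core Perl. It does not use `Unicode::Digits` at all, and the variable names are only illustrative. It relies on the fact that a string containing code points above 255 is matched with Unicode rules, so `\d` accepts the Mongolian digits.

```perl
use strict;
use warnings;

my $string = "forty-two in Mongolian is \x{1814}\x{1812}";

# \d matches the two Mongolian digits because they have the Unicode digit property.
my ($captured) = $string =~ /(\d+)/;

# Perl's string-to-number conversion only understands ASCII digits, so the
# capture numifies to 0 (normally with an "isn't numeric" warning, silenced here).
my $sum = do { no warnings 'numeric'; $captured + 5 };

print "captured ", length($captured), " characters; sum is $sum\n";
```

The capture succeeds, but the sum comes out as 5 rather than 47, which is the surprise that `digits_to_int` is meant to remove.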

# FUNCTIONS

## digits_to_int(STRING)

The `digits_to_int` function transliterates a string of UNICODE digit characters to a number you can do math with. Non-digit characters are passed through, so `"42 is \x{1814}\x{1812}"` becomes `"42 is 42"`.
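The transliteration described above can be approximated with the core `Unicode::UCD` module. This is a rough sketch of the idea, not the module's actual code: `sketch_digits_to_int` is a hypothetical stand-in, and using the `decimal` field of `charinfo` is an assumption about one reasonable way to look up a digit's value.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical stand-in for digits_to_int in its default mode: replace every
# character that \d matches with its decimal value, leave everything else alone.
sub sketch_digits_to_int {
    my ($string) = @_;
    $string =~ s/(\d)/charinfo(ord $1)->{decimal}/ge;
    return $string;
}

my $text = sketch_digits_to_int("42 is \x{1814}\x{1812}");
print "$text\n";

# Once the digits are ASCII, ordinary arithmetic works.
my $num = sketch_digits_to_int("\x{1814}\x{1812}");
print $num + 5, "\n";
```

ASCII digits pass through unchanged because their decimal value is the digit itself, so the same substitution handles both kinds of input.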

## digits_to_int(STRING, ERRORHANDLING)

You can optionally pass an argument that controls what happens when the source string contains non-digit characters or characters from different sets of digits. ERRORHANDLING can be one of `"strict"`, `"loose"`, `"looser"`, or `"loosest"`. Their behaviours are as follows:

strict

All of the characters must be digit characters and they must all come from the same range (so no mixing Mongolian digits with Arabic-Indic digits) or the function will die.

loose

All of the characters must be digit characters or it will die. If there are characters from different ranges you will get a warning.

looser

If there are any non-digit characters, or the characters are from different ranges, you will get a warning.

loosest

This is the default mode. All non-digit characters are passed through without warning, and the digits do not have to come from the same range.
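The four modes differ only in how they react to two kinds of problem, which can be summarised as a small lookup table. The sketch below is illustrative and uses only core Perl. `sketch_problems` and `sketch_action` are hypothetical helpers. Identifying a digit's range by the code point of its zero is an assumption made for the example and may not match how the module defines a range.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# What each mode does for the two kinds of problem, as described above.
my %policy = (
    strict  => { non_digit => 'die',  mixed_ranges => 'die'  },
    loose   => { non_digit => 'die',  mixed_ranges => 'warn' },
    looser  => { non_digit => 'warn', mixed_ranges => 'warn' },
    loosest => { non_digit => 'pass', mixed_ranges => 'pass' },
);

# Hypothetical helper: report which problems a string has. A digit's range is
# identified by the code point of its zero (ord minus digit value); this is an
# assumption for the sketch.
sub sketch_problems {
    my ($string) = @_;
    my (%zeros, %found);
    for my $char (split //, $string) {
        if ($char =~ /\d/) {
            $zeros{ ord($char) - charinfo(ord $char)->{decimal} } = 1;
        }
        else {
            $found{non_digit} = 1;
        }
    }
    $found{mixed_ranges} = 1 if keys %zeros > 1;
    return \%found;
}

# Hypothetical helper: the most severe action a mode takes for a string,
# where 'die' outranks 'warn' and 'warn' outranks 'pass'.
sub sketch_action {
    my ($string, $mode) = @_;
    my $found  = sketch_problems($string);
    my %rank   = (pass => 0, warn => 1, die => 2);
    my $action = 'pass';
    for my $problem (keys %$found) {
        my $candidate = $policy{$mode}{$problem};
        $action = $candidate if $rank{$candidate} > $rank{$action};
    }
    return $action;
}

my $mixed = "\x{1814}\x{0662}";    # Mongolian 4 followed by Arabic-Indic 2
print "$_: ", sketch_action($mixed, $_), "\n" for qw(strict loose looser loosest);
```

For the mixed-range string this prints `die` for strict, `warn` for loose and looser, and `pass` for loosest. A string with a non-digit character, such as `"a\x{1814}"`, would make loose report `die` as well.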

# AUTHOR

Chas. J. Owens IV, `<chas.owens at gmail.com>`

# DIAGNOSTICS

"wrong number of arguments"

`digits_to_int` takes one or two arguments. If you pass no arguments or more than two, you will receive this error.

"ERRORHANDLING must be strict, loose, looser, or loosest not '%s'"

If you pass a second argument to `digits_to_int` that is not strict, loose, looser, or loosest, you will receive this error.

"string '%s' contains non-digit characters"

You will receive this message as a warning or an error (depending on the mode you chose) if the string has characters that do not have the UNICODE digit property.

"string '%s' contains digits from different ranges"

You will receive this message as a warning or an error (depending on the mode you chose) if the string has digit characters that are not all part of the same range of digit characters.

"U+%x claims to be a digit, but doesn't have a digit number"

This error is unlikely to occur. If it does, the bug is either in my code (the likely scenario) or in `Unicode::UCD` (not very likely).

# BUGS

My understanding of UNICODE is flawed; therefore, I have undoubtedly done something wrong. For instance, what should be done with `"5\x{0308}"`? Also, there is a bunch of stuff relating to surrogates that I don't understand.

# SUPPORT

You can find documentation for this module with the perldoc command.

`perldoc Unicode::Digits`