NAME

Lingua::EN::Numericalize - Replaces English descriptions of numbers with numerals

SYNOPSIS

 use Lingua::EN::Numericalize;
 print str2nbr("one thousand maniacs");

 $_ = "six hundred three-score and six";
 str2nbr();
 print;

 $Lingua::EN::Numericalize::UK = 1;
 print str2nbr("one billion");      # 1,000,000,000,000

DESCRIPTION

This module replaces English descriptions of numbers in a given string with their numeric counterparts. It supports both ordinal and cardinal numbers, negative numbers, and very large numbers.

The module exports a single function into the caller's namespace as follows:

str2nbr [string = $_]

This function receives an optional string (using $_ if none is passed) and converts all English text that describes a number into its numeric equivalent. When called in a void context, the function sets $_ to the new value.

The module's behaviour is affected by the following variables:

$Lingua::EN::Numericalize::UK

This variable may be set to a true value to indicate that the UK meaning of billion (a million million) should be used. By default, the module uses the American meaning (a thousand million). Please note that all the related larger numbers, e.g. trillion, quadrillion, etc., follow the chosen convention as well.
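As a rough illustration of the two conventions (not the module's own code), the sketch below computes the power of ten behind each "illion" word. The helper name illion_exponent is hypothetical; the formulas are the usual short-scale (American) and long-scale (traditional UK) definitions.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::EN::Numericalize.
# $n is the rank of the word: 1 = million, 2 = billion, 3 = trillion, ...
# Returns the exponent e such that the word means 10**e.
sub illion_exponent {
    my ($n, $uk) = @_;
    return $uk ? 6 * $n        # long scale:  billion = 10**12
               : 3 * $n + 3;   # short scale: billion = 10**9
}

for my $word ([million => 1], [billion => 2], [trillion => 3]) {
    my ($name, $n) = @$word;
    printf "%-9s US 10**%d, UK 10**%d\n",
        $name, illion_exponent($n, 0), illion_exponent($n, 1);
}
```

Million is the same in both scales; the two conventions diverge from billion upwards.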

$Lingua::EN::Numericalize::debug

If set to true, the module prints messages useful for debugging to standard error.

NOTES

Scores are supported, e.g. "three score and six", as are dozens, baker's dozens and grosses.
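The arithmetic behind these words can be sketched in plain Perl. The table below is only an illustration of the values involved, not the module's internal data, and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Multipliers for the collective words named above (plain arithmetic,
# not the module's internal table).
my %unit = (
    "score"         => 20,
    "dozen"         => 12,
    "baker's dozen" => 13,
    "gross"         => 144,
);

# "three score and six" = 3 * 20 + 6
my $three_score_and_six = 3 * $unit{score} + 6;

# "six hundred three-score and six" = 600 + 3 * 20 + 6
my $synopsis_example = 6 * 100 + 3 * $unit{score} + 6;

print "$three_score_and_six\n";   # 66
print "$synopsis_example\n";      # 666
```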

Ordinal numbers become cardinal, e.g. second => 2, 13th => 13.

Various misspellings are understood, as are plurals, "illions" (e.g. million, billion, etc.) and "illiards" (e.g. milliard, billiard, etc.), in addition to suffixes, e.g. 1k => 1000, 2M, 3B. Extended hundreds are also supported, e.g. twelve hundred = one thousand two hundred = 1200.
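To make the suffix and extended-hundred cases concrete, here is a minimal sketch in plain Perl. It assumes k, M and B stand for thousand, million and American billion; only the 1k case is confirmed by the text above. The expand_suffix helper is hypothetical and does not reflect how the module parses its input.

```perl
use strict;
use warnings;

# Assumed multipliers: k = thousand, M = million, B = billion (American).
my %suffix = (k => 1_000, M => 1_000_000, B => 1_000_000_000);

# Hypothetical helper: expand "1k", "2M", "3B" into plain integers.
# Returns undef for anything that is not digits followed by one suffix.
sub expand_suffix {
    my ($text) = @_;
    return $text =~ /\A(\d+)([kMB])\z/ ? $1 * $suffix{$2} : undef;
}

print expand_suffix($_), "\n" for qw(1k 2M 3B);

# Extended hundreds: "twelve hundred" and "one thousand two hundred" agree.
print 12 * 100 == 1 * 1000 + 2 * 100 ? "equal\n" : "differ\n";
```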

While the module handles googol correctly, googolplex is too large to fit in Perl's standard scalar type, and "inf" will be returned.
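The overflow can be reproduced without the module, since it is a property of Perl's floating-point numbers. The sketch below assumes a Perl built with ordinary IEEE double-precision numbers; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $googol     = 10 ** 100;       # fits in a double (about 1e+100)
my $googolplex = 10 ** $googol;   # exceeds the double range, becomes infinity

# Infinity is the only positive value that equals itself doubled.
my $is_inf = ($googolplex > 0 && $googolplex == $googolplex * 2) ? 1 : 0;

print "googol:     $googol\n";
print "googolplex: $googolplex\n";   # spelling of infinity varies by platform
print "overflowed: $is_inf\n";
```

The exact string printed for infinity ("inf", "Inf" or similar) depends on the Perl version and platform, so comparing numerically as above is safer than matching the text.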

TODO/BUGS

1) currently chops off plurals and other suffixes from words that are not numbers. This needs to be fixed, since "no words here" produces "no word here" and "hell hath no fury" becomes "hell ha no fury".
2) would be nice to handle fractions
3) spelled-out numbers, e.g. nine one one = 911 (not 11: 9+1+1)
4) runnin' => r9 - yikes!

Any suggestions are welcome.

AUTHOR

Erick Calder <ecalder@cpan.org>

ACKNOWLEDGEMENTS

This module was inspired by Joey Hess' Words2Nums but is a complete rewrite with an entirely different internal approach. It differs from his module in that it is smart enough to ignore strings it doesn't recognise, thus removing the impossible requirement that the user first parse the string. As an example, a string like "One Thousand Maniacs" would fail if passed to Words2Nums (since it contains "Maniacs"), and doing a split and passing each individual piece would yield "1 1000 maniacs" instead of the desired "1000 maniacs".

SUPPORT

For help and thank-you notes, e-mail the author directly. To report a bug, submit a patch or add to our wishlist, please visit the CPAN bug manager at: http://rt.cpan.org

AVAILABILITY

The latest version of the tarball, RPM and SRPM may always be found at: http://perl.arix.com/ Additionally, the module is available from CPAN.

LICENCE AND COPYRIGHT

This utility is free and distributed under the GPL, the GNU General Public License. A copy of this license was included in a file called LICENSE. If, for some reason, this file was not included, please see http://www.gnu.org/licenses/ to obtain a copy of this license.

$Id: Numericalize.pm,v 1.52 2003/02/17 23:51:40 ekkis Exp $