NAME

String::Checker - An extensible string validation module (allowing commonly used checks on strings to be called more concisely and consistently).

SYNOPSIS

 use String::Checker;

 String::Checker::register_check($checkname, \&sub);
 $return = String::Checker::checkstring($string, [ expectation, ... ]);

DESCRIPTION

This is a very simple library for checking a string against a given set of expectations. It contains a number of pre-defined expectations which can be used, and can also be extended to perform any arbitrary match or modification on a string.

Why is this useful? If you're only checking one string, it probably isn't. However, if you're checking a bunch of strings (say, for example, CGI input parameters) against a set of expectations, this comes in pretty handy. As a matter of fact, the CGI::ArgChecker module is a simple, CGI.pm aware wrapper for this library.

Checking a string

The checkstring function takes a string scalar and a reference to a list of 'expectations' as arguments, and returns a reference to a list containing the names of the expectations that failed.

Each expectation, in turn, can be either a string scalar (the name of the expectation) or a two-element array reference (the first element being the name of the expectation, and the second element being the argument to that expectation). For example:

   $string = "foo";
   String::Checker::checkstring($string, [ 'allow_empty',
                                           [ 'max' => 20 ] ] );

Note that the expectations are run in order. In the above case, for example, the 'allow_empty' expectation would be checked first, followed by the 'max' expectation with an argument of 20.
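
To make the ordering and the return value concrete, here is a minimal stand-in written without the module. It is not String::Checker's actual code: the %checks table and the run_expectations function are hypothetical names, and the two checks are simplified imitations of 'disallow_empty' and 'max' as described in this document.

```perl
use strict;
use warnings;

# Hypothetical stand-in checks. Each one receives a reference to the string
# and an optional argument, and returns 1 on failure or undef on success.
my %checks = (
    disallow_empty => sub {
        my ($s) = @_;
        return 1 if !defined $$s || $$s eq '';
        return undef;
    },
    max => sub {
        my ($s, $limit) = @_;
        return 1 if defined $$s && length($$s) > $limit;
        return undef;
    },
);

# Run each expectation in order and collect the names of those that fail.
# An expectation is either a name or a [ name, argument ] pair.
sub run_expectations {
    my ($string, $expectations) = @_;
    my @failed;
    for my $e (@$expectations) {
        my ($name, $arg) = ref($e) eq 'ARRAY' ? @$e : ($e);
        push @failed, $name if $checks{$name}->(\$string, $arg);
    }
    return \@failed;
}

my $failed = run_expectations('a rather long string',
                              [ 'disallow_empty', [ max => 5 ] ]);
print "failed: @$failed\n";
```

An empty list behind the returned reference means every expectation passed, so a caller would typically test it with something like "if (@$failed)".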

Defined checks

The module predefines a number of checks. They are:

allow_empty

Never fails - will convert an undef scalar to an empty string, though.

disallow_empty

Fails if the input string is either undef or empty.

min

Fails if the length of the input string is less than the numeric value of its single argument.

max

Fails if the length of the input string is more than the numeric value of its single argument.

want_int

Fails if the input string does not solely consist of numeric characters.

want_float

Fails if the input string does not solely consist of numeric characters, plus an optional single '.'.

allow_chars

Fails if the input string contains characters other than those in its argument.

disallow_chars

Fails if the input string contains any of the characters in its argument.

upcase

Never fails - converts the string to upper case.

downcase

Never fails - converts the string to lower case.

stripxws

Never fails - strips leading and trailing whitespace from the string.

enum

Fails if the input string does not precisely match at least one of the elements of the array reference it takes as an argument.

match

Fails if the input string does not match the regular expression it takes as an argument.

want_email

Fails if the input string does not match the regular expression: ^\S+\@[\w-]+\.[\w\.-]+$

want_phone

Fails if the input string does not match the regular expression ^[0-9+.()-]*$
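
The two patterns above are easy to try on their own, without loading the module. The sketch below applies them to a few sample inputs. The variable names and sample strings are made up for illustration, and the e-mail pattern is written with a single '@', which is an assumption about the intended expression.

```perl
use strict;
use warnings;

# Patterns as documented for want_email and want_phone
# (the e-mail one assuming a single '@').
my $email_re = qr/^\S+\@[\w-]+\.[\w\.-]+$/;
my $phone_re = qr/^[0-9+.()-]*$/;

for my $candidate ('user@example.com', 'user@localhost', 'a b@example.com') {
    my $verdict = $candidate =~ $email_re ? 'ok' : 'fails want_email';
    printf "%-20s %s\n", "'$candidate'", $verdict;
}

for my $candidate ('+1(555)123-4567', '555 1234', '') {
    my $verdict = $candidate =~ $phone_re ? 'ok' : 'fails want_phone';
    printf "%-20s %s\n", "'$candidate'", $verdict;
}
```

Two properties of the phone pattern follow directly from how it is written: the trailing '*' lets the empty string through, and a space is not in the character class, so '555 1234' is rejected. Pairing want_phone with disallow_empty (and perhaps stripxws) is therefore worth considering.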

want_date

Interprets the input string as a date, if possible. This will fail if it can't figure out a date from the input. In addition, it is possible to use this to standardize date input. Pass a formatting string (see the strftime(3) man page) as an argument to this check, and the string will be formatted appropriately if possible. This is based on the Date::Manip(1) module, so that documentation might prove valuable if you're using this check.

Extension checks

Use register_check to register a new expectation checking routine. This function should be passed a new expectation name and a code reference.

This code reference will be called every time the expectation name is seen, with either one or two arguments. The first argument will always be a reference to the input string (the function is free to modify the value of the string). The second argument, if any, is the second element of a two-part expectation, whatever that might be.

The function should return undef unless there's a problem, in which case it should return 1. It's also best (if possible) to return undef if the string is undef, so that the user can decide whether to allow_empty or disallow_empty independent of your check.

For example, registering a check to verify that the input word is "poot" would look like:

   String::Checker::register_check("ispoot", sub {
       my($s) = shift;
       if ((defined($$s)) && ($$s ne 'poot')) {
           return 1;
       }
       return undef;
   });
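
The calling convention described above (a reference to the string, plus an optional argument) can be exercised by calling a check routine directly. The sketch below defines two hypothetical checks, has_prefix and collapse_ws, which are not part of the module: one takes an argument and can fail, the other never fails but modifies the string in place. The registration calls are left in comments so that the block runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical check: fails unless the string starts with the given prefix.
sub has_prefix {
    my ($s, $prefix) = @_;
    return undef unless defined $$s;    # leave undef to allow_empty/disallow_empty
    return 1 unless index($$s, $prefix) == 0;
    return undef;
}

# Hypothetical check: never fails, collapses runs of whitespace in place.
sub collapse_ws {
    my ($s) = @_;
    $$s =~ s/\s+/ /g if defined $$s;
    return undef;
}

# With the module loaded, these would be registered as:
#   String::Checker::register_check('has_prefix',  \&has_prefix);
#   String::Checker::register_check('collapse_ws', \&collapse_ws);

# Calling them directly, with a reference to the string as first argument:
my $input = "id-42   final";
collapse_ws(\$input);
print "after collapse_ws: '$input'\n";
print "has_prefix 'id-': ", (has_prefix(\$input, 'id-') ? 'fails' : 'passes'), "\n";
```

Once registered, the two checks could appear in an expectation list in the usual forms, for example [ 'collapse_ws', [ 'has_prefix' => 'id-' ] ].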

BUGS

Hopefully none.

AUTHOR

J. David Lowe, dlowe@webjuice.com

SEE ALSO

perl(1), CGI::ArgChecker(1)