
NAME

Math::String::Charset::Grouped - A charset of simple charsets for Math::String objects.

SYNOPSIS

    use Math::String::Charset::Grouped;

REQUIRES

perl5.005, Exporter, Math::BigInt, Math::String::Charset

EXPORTS

Exports nothing.

DESCRIPTION

This module lets you create a charset object, which is used to construct Math::String objects.

This object can assign a different simple charset (that is, a Math::String::Charset object of order => 1, type => 0) to each position in a Math::String.

Default charset

The default charset is the set containing "abcdefghijklmnopqrstuvwxyz" (thus always producing lower case output).

ERRORS

Upon error, the field _error stores the error message, then die() is called with this message. If you do not want the program to die (for instance, to catch the errors), then use the following:

        use Math::String::Charset::Grouped;

        $Math::String::Charset::Grouped::die_on_error = 0;

        # no arguments given: error, empty set!
        my $charset = Math::String::Charset::Grouped->new();
        print $charset->error(),"\n";

INTERNAL DETAILS

This object caches certain calculation results (for instance, the number of possible combinations for a certain string length), thus greatly speeding up sequential Math::String conversions from string to number, and vice versa.

METHODS

new()

            new();

Create a new Math::String::Charset::Grouped object.

The constructor takes a HASH reference. The following keys can be used:

        minlen          Minimum string length, -inf if not defined
        maxlen          Maximum string length, +inf if not defined
        sets            hash, table with charsets for the different places
        start           array ref to list of all valid (starting) characters
        end             array ref to list of all valid ending characters
        sep             separator character, none if undef

start and end are synonyms for sets->{1} and sets->{-1}, respectively. They will override what you specify in sets and are only for convenience.

The resulting charset will always be of order 1, type 1.

start

start contains an array reference to all valid starting characters, i.e. no valid string can start with a character not listed here.

The same can be accomplished by specifying sets->{1}.

sets

sets contains a hash reference; each key of the hash indicates an index. Each of the hash entries MUST point either to an ARRAY reference or to a Math::String::Charset of order 1, type 0.

Positive indices (one or greater) count from the left side, negative indices from the right. 0 denotes the default charset to be used for unspecified places.

The index count will be used for all string lengths, so that sets->{2} always refers to the second character from the left, no matter how many characters the string actually has.

At each of the positions indexed by a key, the appropriate charset will be used.

Example specifying that strings must start with an upper case letter, followed by lower case letters, and can end in either a lower case letter or a digit:
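The index rules above can be sketched in plain Perl. This is only an illustration of the description, not the module's internal code; the helper name keys_for_position is hypothetical and does not exist in Math::String::Charset::Grouped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): return the keys of %$sets
# that apply to the 1-based position $pos in a string of length $len.
sub keys_for_position {
    my ($sets, $pos, $len) = @_;
    my $from_left  = $pos;               # 1 is the first character
    my $from_right = $pos - $len - 1;    # -1 is the last character
    my @keys = grep { exists $sets->{$_} } ($from_left, $from_right);
    return @keys ? @keys : (0);          # nothing matched: default set
}

my $sets = {
    0  => ['a'..'z'],
    1  => ['A'..'Z'],
    -1 => ['a'..'z', '0'..'9'],
};

for my $pos (1 .. 4) {
    print "position $pos of 4 uses set(s): ",
        join(', ', keys_for_position($sets, $pos, 4)), "\n";
}
```

When two keys are returned for the same position (for instance 1 and -1 in a string of length one), the two sets overlap and have to be combined.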

        sets => {
          0 => ['a'..'z'],              # the default
          1 => ['A'..'Z'],              # first character is always A..Z
         -1 => ['a'..'z','0'..'9'],     # last is a..z,0..9
        }

If sets overlap at a position, a cross between the two charsets will be used, which contains all characters from both of them. The default charset will only be used when none of the charsets counting from left or right matches.

Given the definition above, valid strings with length 1 consist of:

        ['A'..'Z','0'..'9']

Imagine having specified a set at position 2, too:

        sets => {
          0 => ['a'..'z'],              # the default
          1 => ['A'..'Z'],              # first character is always A..Z
          2 => ['-','+','2'],           # second character is -, + or 2
         -1 => ['a'..'z','0'..'9'],     # last is a..z,0..9
        }

For strings of length one, this character set will not be used. For strings with length 2 it will be crossed with the set at -1, so that the two-character long strings will start with ['A'..'Z'] and end in the characters ['-','+','2','0','1','3'..'9'].

The cross is built from left to right: first come all characters that are in the set counting from the left, and then all characters in the set counting from the right, except the ones that are in both (since no character may appear twice).
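A minimal sketch of such a left-to-right cross, written as an ordered union in plain Perl. The helper name cross is hypothetical, and this shows the rule as described here rather than the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): keep every character of the
# left set in order, then append the characters of the right set that have
# not been seen yet.
sub cross {
    my ($left, $right) = @_;
    my %seen;
    return grep { !$seen{$_}++ } @$left, @$right;
}

my @crossed = cross(['-', '+', '2'], ['1', '2', '3']);
print join(' ', @crossed), "\n";    # prints: - + 2 1 3
```

The '2' from the right set is dropped because it already occurs in the left set, so the result keeps the left set's ordering and contains no doubles.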

end

end contains an array reference to all valid ending characters, i.e. no valid string can end with a character not listed here. Note that strings of length 1 start and end with their only character, so the character must be listed in end and start to produce a string with one character. The same can be accomplished by specifying sets->{-1}.

minlen

Optional minimum string length. Any string shorter than this will be invalid. Must be smaller than a (possibly defined) maxlen. If not given, it is set to -inf. Note that minlen might be adjusted to a greater number if it is set to 1 or greater but there are no valid strings of that length. In this case minlen will be set to the length of the first non-empty class of the charset.

maxlen

Optional maximum string length. Any string longer than this will be invalid. Must be greater than a (possibly defined) minlen. If not given, it is set to +inf.
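Putting the constructor keys together, a call might look like the following sketch. It uses only the keys and methods documented here; the concrete values (minlen 2, maxlen 8) are arbitrary choices for illustration, and no particular output is claimed.

```perl
use strict;
use warnings;
use Math::String::Charset::Grouped;

# Strings start with an upper case letter, continue with lower case
# letters, and end in a lower case letter or a digit.
my $charset = Math::String::Charset::Grouped->new( {
    minlen => 2,                          # arbitrary example value
    maxlen => 8,                          # arbitrary example value
    sets   => {
        0  => ['a'..'z'],                 # default for unspecified places
        1  => ['A'..'Z'],                 # first character
        -1 => ['a'..'z', '0'..'9'],       # last character
    },
} );

print "shortest allowed length: ", $charset->minlen(), "\n";
print "longest allowed length:  ", $charset->maxlen(), "\n";
print "strings of length 3:     ", $charset->class(3), "\n";
```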

minlen()

        $charset->minlen();

Return minimum string length.

maxlen()

        $charset->maxlen();

Return maximum string length.

length()

        $charset->length();

Return the number of items in the charset, for higher order charsets the number of valid 1-character long strings. Shortcut for $charset->class(1).

count()

Returns the count of all possible strings described by the charset as a positive BigInt. Returns 'inf' if no maxlen is defined, because there should be no upper bound on how many strings are possible.

If maxlen is defined, this forces a calculation of all possible class() values and may therefore be very slow on the first call. It may also cache a lot of values if maxlen is very high.

class()

        $charset->class($order);

Return the number of items in a class.

        print $charset->class(5);       # how many strings with length 5?
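As a worked example, the class size can be computed by hand for the three-set definition from the sets section (A..Z first, a..z as default, a..z plus 0..9 last). This sketch assumes one independent choice per position and a string length of at least 2, so that the first and last sets do not overlap; class_size is a hypothetical helper, and the number is hand arithmetic rather than output taken from the module.

```perl
use strict;
use warnings;
use Math::BigInt;

# Set sizes assumed from the sets example: first, default and last position.
my ($first, $default, $last) = (26, 26, 36);

# Hypothetical helper: number of strings of length $len (2 or more),
# one factor per position.
sub class_size {
    my ($len) = @_;
    return Math::BigInt->new($first)
        * Math::BigInt->new($default)->bpow($len - 2)
        * $last;
}

print class_size(5), "\n";    # 26 * 26**3 * 36 = 16451136
```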

char()

        $charset->char($nr);

Returns the character number $nr from the set, or undef.

        print $charset->char(0);        # first char
        print $charset->char(1);        # second char
        print $charset->char(-1);       # last one

lowest()

        $charset->lowest($length);

Return the number of the first string of length $length. This is equivalent to (but much faster than):

        $str = $charset->first($length);
        $number = $charset->str2num($str);

highest()

        $charset->highest($length);

Return the number of the last string of length $length. This is equivalent to (but much faster than):

        $str = $charset->first($length+1);
        $number = $charset->str2num($str);
        $number--;
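The relation between lowest(), highest() and the class sizes can be sketched with plain numbers. The sketch assumes that strings are numbered consecutively by length and that the empty string has the number 0; it uses a plain a..z charset for the class sizes, and the helpers lowest_number and highest_number are hypothetical, not module methods.

```perl
use strict;
use warnings;

# Assumed class sizes for a plain a..z charset, indexed by string length:
# one empty string, 26 strings of length 1, 26**2 of length 2, 26**3 of 3.
my @class = (1, 26, 676, 17576);

# Hypothetical helper: all shorter strings are numbered first, so the first
# string of length $len gets the count of all shorter strings as its number.
sub lowest_number {
    my ($len) = @_;
    my $n = 0;
    $n += $class[$_] for 0 .. $len - 1;
    return $n;
}

# Hypothetical helper: the last string of length $len comes directly
# before the first string of length $len + 1.
sub highest_number { return lowest_number($_[0] + 1) - 1 }

print lowest_number(2), "\n";     # 1 + 26 = 27
print highest_number(2), "\n";    # 1 + 26 + 676 - 1 = 702
```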

order()

        $order = $charset->order();

Return the order of the charset: is always 1 for grouped charsets. See also type.

type()

        $type = $charset->type();

Return the type of the charset: is always 1 for grouped charsets. See also order.

charlen()

        $character_length = $charset->charlen();

Return the length of one character in the set. This is 1 or greater. All charsets used in a grouped charset must have the same character length, unless you specify a separator character.

seperator()

        $sep = $charset->seperator();

Returns the separator string, or undefined if none is used.

chars()

        $chars = $charset->chars( $bigint );

Returns the number of characters that the string would have if you converted $bigint (a Math::BigInt or Math::String object) back to a string. This is much faster than doing

        $chars = length ("$math_string");

since it does not need to actually construct the string.

first()

        $charset->first( $length );

Return the first string with a length of $length, according to the charset. See lowest() for the corresponding number.

last()

        $charset->last( $length );

Return the last string with a length of $length, according to the charset. See highest() for the corresponding number.

is_valid()

        $charset->is_valid($string);

Check whether a string conforms to the charset or not.

error()

        $charset->error();

Returns "" for no error, or the error message if construction of the charset failed. Set $Math::String::Charset::die_on_error to 0 to get the error message, otherwise the program will die.

start()

        $charset->start();

In list context, returns a list of all characters in the start set, that is, the ones used at the first string position. In scalar context, returns the length of the start set.

Think of the start set as the set of all characters that can start a string with one or more characters. The set for one-character strings is called ones and you can access it via $charset->ones().

end()

        $charset->end();

In list context, returns a list of all characters in the end set, that is, all characters a string can end with. In scalar context, returns the length of the end set.

ones()

        $charset->ones();

In list context, returns a list of all strings consisting of one character. In scalar context, returns the length of the ones set.

This list is the cross of start and end.

Think of a string of only one character as if it starts with and ends in this character at the same time.

The order of the chars in ones is the same ordering as in start.

prev()

        $string = Math::String->new( );
        $charset->prev($string);

Given the charset and a string, calculates the previous string in the sequence. This is faster than decrementing the number of the string and converting the new number to a string. This routine is mainly used internally by Math::String and updates the cache of the given Math::String.

next()

        $string = Math::String->new( );
        $charset->next($string);

Given the charset and a string, calculates the next string in the sequence. This is faster than incrementing the number of the string and converting the new number to a string. This routine is mainly used internally by Math::String and updates the cache of the given Math::String.

EXAMPLES

    use Math::String::Charset::Grouped;

    # not ready yet

BUGS

None discovered yet.

AUTHOR

If you use this module in one of your projects, then please email me. I want to hear about how my code helps you ;)

This module is (C) Copyright by Tels http://bloodgate.com 2000-2003.