This PDD describes Parrot's numeric data types.
This PDD details the basic numeric datatypes that the Parrot core knows how to deal with, including the core numeric PMCs.
Parrot provides a native integer data type,
generally known as an "Int".
The size of the integer is chosen at Parrot configuration time,
the same size as platform-native integers.
In C,
the typedefs INTVAL
and UINTVAL
are native signed and unsigned integers respectively.
The semantics of native integer data types are the same as the semantics of their C equivalents.
Integer data types have a dedicated register set.
In PIR,
the I
register variables ($I0
,
etc.) and .param
s or .local
s declared with the int
type are native integers.
Native unsigned integers are not accessible directly in PIR.
Many opcodes or vtable functions are defined with variants that take native integer arguments.
When passed to a subroutine or method call,
a native integer may be autoboxed as an Integer
PMC,
or as an HLL type mapped to Integer
.
Parrot provides a native floating-point data type,
generally known as a "Num".
The size of the float is chosen at Parrot configuration time,
the same size as platform-native floats.
In C,
the typedef FLOATVAL
is a native float data type.
The semantics of the native float data type are the same as the semantics of the C equivalent.
Float data types have a dedicated register set.
In PIR,
the N
register variables ($N0
,
etc.) and .param
s or .local
s declared with the num
type are native floats.
Many opcodes or vtable functions are defined with variants that take native float arguments.
When passed to a subroutine or method call,
a native float may be autoboxed as a Float
PMC,
or as an HLL type mapped to Float
.
The Integer
PMC is a high-level integer type,
providing the features of a integer data type appropriate for use in a high-level language.
Some languages may be able to use Parrot's Integer
directly as their integer data type.
Others may subclass Integer
to add their own functionality,
and others may implement their own high-level integer data type.
The Integer
PMC has a single attribute,
the integer value.
init()
Initializes the Integer
to 0.
set_pmc(PMC *value)
and set_integer_same(PMC *value)
Sets the Integer
to the integer value of the PMC argument.
set_integer_native(INTVAL value)
Set the Integer
to the passed-in integer value.
set_number_native(FLOATVAL value)
,
set_bool(INTVAL value)
,
set_bigint_int(INTVAL value)
,
set_string_native(STRING *value)
Morphs the Integer
PMC to a Float
,
Boolean
,
BigInt
,
or String
PMC,
and sets the value from the passed in value.
{{NOTE: the morphing behavior is currently under consideration and may be rejected.}}
get_integer()
Retrieves the integer value of the Integer
.
get_bool()
Returns the boolean value of the Integer
(false if 0,
true otherwise).
get_number()
Returns the integer value of the Integer
as a floating-point number.
get_bigint()
Returns the integer value of the Integer
in a new BigInt
PMC.
{{ NOTE: this vtable entry may be deprecated }}
get_string()
and get_repr()
Returns the integer value of the Integer
as a string.
[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL b,
PMC *dest)
Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with the Integer
PMC,
and returns the result as a new PMC.
(The dest
parameter is unused).
Overflow of the native integer storage auto-promotes the result PMC to a BigInt
.
Note that these are multidispatched.
i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL b)
Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with the Integer
PMC,
and sets the Integer
to the resulting value.
Overflow of the native integer storage auto-promotes the Integer
to a BigInt
.
Note that these are multidispatched.
{{NOTE: there is some discussion of having this promotion of storage happen purely internally (perhaps by swapping vtables), rather than converting to a different PMC type.}}
i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_float(INTVAL b)
Add/subtract/multiply/divide/modulus/exponent an integer value with the Integer
PMC,
and set the Integer
to the resulting value,
morphing it to a Float
.
Note that these are multidispatched.
increment()
Adds 1 to the value of the integer.
This may autopromote the PMC to a BigInt
.
decrement()
Subtracts 1 from the value of the integer.
This may autopromote the PMC to a BigInt
.
absolute()
Returns an Integer
PMC set to the absolute value of the current Integer
.
i_absolute()
Sets the Integer
to the absolute value of itself.
freeze()
Freezes the Integer
PMC for storage.
thaw()
Thaws the Integer
PMC from storage.
Many of the math vtable functions are defined as multiple dispatch functions.
[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value,
PMC *dest)
Performs the addition/subtraction/multiplication/division/modulus/exponent operation,
and returns a new PMC containing the resulting value.
Multiple dispatch variants are defined for Integer
,
Complex
,
BigInt
,
String
,
and DEFAULT
.
Overflow of the native integer storage auto-promotes the result PMC to a BigInt
.
i_[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value)
Performs the addition/subtraction/multiplication/division/modulus/exponent operation,
morphing the Integer
to the passed in type,
and setting it to the result.
Multiple dispatch variants are defined for Integer
,
Complex
,
BigInt
,
and DEFAULT
.
Overflow of the native integer storage auto-promotes the Integer
to a BigInt
.
is_equal(PMC *value)
Compares the Integer
to the passed in PMC,
returning true (1) if they are equal,
and false (0) otherwise.
Multiple dispatch variants are defined for BigInt
and DEFAULT
.
{{NOTE: Presumably the String
,
Integer
,
and Float
cases are all covered by DEFAULT
.}}
cmp(PMC *value)
Compares the Integer
to the passed in PMC,
returning 1 if Integer
is greater,
-1 if the PMC is greater,
and 0 if they are equal.
Multiple dispatch variants are defined for String
,
Float
,
and DEFAULT
.
{{NOTE: Presumably the Integer
and BigInt
cases are covered by DEFAULT
.}}
cmp_num(PMC *value)
Compares the Integer
to the passed in PMC,
returning 1 if Integer
is greater,
-1 if the PMC is greater,
and 0 if they are equal.
Multiple dispatch variants are defined for String
,
Float
,
and DEFAULT
.
{{NOTE: Presumably the Integer
and BigInt
cases are covered by DEFAULT
.}}
get_as_base(INTVAL base)
Converts the decimal integer to another base (anything from base 2 to base 36), returning the result as a STRING.
The bigint library provides Parrot with both a collection of (nearly) infinite precision numeric types and an implementation of an extended decimal arithmetic (EDA).
There are benefits in using the big number library to provide both values of effectively unlimited precision and a defined arithmetic, complete with rounding and exceptional conditions, for values which are otherwise easily represented using standard low-level types. Both require the same range of operations but differ in the environment under which those operations occur. The effort required to produce a library which implements a decimal arithmetic is not much greater than that needed to provide a base-2 big number library. There is a trade-off in both space and speed, but given the nature of dynamic languages, this should not present too great a burden.
The bignumber library provides the following data types to Parrot:
Whole numbers with no limits on their size.
Numbers with decimal fractional parts, again with no limit on size.
Numbers with a fixed maximum number of digits in their fractional part, again with no limit on size;. i.e BigRat.
The library implements these different forms of numbers using the same internal representation, and differentiates between them only when performing rounding operations. A number has the following abstract form:
[ sign, string of digits, exponent ]
If sign is zero, the number is positive. If equal to one, the number is negative. The number has the value:
sign, string of digits * 10 ** exponent
A big integer must always have a non-negative exponent. A big float may have any exponent, and a float with a fixed fractional part will have an exponent greater than a given (negative) number. These limits are not attached to a numeric value, but instead are enforced by giving any operation involving the numbers a context.
In general, Parrot functions will not need to care about what the bignum objects are or do. They should merely be used as arguments to big number functions. The objects will be managed by Parrot's garbage collection in a similar manner to strings.
Additionally the library provides special values which represent the result of otherwise undefined operations (division by zero, for instance). Positive and negative infinity (Inf
or +Inf
and -Inf
, respectively) and both quiet and signalling Not a Number (NaN
) are available. In general, the result of an operation with at least one argument which is NaN
will be NaN
. If the argument is a signalling NaN
, an exception will also be raised. See the EDA for full details.
All operations occur within a defined context. This tells the operations how they should treat their arguments, what sort of rounding to perform, and what to do if rounding loses information.
The context provides the environment in which an operation occurs, in particular the following options are available:
A positive precision requires the use of big floats. These cannot have more than precision digits in their coefficient before or after any operation. Arguments to operations with more than precision digits will be truncated and rounded appropriately. Results of operations will not have more than precision digits in their coefficients, with any extra digits accumulated during the calculation of the operation being truncated and rounded as required.
A precision of zero requires the use of integer operations. Arguments to operations are rounded so that they have no fractional part, and the result of all operations will be rounded to be integers.
A negative value of precision requires the use of a fixed number of fractional digits, with arguments and results being truncated after those digits.
With non-positive values of precision, the total number of digits in the coefficient is limited only by available memory.
The rounding part of the context defines the rounding algorithm to apply when truncating digits from a number's coefficient. The available rounding forms are outlined below.
The traps part of the context defines how the library raises exceptions. Seven distinct classes of error can occur. If the corresponding trap is set (enabled), the library raises an exception. Otherwise, execution continues with the exception class recorded in flags. For more details, see the extended decimal arithmetic standard.
The current context determines the numeric type during a particular operation. This makes it easy to upgrade from one numeric form to another and also allows for considerable code-reuse within the library.
The following exception classes are available:
Non-zero digits have been removed from an argument to a function during rounding before the operation.
Division by zero was attempted.
Because arguments were rounded, or because the result of an operation has lost significant digits, the result is inexact.
An invalid operation was attempted, for instance when NaN
is present as an argument to a function. This also covers recoverable errors such as 0/0, which signals Invalid Operation and can return NaN
.
The exponent of a number has overflowed.
An argument has been rounded.
The exponent of a number has underflowed.
The rounding part of the context defines the rounding algorithm to used. The following contexts are available (examples assume a precision of 5):
Any unwanted digits are simply truncated from the coefficient. This rounds towards zero.
[0, 1234567, 10] => [0, 12345, 12]
The first lost digit is examined. If this is in the range 0-4, the coefficient is truncated directly. If in the range 5-9, one is added to the final digit of the coefficient. If this leads to a coefficient with more than precision digits, the number is rounded again, removing the trailing zero. This is essentially rounding to nearest.
[0, 1234567, 10] => [0, 12346, 12] [0, 1234549, 10] => [0, 12345, 12] [0, 9999950, 10] => [0, 10000, 13]
The first lost digit is examined. If it lies in the range 0-4, the coefficient is truncated directly. If in the range 6-9, the coefficient is rounded up. If the first lost digit is equal to 5 and the remaining lost digits in the coefficient are non-zero, the number is also rounded up. If the lost digits are equal to exactly half, the number is rounded up if the least significant retained digit is odd, and rounded down if it is even.
If the digits to be discarded are non zero and the number is negative, the coefficient is rounded up, otherwise it remains the same.
This is rounding towards -Inf
.
If the digits to be discarded are non zero, and the number is positive, the coefficient is rounded up, otherwise it remains the same.
This is rounding towards Inf
.
The library provides the following operations. They function exactly as those described in the Standard Decimal Arithmetic (SDA), with some extension to cope with integer and fixed fractional part numbers. Only the deviations are outlined here.
In all cases, the sequence of rounding and promotion to zero outlined by the SDA are followed, even where the context implies integer operations.
Under integer conditions, division halts once the first fractional digit is calculated, with the result rounded to an integer and returned. Under fixed-fraction conditions, one more digit than needed is calculated, with the coefficient then rounded and returned.
If a floating point value is required, or if inexact division by a very small number is attempted, it may be wise to follow big float arithmetic to limit the number of digits returned. It is safe to chose a precision at least as large as the largest number of digits of either argument to the division function.
For both integer and fixed-fraction numbers, the result returned by the remainder function will be an integer or fixed-fraction number. The result of integer division will be an integer.
Comparison returns a big number which is equal to 1, 0, or -1 if the first argument is larger, equal to, or smaller than the second. An alternate form returns an INTVAL.
A one to one conversion between the abstract representation above and a string is provided by the library, and acts as defined by the standard decimal arithmetic. Other conversation operations may also be implemented; these may not provide one to one mapping.
A pedantic error checking conversion is available within the library, but only works with native strings. Versions which work with Parrot STRINGs will also be provided, although in a separate file to the rest of the library. (They will share a common private header file).
Functions are provided which implement the arithmetic, conversion, creation and destruction of big numbers by dealing with otherwise opaque big number objects.
A big number is represented by the following structure, capable of being allocated, tracked, and destroyed by the Parrot garbage collection system.
typedef struct { BN_NIB *buffer; /* string of nibbles */ UINTVAL nibs; /* nibs allocated, in sizeof(BN_NIB) */ UINTVAL flags; /* private flags store: 001 Inf, 010 qNAN, 110 sNAN */ INTVAL digits; /* digits used */ INTVAL expn; /* exponent of number */ int sign; /* sign of number, 0 => positive or zero, 1 => negative */ } parrot_bignum_t;
Within the library, individual decimal digits can be accessed using macros. Outside the library, access must be made via exported functions. BN_NIB is likely to be a UINTVAL, but this is not essential.
Special values are represented by setting digits to zero and setting appropriate private flags, using internal macros. Infinity has one flag field, NaN another flag field, and sNaN a third. In general the flags should not be examined directly, even within the module.
typedef struct { INTVAL precision; /* number of digs to retain */ BN_ROUNDING rounding; /* rounding type to perform */ BOOLVAL extended; /* do we use extended or base semantics? */ unsigned char flags; /* records possible errors */ unsigned char traps; /* throw errors or not? */ } parrot_bignum_context;
BN_ROUNDING is an enumeration of the possible rounding types as described earlier. traps is a bitmask of exception traps. 0 implies that a trap is disabled and 1 implies it is enabled. flags is a bitmask which records exceptional conditions and has the same fields at flags.
Language level types should implement big floats using a global floating point context available in an interpreter structure (and accessible). Big integers and fixed-fraction number are provided by creating a context with an appropriate precision whenever a call into the library is made.
When the module raises an exceptional condition, control passes to BN_nonfatal()
. this examines the error which has occurred and the current context to determine which class of error has occurred. If the corresponding trap handler is not enabled, the context's flags are updated and control is returned to the bignumber library. Otherwise the exception becomes fatal. How this mechanism interacts with Parrot's own is yet to be decided.
The possible exceptions are detailed in the extended decimal arithmetic.
The Standard Decimal Arithmetic provides a collection of tests for both its base and extended behavior.
Fill in the remaining functions from the EDA, verify that the test suite still passes, integrate the library into the rest of Parrot, provide PMC types and suitable opcodes. Conversion to and from Parrot strings, conversion to and from floating point types, sprintf output of bignumbers.
IBM's Standard Decimal Arithmetic, with tests (http://speleotrove.com/decimal/)
The Perl modules Math::BigInt and Math::BigFloat.
Alex Gough's suggestions for bigint/bignum implementation.
GNU gmp. That's we currently use: mpz and mpf.