The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
# Copyright (C) 2001-2010, Parrot Foundation.

=head1 [DRAFT] PDD 14: Numbers

=head2 Abstract

This PDD describes Parrot's numeric data types.

=head2 Description

This PDD details the basic numeric datatypes that the Parrot core knows how to
deal with, including the core numeric PMCs.

=head3 Integer data types

Parrot provides a native integer data type, generally known as an "Int". The
size of the integer is chosen at Parrot configuration time, the same size as
platform-native integers. In C, the typedefs C<INTVAL> and C<UINTVAL> are
native signed and unsigned integers respectively. The semantics of native
integer data types are the same as the semantics of their C equivalents.

Integer data types have a dedicated register set. In PIR, the C<I> register
variables (C<$I0>, etc.) and C<.param>s or C<.local>s declared with the C<int>
type are native integers. Native unsigned integers are not accessible directly
in PIR. Many opcodes or vtable functions are defined with variants that take
native integer arguments. When passed to a subroutine or method call, a native
integer may be autoboxed as an C<Integer> PMC, or as an HLL type mapped to
C<Integer>.

=head3 Floating-point data types

Parrot provides a native floating-point data type, generally known as a "Num".
The size of the float is chosen at Parrot configuration time, the same size as
platform-native floats. In C, the typedef C<FLOATVAL> is a native float data
type. The semantics of the native float data type are the same as the
semantics of the C equivalent.

Float data types have a dedicated register set. In PIR, the C<N> register
variables (C<$N0>, etc.) and C<.param>s or C<.local>s declared with the C<num>
type are native floats. Many opcodes or vtable functions are defined with
variants that take native float arguments. When passed to a subroutine or
method call, a native float may be autoboxed as a C<Float> PMC, or as an HLL
type mapped to C<Float>.

=head3 Integer PMC

The C<Integer> PMC is a high-level integer type, providing the features of a
integer data type appropriate for use in a high-level language. Some languages
may be able to use Parrot's C<Integer> directly as their integer data type.
Others may subclass C<Integer> to add their own functionality, and others may
implement their own high-level integer data type.

The C<Integer> PMC has a single attribute, the integer value.

=head4 Integer Vtable Functions

=over 4

=item C<init()>

Initializes the C<Integer> to 0.

=item C<set_pmc(PMC *value)> and C<set_integer_same(PMC *value)>

Sets the C<Integer> to the integer value of the PMC argument.

=item C<set_integer_native(INTVAL value)>

Set the C<Integer> to the passed-in integer value.

=item C<set_number_native(FLOATVAL value)>, C<set_bool(INTVAL value)>,
      C<set_bigint_int(INTVAL value)>, C<set_string_native(STRING *value)>

Morphs the C<Integer> PMC to a C<Float>, C<Boolean>, C<BigInt>, or C<String>
PMC, and sets the value from the passed in value.

{{NOTE: the morphing behavior is currently under consideration and may be
rejected.}}

=item C<get_integer()>

Retrieves the integer value of the C<Integer>.

=item C<get_bool()>

Returns the boolean value of the C<Integer> (false if 0, true otherwise).

=item C<get_number()>

Returns the integer value of the C<Integer> as a floating-point number.

=item C<get_bigint()>

Returns the integer value of the C<Integer> in a new C<BigInt> PMC.

{{ NOTE: this vtable entry may be deprecated }}

=item C<get_string()> and C<get_repr()>

Returns the integer value of the C<Integer> as a string.

=item C<[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL b,
      PMC *dest)>

Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with
the C<Integer> PMC, and returns the result as a new PMC.  (The C<dest>
parameter is unused). Overflow of the native integer storage auto-promotes the
result PMC to a C<BigInt>.  Note that these are multidispatched.

=item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL
        b)>

Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with
the C<Integer> PMC, and sets the C<Integer> to the resulting value. Overflow
of the native integer storage auto-promotes the C<Integer> to a C<BigInt>.
Note that these are multidispatched.

{{NOTE: there is some discussion of having this promotion of storage happen
purely internally (perhaps by swapping vtables), rather than converting to a
different PMC type.}}

=item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_float(INTVAL
       b)>

Add/subtract/multiply/divide/modulus/exponent an integer value with the
C<Integer> PMC, and set the C<Integer> to the resulting value, morphing it to
a C<Float>.  Note that these are multidispatched.

=item C<increment()>

Adds 1 to the value of the integer.  This may autopromote the PMC to a
C<BigInt>.

=item C<decrement()>

Subtracts 1 from the value of the integer.  This may autopromote the PMC to a
C<BigInt>.

=item C<absolute()>

Returns an C<Integer> PMC set to the absolute value of the current C<Integer>.

=item C<i_absolute()>

Sets the C<Integer> to the absolute value of itself.

=item C<freeze()>

Freezes the C<Integer> PMC for storage.

=item C<thaw()>

Thaws the C<Integer> PMC from storage.


=back

=head4 Integer Multis

Many of the math vtable functions are defined as multiple dispatch functions.

=over 4

=item C<[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value,
      PMC *dest)>

Performs the addition/subtraction/multiplication/division/modulus/exponent
operation, and returns a new PMC containing the resulting value. Multiple
dispatch variants are defined for C<Integer>, C<Complex>, C<BigInt>,
C<String>, and C<DEFAULT>.

Overflow of the native integer storage auto-promotes the result PMC to a
C<BigInt>.

=item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value)>

Performs the addition/subtraction/multiplication/division/modulus/exponent
operation, morphing the C<Integer> to the passed in type, and setting it to
the result. Multiple dispatch variants are defined for C<Integer>, C<Complex>,
C<BigInt>, and C<DEFAULT>.

Overflow of the native integer storage auto-promotes the C<Integer> to a
C<BigInt>.

=item C<is_equal(PMC *value)>

Compares the C<Integer> to the passed in PMC, returning true (1) if they are
equal, and false (0) otherwise. Multiple dispatch variants are defined for
C<BigInt> and C<DEFAULT>. {{NOTE: Presumably the C<String>, C<Integer>, and
C<Float> cases are all covered by C<DEFAULT>.}}

=item C<cmp(PMC *value)>

Compares the C<Integer> to the passed in PMC, returning 1 if C<Integer> is
greater, -1 if the PMC is greater, and 0 if they are equal. Multiple dispatch
variants are defined for C<String>, C<Float>, and C<DEFAULT>. {{NOTE:
Presumably the C<Integer> and C<BigInt> cases are covered by C<DEFAULT>.}}

=item C<cmp_num(PMC *value)>

Compares the C<Integer> to the passed in PMC, returning 1 if C<Integer> is
greater, -1 if the PMC is greater, and 0 if they are equal. Multiple dispatch
variants are defined for C<String>, C<Float>, and C<DEFAULT>. {{NOTE:
Presumably the C<Integer> and C<BigInt> cases are covered by C<DEFAULT>.}}

=back

=head4 Integer Methods

=over 4

=item C<get_as_base(INTVAL base)>

Converts the decimal integer to another base (anything from base 2 to base
36), returning the result as a STRING.

=back

=head3 Float PMC

=head3 BigInt PMC

The bigint library provides Parrot with both a collection of (nearly)
infinite precision numeric types and an implementation of an extended decimal
arithmetic (EDA).

=head3 Why decimal arithmetic?

There are benefits in using the big number library to provide both values of
effectively unlimited precision and a defined arithmetic, complete with
rounding and exceptional conditions, for values which are otherwise easily
represented using standard low-level types.  Both require the same range of
operations but differ in the environment under which those operations occur.
The effort required to produce a library which implements a decimal arithmetic
is not much greater than that needed to provide a base-2 big number library.
There is a trade-off in both space and speed, but given the nature of dynamic
languages, this should not present too great a burden.

=head3 Numeric types provided

The bignumber library provides the following data types to Parrot:

=over 4

=item Big integers (BigInt)

Whole numbers with no limits on their size.

=item Big floats (BigNum)

Numbers with decimal fractional parts, again with no limit on size.

=item Big floats with fixed fractional parts

Numbers with a fixed maximum number of digits in their fractional part, again
with no limit on size;. i.e BigRat.

=back

The library implements these different forms of numbers using the same
internal representation, and differentiates between them only when performing
rounding operations.  A number has the following abstract form:

 [ sign, string of digits, exponent ]

If sign is zero, the number is positive. If equal to one, the number is
negative.  The number has the value:

 sign, string of digits * 10 ** exponent

A big integer must always have a non-negative exponent. A big float may have
any exponent, and a float with a fixed fractional part will have an exponent
greater than a given (negative) number.  These limits are not attached to a
numeric value, but instead are enforced by giving any operation involving the
numbers a I<context>.

In general, Parrot functions will not need to care about what the bignum
objects are or do. They should merely be used as arguments to big number
functions. The objects will be managed by Parrot's garbage collection in a
similar manner to strings.

=head3 Special Values

Additionally the library provides special values which represent the result of
otherwise undefined operations (division by zero, for instance).  Positive and
negative infinity (C<Inf> or C<+Inf> and C<-Inf>, respectively) and both quiet
and signalling Not a Number (C<NaN>) are available.  In general, the result of
an operation with at least one argument which is C<NaN> will be C<NaN>. If the
argument is a signalling C<NaN>, an exception will also be raised.  See the
EDA for full details.

=head3 Context

All operations occur within a defined context.  This tells the operations how
they should treat their arguments, what sort of rounding to perform, and what
to do if rounding loses information.

The context provides the environment in which an operation occurs, in
particular the following options are available:

=over 4

=item precision

A positive I<precision> requires the use of big floats. These cannot have more
than I<precision> digits in their coefficient before or after any operation.
Arguments to operations with more than I<precision> digits will be truncated
and rounded appropriately.  Results of operations will not have more than
I<precision> digits in their coefficients, with any extra digits accumulated
during the calculation of the operation being truncated and rounded as
required.

A I<precision> of zero requires the use of integer operations.  Arguments to
operations are rounded so that they have no fractional part, and the result of
all operations will be rounded to be integers.

A negative value of I<precision> requires the use of a fixed number of
fractional digits, with arguments and results being truncated after those
digits.

With non-positive values of I<precision>, the total number of digits in the
coefficient is limited only by available memory.

=item rounding

The rounding part of the context defines the rounding algorithm to apply when
truncating digits from a number's coefficient. The available rounding forms
are outlined below.

=item traps and flags

The I<traps> part of the context defines how the library raises exceptions.
Seven distinct classes of error can occur. If the corresponding trap is set
(enabled), the library raises an exception.  Otherwise, execution continues
with the exception class recorded in flags.  For more details, see the
extended decimal arithmetic standard.

=back

The current I<context> determines the numeric type during a particular
operation. This makes it easy to upgrade from one numeric form to another and
also allows for considerable code-reuse within the library.

=head3 Exception Classes

The following exception classes are available:

=over 4

=item Lost Digits

Non-zero digits have been removed from an argument to a function during
rounding before the operation.

=item Division By Zero

Division by zero was attempted.

=item Inexact

Because arguments were rounded, or because the result of an operation has lost
significant digits, the result is inexact.

=item Invalid Operation

An invalid operation was attempted, for instance when C<NaN> is present as an
argument to a function.  This also covers recoverable errors such as 0/0,
which signals Invalid Operation and can return C<NaN>.

=item Overflow

The exponent of a number has overflowed.

=item Rounded

An argument has been rounded.

=item Underflow

The exponent of a number has underflowed.

=back

=head3 Rounding

The rounding part of the context defines the rounding algorithm to used.  The
following contexts are available (examples assume a precision of 5):

=over 4

=item Round down

Any unwanted digits are simply truncated from the coefficient.  This rounds
towards zero.

 [0, 1234567, 10] => [0, 12345, 12]

=item Round half up

The first lost digit is examined. If this is in the range 0-4, the coefficient
is truncated directly. If in the range 5-9, one is added to the final digit of
the coefficient.  If this leads to a coefficient with more than I<precision>
digits, the number is rounded again, removing the trailing zero.  This is
essentially rounding to nearest.

 [0, 1234567, 10] => [0, 12346, 12]
 [0, 1234549, 10] => [0, 12345, 12]
 [0, 9999950, 10] => [0, 10000, 13]

=item Round half even

The first lost digit is examined. If it lies in the range 0-4, the coefficient
is truncated directly. If in the range 6-9, the coefficient is rounded up.  If
the first lost digit is equal to 5 and the remaining lost digits in the
coefficient are non-zero, the number is also rounded up.  If the lost digits
are equal to exactly half, the number is rounded up if the least significant
retained digit is odd, and rounded down if it is even.

=item Round Floor

If the digits to be discarded are non zero and the number is negative, the
coefficient is rounded up, otherwise it remains the same.

This is rounding towards C<-Inf>.

=item Round Ceiling

If the digits to be discarded are non zero, and the number is positive, the
coefficient is rounded up, otherwise it remains the same.

This is rounding towards C<Inf>.

=back

=head3 Operations

The library provides the following operations. They function exactly as those
described in the Standard Decimal Arithmetic (SDA), with some extension to
cope with integer and fixed fractional part numbers.  Only the deviations are
outlined here.

In all cases, the sequence of rounding and promotion to zero outlined by the
SDA are followed, even where the context implies integer operations.

=over 4

=item Addition, Subtraction

=item Multiplication

=item Division

Under integer conditions, division halts once the first fractional digit is
calculated, with the result rounded to an integer and returned.  Under
fixed-fraction conditions, one more digit than needed is calculated, with the
coefficient then rounded and returned.

If a floating point value is required, or if inexact division by a very small
number is attempted, it may be wise to follow big float arithmetic to limit
the number of digits returned.  It is safe to chose a precision at least as
large as the largest number of digits of either argument to the division
function.

=item Integer division, Remainder

For both integer and fixed-fraction numbers, the result returned by the
remainder function will be an integer or fixed-fraction number. The result of
integer division will be an integer.

=item Rounding

=item Plus / Minus

=item Comparison

Comparison returns a big number which is equal to 1, 0, or -1 if the first
argument is larger, equal to, or smaller than the second.  An alternate form
returns an INTVAL.

=item Rescale

=item Power

=item Square Root

=back

=head3 Conversion to and from strings

A one to one conversion between the abstract representation above and a string
is provided by the library, and acts as defined by the standard decimal
arithmetic.  Other conversation operations may also be implemented; these may
not provide one to one mapping.

A pedantic error checking conversion is available within the library, but only
works with native strings.  Versions which work with Parrot STRINGs will also
be provided, although in a separate file to the rest of the library.  (They
will share a common private header file).

=head2 Implementation

Functions are provided which implement the arithmetic, conversion, creation
and destruction of big numbers by dealing with otherwise opaque big number
objects.

=head3 Big number representation

A big number is represented by the following structure, capable of being
allocated, tracked, and destroyed by the Parrot garbage collection system.

 typedef struct {
    BN_NIB *buffer; /* string of nibbles */
    UINTVAL nibs;   /* nibs allocated, in sizeof(BN_NIB) */
    UINTVAL flags;  /* private flags store: 001 Inf,  010 qNAN, 110 sNAN */
    INTVAL  digits; /* digits used */
    INTVAL  expn;   /* exponent of number */
    int     sign;   /* sign of number, 0 => positive or zero, 1 => negative */
 } parrot_bignum_t;

Within the library, individual decimal digits can be accessed using macros.
Outside the library, access must be made via exported functions.  BN_NIB is
likely to be a UINTVAL, but this is not essential.

Special values are represented by setting I<digits> to zero and setting
appropriate private I<flags>, using internal macros.  Infinity has one flag
field, NaN another flag field, and sNaN a third.  In general the flags should
not be examined directly, even within the module.

=head3 Context

 typedef struct {
    INTVAL        precision;  /* number of digs to retain */
    BN_ROUNDING   rounding;   /* rounding type to perform */
    BOOLVAL       extended;   /* do we use extended or base semantics? */
    unsigned char flags;      /* records possible errors */
    unsigned char traps;      /* throw errors or not? */
 } parrot_bignum_context;

I<BN_ROUNDING> is an enumeration of the possible rounding types as described
earlier.  I<traps> is a bitmask of exception traps. 0 implies that a trap is
disabled and 1 implies it is enabled.  I<flags> is a bitmask which records
exceptional conditions and has the same fields at I<flags>.

Language level types should implement big floats using a global floating point
context available in an interpreter structure (and accessible).  Big integers
and fixed-fraction number are provided by creating a context with an
appropriate precision whenever a call into the library is made.

=head3 Exceptional Conditions

When the module raises an exceptional condition, control passes to
C<BN_nonfatal()>. this examines the error which has occurred and the current
context to determine which class of error has occurred. If the corresponding
trap handler is not enabled, the context's flags are updated and control is
returned to the bignumber library. Otherwise the exception becomes fatal.  How
this mechanism interacts with Parrot's own is yet to be decided.

The possible exceptions are detailed in the extended decimal arithmetic.

=head2 Tests

The Standard Decimal Arithmetic provides a collection of tests for both its
base and extended behavior.

=head2 TODO

Fill in the remaining functions from the EDA, verify that the test suite still
passes, integrate the library into the rest of Parrot, provide PMC types and
suitable opcodes.  Conversion to and from Parrot strings, conversion to and
from floating point types, sprintf output of bignumbers.

=head2 References

IBM's Standard Decimal Arithmetic, with tests
(L<http://speleotrove.com/decimal/>)

The Perl modules Math::BigInt and Math::BigFloat.

Alex Gough's suggestions for bigint/bignum implementation.

GNU gmp. That's we currently use: mpz and mpf.

=cut

__END__
Local Variables:
  fill-column:78
End:
vim: expandtab tw=78 shiftwidth=4: