The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
= Laziness

A draft on the runtime view of how Lazy things (like Arrays) work.

# by fglock

The text currently has 2 main sections:
- a review of the Perl 6 Bible, with some interpretation and guessing
- implementation notes.

Each section has 3 subsections:
- Native data - about the internal representation of data types
- Values - about the basic Perl 6 "OO" things
- Containers - about the glue that hold things together in data structures

== TODO
- include drawings
- comparing with Ruby/Python should help explaining some concepts
- this text is supposed to talk about all these things:
  Scalar / Array / Hash Container
  Tuple / List / Generator / Iterator / Range
  file iterator
  Signature / Argument List
  Array / Hash slice
  enum
  tie / proxy / rw / env
  Constrained Types (eg: Int / Even / 0..100 where Even)
  Piddle
  lazy string / int / num


= Part I - Perl 6 Bible Readings

References: 
- S02 Bits and Pieces - for "native data"
- S09 Data Structures - for Object types
- A bit of each of the other AES

= Native data

Native data is built with a thin, (possibly non-OO) library over the virtual 
machine data types.

For example, you may need a wrapper to implement 1-bit values, if the 
virtual machine only provides bytes.

- native "singular" value

Native values are: num, int, uint, bit, complex, ref.
Native values can have a bit-size specification: int1, int16.

A "native container" is whatever a "ref" value points to.

A "num" value must support IEEE floating point conventions - which means there
are native "Inf" and "NaN".
Native values don't need to be able to represent "undef".

- native "plural" value

Native plural values are: array, str, hash.

A "str" is an array of "int".

There may be separate implementations for fixed-size and for 
auto-extending allocation.

These are fast, in-memory structures provided by the virtual machine.
These structures are assumed to be one-dimensional.

Native hash is mostly used internally, for holding "namespaces": Classes, variables, labels.

There may be separate implementations for string-indexed and for object-indexed hashes.

= Value Class

Values can be "singular" or "plural".

Singular Values are: Str, Int, Num, Pair, Ref, Junction, 
Code, Block, Class, Grammar, Macro ...

Plural Values are things that look like single-dimensional arrays: 
List, Generator, Iterator.

-- List, Lazy List, Tuple classes

List is implemented as a native Array, or as a contiguous Stack area if this
is supported by the virtual machine.

List is set of things.
List is a subclass of "Value".
List is one-dimensional.

A List may be built with a mix of Values and Containers.

A List might exist in 4 "states":

  @a = ( 7, 20..30 );
    - this is the Array that will be used in the examples
  @a = ( 
         ( (Cell = 7), ), 
         ( Generator 20..30) 
       );
    - this is more-or-less what it looks like internally

- unflattened List

  ( 1, 5..10, @a ) 
    - a List built from a Scalar, a Range, and an Array

- lazily flattened with "*" operator

  ( 1, 5..10, (Scalar := @a[0]), (Generator (Scalar := @a[1])..(Scalar := @a[10])) ) 
    - the Array was "flattened", but we still have generators.

- non-lazily flattened (eager) with "**" operator

  ( 1, 5, 6, 7, 8, 9, 10, (Scalar := @a[0]), (Scalar := @a[1]), (Scalar := @a[2]),  ..... )
    - everything is flattened

- fully evaluated (Tuple)

  ( 1, 5, 6, 7, 8, 9, 10, @a[0].value, @a[1].value, @a[2].value,  ..... )
    - fully evaluated and immutable

List only implements a subset of the Array API.
A List slice is a new List - elements are shallow copied.
List don't support adding, removing or substituting elements.

List has no Cells - thus you can't bind a Container to a List element or slice. ???

  $a := (1,2,3)[2];  # compile error - or is this creating a Constant ???

If a List is built with Generators, it is a Lazy List.

-- Generator / Range Classes

Generator is an object that creates Values on demand.
Generator is "functional" rather than "physical memory"

Range is a special case, in which there is a start, an end, and a step.

#At runtime, you only access Generator elements through Containers.

Generator also implements "match":

  5 ~~ 1..10

The "*" operator flattens a Generator eagerly.

TODO - detail the behaviour of "Str Range"

-- Iterator Class

TODO

  =<>

The "*" operator flattens an Iterator eagerly.

-- List Slice

A List Slice is a shallow copy of a part of a List.

A List Slice is just a List - there is not a separate Class for it.

= Container Class

Containers are the glue that hold Values together in data structures.

The container classes are: Scalar, Array, Hash.

Array, like List, is Lazy by default.
Hash and Scalar can be lazy too, but they are not Lazy by default.

== Scalar Container Class

Scalar is a "singular Value" Container.

One of the reasons to use a container is that you can easily change
it's semantics by replacing what's inside it - that is, a Container implements
a "variable":

   value '1' - always means 'one'
   
   container ('1') - it means 'one' too, but you can increment it to mean 'two'

== Array Container Class

The Array class defines all objects that implement the Array API.

Array is a subclass of "Container"
Array contains an ordered set of Cells.
Array is multi-dimensional.

Array is not meant to be subclassed - use "roles" to make different implementations.

The actual implementation depends on the "Eager" or the "Lazy" roles.
Other implementations can be specified.

--- Eager Role

An Array object that wraps a native array.

--- Lazy Role

An Array object that generates Cells on demand.
This is the default implementation.

--- Shape (Role or parameter? trait?)

--- Compact Array Role (piddle)

TODO - piddle is different from Compact

-- Container Slice

A Container Slice is a specialized Array/Hash, in which all the elements 
are bound to objects inside other Containers (Array, Hash or Scalar).

A Lazy Container Slice contains pointers to Generators that are inside other 
Containers.

Container Slice is multi-dimensional.

== Hash

The Hash class defines all objects that implement the hash API.

Hash is like an Array, in which the Cell indexes can be of any type 
(not just numbers).

Hash is multi-dimensional.


= Part II - Internals

== Generator Internals

- TODO


== Generator API

- match

This operator is used in a match statement:

    5 ~~ 1..10

- shift / pop

TODO - not sure if "Generator" is an iterator (read-only) or uses destructive 
read (shift/pop)

- sum / reverse / map / zip

- flatten

Takes all Values from the Generator, and returns an Array (possibly Native).

- elems

The number of elements

=== This part of the Generator API is only used by the runtime internals.

- head(n)

This operator is used by the Array splice operator.

Takes "n" Values from the Generator, like "shift" does. 
The result can be a Generator (when n is big), an Array (possibly Native), 
a single Value (when n == 1), or Void (when n <= 0).

TODO - not sure if "Generator" is an iterator (read-only) or uses destructive 
read (head/tail)

- tail(n)

This operator is used by the Array splice operator.

Takes "n" Values from the back of the Generator, like "pop" does. 
The result can be a Generator (when n is big), an Array (possibly Native), 
a single Value (when n == 1), or Void (when n <= 0).

TODO - check if "Generator" is an iterator (read-only) or uses destructive 
read (head/tail)

- is_contiguous

This is used for the optimization of Slice, when the sequence is used
as a slice index.

is_contiguous() is True is the generated sequence is like "2 .. 10".


= Array internals

== Array (base Class)

TODO

== Eager (Role)

An Eager Array contains a Native Array.

TODO - explain "Cell"

   Contents:  1 2 3
   Internals: [ 1, 2, 3 ] 

== Lazy (Role)

A Lazy Array contains a "specs" Array, which contains Generators, or Native Arrays.

   Contents:  1 2 3
   Internals: [ 
                [ 1, 2, 3 ] 
              ]

   Contents:  10 20 30 100..10000 50 400..900
   Internals: [
                [ 10, 20, 30 ],
                Generator( 100, 10000 ),
                [ 50 ],
                Generator( 400, 900 )
              ]


= Array API

- TODO
Array can do "zip", "map", "grep", splice, push, pop, store, fetch. 
Array can be "bound" and "tied".


== map() 

map() applies a 'Code' to an Array state:

  1:   @a = 1..100;            # 1, 2, 3, ...
  2:   @b = @a.map:{ $_ * 2 }; # 2, 4, 6, ...
  3:  
  4:   @a[2] = 0;              # 1, 0, 3, ...  - original has changed
  5:   say @b[1..3];           # 2, 4, 6, ...  - result didn't change

Line 1:

   Container: @a
   Internals: [
                Generator( 1, 100 ),
              ]

Line 2:

   Container: @_TEMP_
   Internals: [
                Generator( 1, 100 ),
              ]

   Container: @a
   Internals: [
                RefArray( @_TEMP_ ),
              ]

   Container: @b
   Internals: [
                MapArray( @_TEMP_, { $_ * 2 } ),
              ]

Line 4:

   Container: @_TEMP_
   Internals: [
                [ 1, 2 ]
                Generator( 3, 100 ),
              ]

   Container: @a
   Internals: [
                [ 1, 0 ],
                SpliceArray( @_TEMP_, 2 ),
              ]

   Container: @b
   Internals: [
                MapArray( @_TEMP_, { $_ * 2 } ),
              ]

== splice() 

This pseudo-code implements splice().
It assumes that the internal representation of Array can do .head(n) and .tail(n) - 
see "Generator" for details.

    sub splice ( @array, $offset? = 0, $length? = Inf, @list? = () ) { 
        my ( @head, @body, @tail );
        if ( $offset >= 0 ) {
            @head = @array.head( $offset );
            if ( $length >= 0 ) {
                @body = @array.head( $length );
                @tail = @array;
            }
            else {
                @tail = @array.tail( -$length );
                @body = @array;
            }
        }
        else {
            @tail = @array.tail( -$offset );
            @head = @array;
            if ( $length >= 0 ) {
                @body = $tail.head( $length );
            }
            else {
                @body = @tail;
                @tail = $body.tail( -$length );
            }
        };
        @array = ( @head, @list, @tail );
        return @body;
    }

= Hash internals

A Hash contains a Native Hash.

Hash keys can be objects, but the objects must be immutable

   Contents:  a:1 b:2 c:3
   Internals: { a:1 b:2 c:3 } 

A Hash may contain Lazy items. 
This structure contains a Native Array, which contains Generators or Native Hashes.
This structure permits near O(1) complexity for most operations.

   Contents:  a:1 b:2 c:3
   Internals: [ 
                { a:1, b:2, c:3 }
              ]

   Contents:  a:10 b:20 c:30 (1:100 .. 10000:10100) d:50 
   Internals: [
                { a:10 b:20 c:30 },
                Generator( 1:100, 10000:10100 ),
                { d:50 },
              ]


= Hash API

- TODO


= Slice API

- TODO


= Slice Internals

- TODO - hash slice / array slice


__END__