The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

=head1 NAME

PDL::QuickStart - Quick introduction to PDL features.

=head1 SYNOPSIS

A brief summary of the main PDL features and how to use them.

=head1 DESCRIPTION

=head2 Introduction

Perl is an extremely good and versatile scripting language, well suited to
beginners and allows rapid prototyping. However until recently it did not
support data structures which allowed it to do fast number crunching.

However with the development of Perl v5, Perl acquired 'Objects'. To put
it simply users can define their own special data types, and write
custom routines to manipulate them either in low level languages (C and
Fortran) or in Perl itself.

This has been fully exploited by the PerlDL developers. The 'PDL' module is a
complete Object-Oriented extension to Perl (although you don't have to know
what an object is to use it) which allows large N-dimensional data sets, such
as large images, spectra, time series, etc to be stored  B<efficiently> and
manipulated B<en masse>.  For example  with the PDL module we can write the
Perl code C<$a = $b + $c>, where C<$b> and C<$c> are large datasets
(e.g. 2048x2048 images), and get the result in only a fraction of a second.

PDL variables (or 'piddles' as they have come to be known)
support a wide range of fundamental data types - arrays can be bytes,
short integers (signed or unsigned), long integers, floats or
double precision floats. And because of the Object-Oriented nature
of PDL new customised datatypes can be derived from them.

As well as the PDL modules, that can be used by normal Perl programs, PerlDL
comes with a command line Perl shell, called 'perldl', which supports command
line editing. In combination with the various PDL graphics modules this allows
data to be easily played with and visualised.

=head2 Help

PDL contains extensive documentation, available both within the
I<perldl> or I<pdl2> shells and from the command line, using the C<pdldoc> program.
For further information try either of:

 pdl> help help
 $ pdldoc

HTML copies of the documentation should also be available. 
To find their location, try the following:

 pdl> foreach ( map{"$_/PDL/HtmlDocs"}@INC ) { p "$_\n" if -d $_ }  

=head2 Perl Datatypes and how PDL extends them

The fundamental Perl data structures are scalar variables, e.g. C<$x>,
which can hold numbers or strings, lists or arrays of scalars, e.g. C<@x>,
and associative arrays/hashes of scalars, e.g. C<%x>.

Perl v5 introduces to Perl data structures and objects. A simple
scalar variable C<$x> now be a user-defined data type or full blown
object (it actually holds a reference (a smart "pointer") to this
but that is not relevant for ordinary use of perlDL)

The fundamental idea behind perlDL is to allow C<$x> to hold a whole 1D
spectrum, or a 2D image, a 3D data cube, and so on up to large
N-dimensional data sets. These can be manipulated all at once, e.g.
C<$a = $b + 2> does a vector operation on each value in the
spectrum/image/etc.

You may well ask: "Why not just store a spectrum as a simple Perl C<@x>
style list with each pixel being a list item?"  The two key answers to
this are I<memory> and I<speed>.  Because we know our spectrum consists of
pure numbers we can compactly store them in a single block of memory
corresponding to a C style numeric array. This takes up a LOT less
memory than the equivalent Perl list. It is then easy to pass this
block of memory to a fast addition routine, or to any other C function
which deals with arrays.  As a result perlDL is very fast --- for example
one can multiply a 2048*2048 image in exactly the same time as it
would take in C or FORTRAN (0.1 sec on my SPARC). A further advantage
of this is that for simple operations (e.g. C<$x += 2>) one can manipulate
the whole array without caring about its dimensionality.

I find when using perlDL it is most useful to think of standard Perl
C<@x> variables as "lists" of generic "things" and PDL variables like
C<$x> as "arrays" which can be contained in lists or hashes. Quite
often in my perlDL scripts I have C<@x> contain a list of spectra, or a
list of images (or even a mix!). Or perhaps one could have a hash
(e.g.  C<%x>) of images... the only limit is memory!

perlDL variables support a range of data types - arrays can be bytes,
short integers (signed or unsigned), long integers, floats or
double precision floats.

=head2 Usage

PerlDL is loaded into your Perl script using this command:

 use PDL;  # in Perl scripts: use the standard perlDL modules

There are also a lot of extension modules, e.g. 
L<PDL::Graphics::TriD|PDL::Graphics::TriD>. 
Most of these (but not all as sometimes it is not appropriate) follow
a standard convention. If you say:

 use PDL::Graphics::TriD;

You import everything in a standard list from the module. Sometimes
you might want to import nothing (e.g. if you want to use OO syntax
all the time and save the import tax). For these you say:

 use PDL::Graphics::TriD qw();

And the empty C<qw()>  quotes are recognised as meaning 'nothing'.
You can also specify a list of functions to import in the normal
Perl way.

There is also an interactive shell, C<perldl> or C<pdl2>, see I<perldl>
or L<pdl2|PDL::Perldl2> for details.

=head2 To create a new PDL variable

Here are some ways of creating a PDL variable:

 $a = pdl [1..10];             # 1D array
 $a = pdl (1,2,3,4);           # Ditto
 $a = pdl '[1 2 3 4]';         # Ditto
 $b = pdl [[1,2,3],[4,5,6]];   # 2D 3x2 array
 $b = pdl '[1 2 3; 4 5 6]';    # Ditto
 $b = pdl q[1,2,3; 4,5,6];     # Ditto
 $b = pdl <<NEWPDL             # Ditto
   [1 2 3]
   [4 5 6]
 NEWPDL
 $c = pdl q[1 -2];             # 2-element piddle containing 1 and -2
 $c = pdl q[1 - 2];            # 2-element piddle containing 1 and -2
 $b = pdl 42                   # 0-dimensional scalar
 $c = pdl $a;                  # Make a new copy
 
 $d = byte [1..10];            # See "Type conversion"
 $e = zeroes(3,2,4);           # 3x2x4 zero-filled array
 
 $c = rfits $file;             # Read FITS file
 
 @x = ( pdl(42), zeroes(3,2,4), rfits($file) ); # Is a LIST of PDL variables!

The L<pdl()|PDL::Core/pdl> function is used to initialise a PDL variable from a scalar,
list, list reference, another PDL variable, or a properly formatted string.

In addition all PDL functions automatically convert normal Perl scalars
to PDL variables on-the-fly.

(also see "Type Conversion" and "Input/Output" sections below)

=head2 Arithmetic (and boolean expressions)

 $a = $b + 2; $a++; $a = $b / $c; # Etc.

 $c=sqrt($a); $d = log10($b+100); # Etc

 $e = $a>42; # Vector conditional

 $e = 42*($a>42) + $a*($a<=42); # Cap top

 $b = $a->log10 unless any ($a <= 0); # avoid floating point error

 $a = $a / ( max($a) - min($a) );

 $f = where($a, $a > 10); # where returns a piddle of elements for
                          # which the condition is true

 print $a; # $a in string context prints it in a N-dimensional format

(and other Perl operators/functions)

When using piddles in conditional expressions (i.e. C<if>, C<unless> and
C<while> constructs) only piddles with exactly one element are allowed, e.g.

 $a = pdl (1,0,0,1);
 print "is set" if $a->index(2);

Note that the boolean operators return in general multi-element
piddles. Therefore, the following will raise an error

 print "is ok" if $a > 3;

since C<$a E<gt> 3> is a piddle with 4 elements. Rather use 
L<all|PDL::Ufunc/all> or L<any|PDL::Ufunc/any>
to test if all or any of the elements fulfill the condition:

 print "some are > 3" if any $a>3;
 print "can't take logarithm" unless all $a>0;

There are also many predefined functions, which are described on other
man pages. Check L<PDL::Index>.

=head2 Matrix functions

C<'x'> is hijacked as the matrix multiplication operator. e.g.
C<$c = $a x $b>;

perlDL is row-major not column major so this is actually
C<c(i,j) = sum_k a(k,j) b(i,k)> - but when matrices are printed the
results will look right. Just remember the indices are reversed.
e.g.:

 $a = [                   $b = [
       [ 1  2  3  0]            [1 1]
       [ 1 -1  2  7]            [0 2]
       [ 1  0  0  1]            [0 2]
      ]                         [1 1]
                               ]

 gives $c = [
             [ 1 11]
             [ 8 10]
             [ 2  2]
            ]

Note: L<transpose()|PDL::Basic/transpose> 
does what it says and is a convenient way
to turn row vectors into column vectors.

=head2 How to write a simple function

 sub dotproduct {
     my ($a,$b) = @_;
     return sum($a*$b) ;
 }
 1;

If put in file dotproduct.pdl would be autoloaded if you
are using L<PDL::AutoLoader|PDL::AutoLoader> (see below).

Of course, this function is already available as the 
L<inner|PDL::Primitive/inner>
function, see L<PDL::Primitive>.

=head2 Type Conversion

Default for pdl() is double. Conversions are:

 $a = float($b);
 $c = long($d);   # "long" is generally a 4 byte int
 $d = byte($a);

Also double(), short(), ushort(), indx().  

  NOTE: The indx() routine is a special integer type that
  is the correct size for a PDL index value (dimension size,
  index, or offest) which can be either a 32bit (long) or
  64bit (longlong) quantity depending on whether the perl
  is built with 32bit or 64bit support.

These routines also automatically convert Perl lists to
allow the convenient shorthand:

 $a = byte [[1..10],[1..10]];  # Create 2D byte array
 $a = float [1..1000];         # Create 1D float array

etc.

=head2 Printing

Automatically expands array in N-dimensional format:

 print $a;

 $b = "Answer is = $a ";

=head2 Sections

PDL has very powerful multidimensional slicing and sectioning
operators; see L<the PDL::Slices(3) man page|PDL::Slices> for details;
we'll describe the most important one here.

PDL shows its Perl/C heritage in that arrays are zero-offset.  Thus a
100x100 image has indices C<0..99,0..99>.  (The convention is that the
I<center> of pixel (0,0) is at coordinate (0.0,0.0). All PDL graphics
functions conform to this definition and hide away the unit offsets
of, for example, the PGPLOT FORTRAN library.

Following the usual convention coordinate (0,0) is displayed
at the bottom left when displaying an image. It appears at the
top left when using "C<print $a>" etc.

Simple sectioning uses a syntax extension to Perl,
L<PDL::NiceSlice|PDL::NiceSlice>, that allows you to specify subranges
via a null-method modifier to a PDL:

  $b = $a->($x1:$x2,$y1:$y2,($z1)); # Take subsection

Here, C<$a> is a 3-dimensional variable, and C<$b> gets a planar
cutout that is defined by the limits $x1, $x2, $y1, $y2, at the location
$z1.  The parenthesis around C<$z1> cause the trivial index to be omitted --
otherwise C<$b> would be three-dimensional with a third dimension of order 1.

You can put PDL slices on either side of the element-wise assignment 
operator C<.=>, like so:

  # Set part of $bigimage to values from $smallimage
  $bigimage->($xa:$xb,$ya:$yb) .= $smallimage;

Some other miscellany:

 $c  = nelem($a); # Number of pixels

 $val = at($object, $x,$y,$z...)    # Pixel value at position, as a Perl scalar
 $val = $object->at($x,$y,$z...)    # equivalent (method syntax OK)

 $b = xvals($a); # Fill array with X-coord values (also yvals(), zvals(),
                 # axisvals($x,$axis) and rvals() for radial distance
                 # from centre).

=head2 Input/Output

The C<PDL::IO> modules implement several useful IO format functions.
It would be too much to give examples of each, but you can find a nice
overview at L<PDL::IO|PDL::IO>. Here is a sample of some of the
supported IO formats in PDL.

=over 8

=item PDL::IO::Misc

Ascii, FITS and FIGARO/NDF IO routines.

=item PDL::IO::FastRaw

Using the raw data types of your machine, an unportable but blindingly
fast IO format. Also supports memory mapping to conserve memory as
well as get more speed.

=item PDL::IO::FlexRaw

General raw data formats. Like FastRaw, only better.

=item PDL::IO::Browser

A Curses browser for arrays.

=item PDL::IO::Pnm

Portaple bitmap and pixmap support.

=item PDL::IO::Pic

Using the previous module and netpbm, makes it possible to easily
write GIF, jpeg and whatever with simple commands.

=back

=head2 Graphics

The philosophy behind perlDL is to make it work with a variety of
existing graphics libraries since no single package will satisfy all
needs and all people and this allows one to work with packages one
already knows and likes.  Obviously there will be some overlaps in
functionality and some lack of consistency and uniformity. However
this allows PDL to keep up with a rapidly developing field - the
latest PDL modules provide interfaces to OpenGL and VRML graphics!

=over 4

=item PDL::Graphics::PGPLOT

PGPLOT provides a simple library for line graphics and image display.

There is an easy interface to this in the internal module
L<PDL::Graphics::PGPLOT|PDL::Graphics::PGPLOT>, which
calls routines in the separately available
PGPLOT top-level module.

=item PDL::Graphics::PLplot

PLplot provides a simple library for creating graphics with multiple
output drivers, including a direct-to-piddle driver.

This module provides both high-level and low-level functionality built
on PLplot. The low-level commands are pretty much direct bindings to
PLplot's C interface. Read more at L<PDL::Graphics::PLplot|PDL::Graphics::PLplot>.

=item PDL::Graphics::IIS

Many astronomers like to use SAOimage and Ximtool (or there
derivations/clones). These are useful free widgets for inspection and
visualisation of images. (They are not provided with perlDL but can
easily be obtained from their official sites off the Net.)

The L<PDL::Graphics::IIS|PDL::Graphics::IIS> 
package provides allows one to display images
in these ("IIS" is the name of an ancient item of image display
hardware whose protocols these tools conform to.)

=item PDL::Graphics::TriD

See L<PDL::Graphics::TriD|PDL::Graphics::TriD>, this is a collection
of 3D routines for OpenGL and (soon) VRML and other 3D formats which
allow 3D point, line, and surface plots from PDL.

=back

=head2 Autoloading

See L<PDL::AutoLoader>. This allows one to autoload functions
on demand, in a way perhaps familiar to users of MatLab.

One can also write PDL extensions as normal Perl modules.

=head2 PDL shells

The Perl script C<pdl2> (or C<perldl>) provides a simple command line interface
to PDL.  If the latest Readlines/ReadKey modules have been installed C<pdl2>
detects this and enables command line recall and editing.
See the man page for details.

e.g.:

 % perldl
 perlDL shell v1.354
  PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
  'COPYING' in the PDL distribution. This is free software and you
  are welcome to redistribute it under certain conditions, see
  the same file for details.
 ReadLines, NiceSlice, MultiLines  enabled
 Reading PDL/default.perldlrc...
 Found docs database /home/pdl/dev/lib/perl5/site_perl/PDL/pdldoc.db
 Type 'help' for online help
 Type 'demo' for online demos
 Loaded PDL v2.4.9_003 (supports bad values)
 pdl> $x = rfits 'm51.fits'
 Reading IMAGE data...
 BITPIX =  32  size = 147456 pixels 
 Reading  589824  bytes
 BSCALE =  &&  BZERO = 

 pdl> use PDL::Graphics::PGPLOT;
 pdl> imag $x
 Displaying 384 x 384 image from 40 to 761, using 84 colors (16-99)...

You can also run it from the Perl debugger (C<perl -MPDL -d -e 1>)
if you want.

Miscellaneous shell features:

=over 4

=item p

The shell aliases C<p> to be a convenient short form of C<print>, e.g.

   pdl> p ones 5,3
   [
    [1 1 1 1 1]
    [1 1 1 1 1]
    [1 1 1 1 1]
   ]

=item Initialization

The files C<~/.perldlrc> and C<local.perldlrc> (in the current
directory) are sourced if found. This allows the user to have global
and local PDL code for startup.

=item Help

Type 'help'! One can search the PDL documentation, and look up documentation
on any function.

=item Escape

Any line starting with the C<#> character is treated as a shell
escape. This character is configurable by setting the Perl variable
C<$PERLDL_ESCAPE>. This could, for example, be set in C<~/.perldlrc>.

=back

=head2 Overload operators

The following builtin Perl operators and functions have been overloaded
to work on PDL variables:

 + - * / > < >= <= << >> & | ^ == != <=> ** % ! ~
 sin log abs atan2 sqrt cos exp

[All the unary functions (sin etc.) may be used with inplace() - see
"Memory" below.]

=head2 Object-Orientation and perlDL

PDL operations are available as functions and methods.
Thus one can derive new types of object, to represent
custom data classes.

By using overloading one can make mathematical operators
do whatever you please, and PDL has some built-in tricks
which allow existing PDL functions to work unchanged, even
if the underlying data representation is vastly changed!
See L<PDL::Objects>

=head2 Memory usage and references

Messing around with really huge data arrays may require some care.
perlDL provides many facilities to let you perform operations on big
arrays without generating extra copies though this does require a bit
more thought and care from the programmer.

NOTE: On some most systems it is better to configure Perl (during the
build options) to use the system C<malloc()> function rather than Perl's
built-in one. This is because Perl's one is optimised for speed rather
than consumption of virtual memory - this can result in a factor of
two improvement in the amount of memory storage you can use.
The Perl malloc in 5.004 and later does have a number of compile-time
options you can use to tune the behaviour.

=over

=item Simple arithmetic

If $a is a big image (e.g. occupying 10MB) then the command

 $a = $a + 1;

eats up another 10MB of memory. This is because
the expression C<$a+1> creates a temporary copy of C<$a> to hold the
result, then C<$a> is assigned a reference to that.
After this, the original C<$a> is destroyed so there is no I<permanent>
memory waste. But on a small machine, the growth in the memory footprint
can be considerable.
It is obviously done
this way so C<$c=$a+1> works as expected.

Also if one says:

 $b = $a;     # $b and $a now point to same data
 $a = $a + 1;

Then C<$b> and C<$a> end up being different, as one naively expects,
because a new reference is created and C<$a> is assigned to it.

However if C<$a> was a huge memory hog (e.g. a 3D volume) creating a copy
of it may not be a good thing. One can avoid this memory overhead in
the above example by saying:

 $a++;

The operations C<++,+=,--,-=>, etc. all call a special "in-place"
version of the arithmetic subroutine. This means no more memory is
needed - the downside of this is that if C<$b=$a> then C<$b> is also
incremented. To force a copy explicitly:

 $b = pdl $a; # Real copy

or, alternatively, perhaps better style:

 $b = $a->copy;

=item Functions

Most functions, e.g. C<log()>, return a result which is a transformation
of their argument. This makes for good programming practice. However many
operations can be done "in-place" and this may be required when large
arrays are in use and memory is at a premium. For these circumstances
the operator L<inplace()|PDL::Core/inplace> 
is provided which prevents the extra copy and
allows the argument to be modified. e.g.:

 $x = log($array);          # $array unaffected
 log( inplace($bigarray) ); # $bigarray changed in situ

WARNINGS:

=over

=item 1

The usual caveats about duplicate references apply.

=item 2

Obviously when used with some functions which can not be applied
in situ (e.g. C<convolve()>) unexpected effects may occur! We try to
indicate C<inplace()>-safe functions in the documentation.

=item 3

Type conversions, such asC<float()>, may cause hidden copying.

=back

=back

=head2 Ensuring piddleness

If you have written a simple function and
you don't want it to blow up in your face if you pass it a simple
number rather than a PDL variable. Simply call the function
L<topdl()|PDL::Core/topdl> first to make it safe. e.g.:

 sub myfiddle { my $pdl = topdl(shift); $pdl->fiddle_foo(...); ... }

C<topdl()> does NOT perform a copy if a pdl variable is passed - it
just falls through - which is obviously the desired behaviour. The
routine is not of course necessary in normal user defined functions
which do not care about internals.

=head1 AUTHOR

Copyright (C) Karl Glazebrook (kgb@aaoepp.aao.gov.au), Tuomas J. Lukka,
(lukka@husc.harvard.edu) and Christian Soeller (c.soeller@auckland.ac.nz) 1997.
Commercial reproduction of this documentation in a different format is forbidden.

=cut