The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
#!/usr/bin/perl
# -*-cperl-*-

=head1 VENUE

Data::Rlist - A lightweight data language for Perl and C++

=cut

# $Writestamp: 2008-07-27 21:19:43 andreas$
# $Compile: perl -c Rlist.pm; pod2html --title="Random-Lists" Rlist.pm >../../Rlist.pm.html$
# $Comp1le: podchecker Rlist.pm$

=head1 SYNOPSIS

    use Data::Rlist;

File and string I/O for any Perl data F<$thing>:

    ### Compile data as text.

                  WriteData $thing, $filename;  # compile data into file
                  WriteData $thing, \$string;   # compile data into buffer
    $string_ref = WriteData $thing;             # dto.

    $string     = OutlineData $thing;           # compile printable text
    $string     = StringizeData $thing;         # compile text in a compact form (no newlines)
    $string     = SqueezeData $thing;           # compile text in a super-compact form (no whitespace)

    ### Parse data from text.

    $thing      = ReadData $filename;           # parse data from file
    $thing      = ReadData \$string;            # parse data from string buffer

F<L</ReadData>>,  F<L</WriteData>> etc.  are  L<auto-exported functions|/Exports>.   Alternately we
use:

    ### Qualified functions to parse text.

    $thing      = Data::Rlist::read($filename);
    $thing      = Data::Rlist::read($string_ref);
    $thing      = Data::Rlist::read_string($string_or_string_ref);

    ### Qualified functions to compile data into text.

                  Data::Rlist::write($thing, $filename);
    $string_ref = Data::Rlist::write_string($thing);
    $string     = Data::Rlist::write_string_value($thing);

    ### Print data to STDOUT.

    PrintData $thing;

The object-oriented interface:

    ### For objects the '-output' attribute refers to a string buffer or is a filename.
    ### The '-data' attribute defines the value or reference to be compiled into text.

    $object     = new Data::Rlist(-data => $thing, -output => \$target)

    $string_ref = $object->write;           # compile into $target, return \$target
    $string_ref = $object->write_string;    # compile into new string ($target not touched)
    $string     = $object->write_string_value; # dto. but return string value

    ### Print data to STDOUT.

    print $object->write_string_value;
    print ${$object->write};                # returns \$target

    ### Set output file and write $thing to disk.

    $object->set(-output => ".foorc");

    $object->write;                         # write "./.foorc", return 1
    $object->write(".barrc");               # write "./.barrc" (the filename overrides -output)

    ### The '-input' attribute defines the text to be compiled, either as
    ### string reference or filename.

    $object->set(-input => \$input_string); # assign some text

    $thing      = $object->read;            # parse $input_string into Perl data
    $thing      = $object->read($other_string); # parse $other_string (the argument overrides -input)

    $object->set(-input => ".foorc");       # assign some input file

    $foorc      = $object->read;            # parse ".foorc"
    $barrc      = $object->read(".barrc");  # parse some other file
    $thing      = $object->read(\$string);  # parse some string buffer
    $thing      = $object->read_string($string_or_ref); # dto.

Create deep-copies  of any Perl data.  The  metaphor "keelhaul" vividly connotes  that F<$thing> is
stringified, then compiled back:

    ### Compile a value or ref $thing into text, then parse back into data.

    $reloaded   = KeelhaulData $thing;
    $reloaded   = Data::Rlist::keelhaul($thing);

    $object     = new Data::Rlist(-data => $thing);
    $reloaded   = $object->keelhaul;

Do deep-comparisons of any Perl data:

    ### Deep-compare $a and $b and get a description of all type/value differences.

    @diffs      = CompareData($a, $b);

For more information see F<L</compile>>, F<L</keelhaul>>, and F<L</deep_compare>>.

=head1 DESCRIPTION

=head2 Venue

F<Random-Lists> (Rlist)  is a tag/value  text format, which  can "stringify" any data  structure in
7-bit ASCII text.  The basic types are lists  and scalars.  The syntax is similar, but not equal to
Perl's.  For example,

    ( "hello", "world" )
    { "hello" = "world"; }

designates two lists, the first of which is sequential, the second associative.  The format...

- allows the definition of hierachical and constant data,

- has no user-defined types, no keywords, no variables,

- has no arithmetic expressions,

- uses 7-bit-ASCII character encoding and escape sequences,

- uses C-style numbers and strings,

- has an extremely minimal syntax implementable in any programming language and system.

You can  write any  Perl data  structure into  files as legible  text.  Like  with CSV  the lexical
overhead of Rlist is minimal: files are merely data.

You can  read compiled texts back  in Perl and C++  programs.  No information will  be lost between
different program languages, and floating-point numbers keep their precision.

You can also compile structured CSV text  from Perl data, using special functions from this package
that will keep numbers precise and properly quote strings.

Since Rlist has no  user-defined types the data is structured out of  simple scalars and lists.  It
is conceivable, however, to develop a simple  type system and store type information along with the
actual data.  Otherwise the data structures are  tacit consents between the users of the data.  See
also the implemenation notes for L</Perl> and L</C++>.

=head2 Character Encoding

Rlist text uses the  7-bit-ASCII character set.  The 95 printable character  codes 32 to 126 occupy
one  character.  Codes  0 to  31 and  127  to 255  require four  characters each:  the F<\>  escape
character followed by  the octal code number.  For example, the  German Umlaut character F<E<uuml>>
(252) is translated into F<\374>.  An exception are the following codes:

    ASCII               ESCAPED AS
    -----               ----------
      9 tab               \t
     10 linefeed          \n
     13 return            \r
     34 quote     "       \"
     39 quote     '       \'
     92 backslash \       \\

=head2 Values and Default Values

F<Values> are either scalars,  array elements or the value of a pair.   Each value is constant.

The default scalar value is the empty string C<"">.  So in Perl F<undef> is compiled into C<"">.

=head2 Numbers, Strings and Here-Documents

Numbers constants adhere to the IEEE 754  syntax for integer- and floating-point numbers (i.e., the
same lexical conventions as in C and C++ apply).

Strings constants consisting only of  C<[a-zA-Z_0-9-/~:.@]> characters "look like identifiers" (aka
symbols) need  not to be  quoted.  Otherwise string  constants follow the C  language lexicography.
They strings must  be placed in double-quotes (single-quotes are not  allowed).  Quoted strings are
also escaped (i.e., characters are converted to the input character set of 7-bit ASCII).

You  can  define  a  string  using  a  line-oriented  form of  quoting  based  on  the  UNIX  shell
F<here-document> syntax and RFC 111.  Multiline quoted strings can be expressed with

    <<DELIMITER

Following the sigil F< << > an identifier  specifies how to terminate the string scalar.  The value
of the  scalar will be  all lines  following the current  line down to  the line starting  with the
delimiter (i.e., the delimiter must be at column  1).  There must be no space between the sigil and
the identifier.

B<EXAMPLES>

Quoted strings:

    "Hello, World!"

Unquoted strings (symbols, identifiers):

    foobar   cogito.ergo.sum   Memento::mori

Here-document strings:

    <<hamlet
    "This above all: to thine own self be true". - (Act I, Scene III).
    hamlet

Integegers and floats:

    38   10e-6   -.7   3.141592653589793

For more information see F<L</is_symbol>>, F<L</is_number>> and F<L</escape7>>.

=head2 List Values

We have two types of lists: sequential (aka array) and associative (aka map, hash, dictionary).

B<EXAMPLES>

Arrays:

    ( 1, 2, ( 3, "Audiatur et altera pars!" ) )

Maps:

    {
        key = value;
        standalone-key;
        Pi = 3.14159;

        "meta-syntactic names" = (foo, bar, "lorem ipsum", Acme, ___);

        var = {
            log = {
                messages = <<LOG;
    Nov 27 21:55:04 localhost kernel: TSC appears to be running slowly. Marking it as unstable
    Nov 27 22:34:27 localhost kernel: Uniform CD-ROM driver Revision: 3.20
    Nov 27 22:34:27 localhost kernel: Loading iSCSI transport class v2.0-724.<6>PNP: No PS/2 controller found. Probing ports directly.
    Nov 27 22:34:27 localhost kernel: wifi0: Atheros 5212: mem=0x26000000, irq=11
    LOG
            };
        };
    }

=head2 Binary Data

Binary data can  be represented as base64-encoded string,  or L<here-document|/Numbers, Strings and
Here-Documents> string.  For example,

    use MIME::Base64;

    $str = encode_base64($binary_buf);

The result F<$str> will be a string broken into  lines of no more than 76 characters each; the 76th
character  will be  a  newline C<"\n">.   Here  is a  complete  Perl program  that  creates a  file
F<random.rls>:

    use MIME::Base64;
    use Data::Rlist;

    our $binary_data = join('', map { chr(int rand 256) } 1..300);
    our $sample = { random_string => encode_base64($binary_data) };

    WriteData $sample, 'random.rls';

These few lines create a file F<random.rls> containing text like the following:

    {
        random_string = <<___
    w5BFJIB3UxX/NVQkpKkCxEulDJ0ZR3ku1dBw9iPu2UVNIr71Y0qsL4WxvR/rN8VgswNDygI0xelb
    aK3FytOrFg6c1EgaOtEudmUdCfGamjsRNHE2s5RiY0ZiaC5E5XCm9H087dAjUHPtOiZEpZVt3wAc
    KfoV97kETH3BU8/bFGOqscCIVLUwD9NIIBWtAw6m4evm42kNhDdQKA3dNXvhbI260pUzwXiLYg8q
    MDO8rSdcpL4Lm+tYikKrgCih9UxpWbfus+yHWIoKo/6tW4KFoufGFf3zcgnurYSSG2KRLKkmyEa+
    s19vvUNmjOH0j1Ph0ZTi2pFucIhok4krJi0B5yNbQStQaq23v7sTqNom/xdRgAITROUIoel5sQIn
    CqxenNM/M4uiUBV9OhyP
    ___
    ;
    }

Note that F<L</WriteData>>  uses the predefined C<"default"> configuration,  which enables here-doc
strings.  See also L<MIME::Base64>.

=head2 Embedded Perl Code (Nanoscripts)

Rlist text  can define embedded Perl  programs, called F<nanonscripts>.  The  embedded program text
has the form of a L<here-document|/Numbers,  Strings and Here-Documents> with the special delimiter
C<"perl">.  After  the Rlist text has  been parsed you call  F<L</evaluate_nanoscripts>> to F<eval>
all embedded Perl in the order of definiton.  The function arranges it that within the F<eval>...

=over

=item *

the F<$root> variable refers to the root of the input, as unblessed array- or hash-reference;

=item *

the F<$this> variable refers to the array or hash that stores the currently F<eval>'d nanoscript;

=item *

the F<$where> variable stores the name of the key, or the index, within F<$this>.

=back

The nanoscript  can use  this information to  oriented itself  within the parsed  data, or  even to
modify the  data in-place.  The result  of F<eval>'ing will  replace the nanoscript text.   You can
also  F<eval>  the   embedded  Perl  codes  programmatically,  using   the  F<L</nanoscripts>>  and
F<L</result>> functions.

B<EXAMPLES>

Simple example of an Rlist text that hosts Perl code:

    (<<perl)
    print "Hello, World!";
    perl

Here is a more complex example that defines a list of nanoscripts, and evaluates them:

    use Data::Rlist;

    $data = join('', <DATA>);
    $data = EvaluateData \$data;

    __END__
    ( <<perl, <<perl, <<perl, <<perl )
    print "Hello World!\n"          # english
    perl
    print "Hallo Welt!\n"           # german
    perl
    print "Bonjour le monde!\n"     # french
    perl
    print "Olá mundo!\n"            # spanish
    perl

When we execute the above script the following output is printed before the script exits:

    Hello World!
    Hallo Welt!
    Bonjour le monde!
    Olá mundo!

Note  that  when  the  Rlist  text  after  F<__END__>  is  placed  in  F<some_file>,  we  can  call
F<L</EvaluateData(C<"some_file">)>> for the same effect.  The next example modifies the parsed data
in place.  Imagine a file F<this_file_modifies_itself> with the following content:

    ( <<perl )
    ReadData(\\'{ foo = bar; }');
    perl

When we parse this file using

    $data = ReadData("this_file_modifies_itself");

to F<$data> will be assigned the following Perl value

    [ "ReadData(\\'{ foo = bar; }');\n" ]

Next we call F<Data::Rlist::L</evaluate_nanoscripts>()> to "morph" this value into

    [ { 'foo' => 'bar' } ]

The same effect can be achieved in just one call

    $data = EvaluateData("this_file_modifies_itself");

=head2 Comments

Rlist  supports multiple  forms  of comments:  F<//>  or F<#>  single-line-comments,  and F</*  */>
multi-line-comments. You may use all three forms at will.

=cut

package Data::Rlist;

use strict;
use warnings;
use Exporter;
use Carp;
use Scalar::Util qw/reftype/;
use integer;

use vars qw/$VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS

            $DEBUG
            %PredefinedOptions
            $RoundScientific $SafeCppMode $EchoStderr
            $R $Fh $Locked $DefaultMaxDepth $MaxDepth $Depth
            $Errors $Warnings $Broken $MissingInput @Messages
            $DefaultCsvDelimiter $DefaultConfDelimiter $DefaultConfSeparator
            $DefaultNanoscriptToken

            $REPunctuationCharacter $REIntegerHere $REFloatHere
            $RESymbolCharacter $RESymbolHere $REStringHere
            $REInteger $REFloat
            $RESymbol $REString $REValue
            @REIsPunct @REIsDigit
           /;

# Parser/lexer variables.  Used by open_input, parse and lex. Declaring them as lexicals is
# slightly faster than to 'use vars'.

my($Readstruct, $ReadFh, $Ln, $LnArray);
my(%Rules, @VStk, @NStk);

use constant DEFAULT_VALUE => qq'""'; # default Rlist, the empty string

BEGIN {
    $VERSION = '1.44';
    $DEBUG = 0;
    @ISA = qw/Exporter/;

    # Always exported (:DEFAULT) when the package is fetched with "use", not "required".

    @EXPORT = qw/ReadCSV WriteCSV
                 ReadConf WriteConf
                 ReadData EvaluateData WriteData
                 PrintData OutlineData StringizeData SqueezeData
                 KeelhaulData CompareData/;

    # Symbols exported on request.

    @EXPORT_OK = qw/:DEFAULT

                    predefined_options complete_options

                    maybe_quote7 quote7 escape7 unquote7 unescape7 unhere
                    is_value is_random_text is_symbol is_integer is_number
                    split_quoted parse_quoted

                    equal round

                    keelhaul deep_compare fork_and_wait synthesize_pathname

                    $REInteger $REFloat $RESymbol/;

    %EXPORT_TAGS = (# Handle IEEE numbers
                    floats => [@EXPORT, qw/equal round is_number is_integer
                                          /],
                    # Handle (quoted) strings
                    strings => [@EXPORT, qw/maybe_quote7 quote7 escape7
                                            unquote7 unescape7 
                                            unhere split_quoted parse_quoted
                                            is_value is_random_text is_number is_integer is_symbol
                                           /],
                    # Compile options
                    options => [@EXPORT, qw/predefined_options complete_options
                                           /],
                    # Auxiliary functions
                    aux => [@EXPORT, qw/keelhaul deep_compare fork_and_wait synthesize_pathname
                                       /]);

    $MaxDepth = 0; $DefaultMaxDepth = 100; $Broken = 0;
    $SafeCppMode = 0;
    $EchoStderr = 0;
    $RoundScientific = 0;
    $DefaultConfSeparator = ' = ';
    $DefaultConfDelimiter = '\s*=\s*';
    $DefaultCsvDelimiter = '\s*,\s*';
    $DefaultNanoscriptToken = 'perl';

    %PredefinedOptions =
    (
     default =>
     {# Warning: "code_refs" are disabled by default because compile_fast() (the default compile
      # function) never calls subs.  Likewise the default "precision" must be undef!
      eol_space => "\n",
      bol_tabs => 1,
      outline_hashes => 0,
      outline_data => 6,
      paren_space => '',
      comma_punct => ', ',
      semicolon_punct => ';',
      assign_punct => ' = ',
      here_docs => 1,
      auto_quote => undef,      # let write() and write_csv() choose their defaults
      code_refs => 0,
      scientific => 0,
      separator => ',',
      delimiter => undef,
      precision => undef
     },

     string =>
     {
      eol_space => '',
      bol_tabs => 0,
      outline_data => 0,
      here_docs => 0
     },

     outlined =>
     {
      eol_space => "\n",
      bol_tabs => 1,
      outline_hashes => 1,
      outline_data => 1,
      paren_space => ' ',
      comma_punct => ', ',
     },

     squeezed =>
     {
      bol_tabs => 0,
      eol_space => '',
      outline_hashes => 0,
      outline_data => 0,
      here_docs => 0,
      code_refs => 0,
      paren_space => '',
      comma_punct => ',',
      assign_punct => '=',
      precision => 6,
     }
    );

    ########
    # Regular expressions for scalars
    #
    # $RESymbolHere shall be defined equal to the 'identifier' regex in 'rlist.l', to keep the
    # C/C++ and Perl implementations compatible.  See also the C++ function quote() and the
    # {identifier} rule in <rlist.l>
    #
    # In Perl regexes, by default the "^" character matches only the beginning of the string, the
    # "$" character only the end (or before the newline at the end). The "/s" modifier will force
    # "^" to match only at the beginning of the string and "$" to match only at the end (or just
    # before a newline at the end) of the string.  "$" hence ignores an optional trailing newline.
    #
    # When "/m" is used this means for "foo\nbar" the "$" matches the end of the string (after "r")
    # and also before every line break (between "o" and "\n").  Therefore we've to use "\z" which
    # matches only at the end of the string.

    $REIntegerHere = '[+-]?\d+';
    $REFloatHere = '(?:[+-]?)(?=\d|\.\d)\d*(?:\.\d*)?(?:[Ee](?:[+-]?\d+))?';
    $REPunctuationCharacter = '\=\,;\{\}\(\)';
    $RESymbolCharacter = 'a-zA-Z_0-9\-/\~:\.@';
    $RESymbolHere = '[a-zA-Z_\-/\~:@]'.qq'[$RESymbolCharacter]*';
    $REStringHere = '"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"'; # " allowed inside the quotes, but only as \"

    $REInteger = qr/^$REIntegerHere\z/;
    $REFloat = qr/^$REFloatHere\z/;
    $RESymbol = qr/^$RESymbolHere\z/;
    $REString = qr/^$REStringHere\z/;

    $REValue = qr/$REString|
                  $REInteger|
                  $REFloat|
                  $RESymbol/x;

    $REValue = qr/^$REStringHere\z|
                  ^$REIntegerHere\z|
                  ^$REFloatHere\z|
                  ^$RESymbolHere\z/x if 0; # disabled because it is slightly slower

    ########
    # Rlist parser map:
    #
    #   token => [ rule, deduce-function ]
    #   rule  => [ rule, deduce-function ]
    #
    # See `lex()' for token meanings.

    sub syntax_error($;$) {
        my($msg, $tr) = (shift, shift||'??');
        $msg =~ s/\s/ /go; pr1nt('ERROR', $msg);
        $Errors++; $tr
    }
    sub warning($;$) {
        my($msg, $tr) = (shift, shift||'');
        $msg =~ s/\s/ /go; pr1nt('WARNING', $msg);
        $Warnings++; $tr
    }

    %Rules =
    (#
     # Key/value pairs.
     #
     # For nanoscripts (n) push hash-ref, key and the script to @NStk.
     #

     '{}'   => sub { push @VStk, { }; 'v' },
     '{h}'  => sub { 'v' },
     # first pairs (open the hash)
     'v;'   => sub { push @VStk, { pop(@VStk) => '' }; 'h' },
     'v=v;' => sub { push @VStk, { splice @VStk, -2 }; 'h' },
     'v=n;' => sub { my($k, $v) = splice @VStk, -2;
                     my $h = { $k => $v };
                     push @VStk, $h; push @NStk, [ $h, $k ]; 'h' },
     # subsequent pairs (complete the hash)
     'hv;'  => sub { my $k      = pop @VStk;        $VStk[$#VStk]->{$k} = ''; 'h' },
     'hv=v' => sub { my($k, $v) = splice @VStk, -2; $VStk[$#VStk]->{$k} = $v; 'h' },
     'hv=n' => sub { my($k, $v) = splice @VStk, -2; $VStk[$#VStk]->{$k} = $v; push @NStk, [ $VStk[$#VStk], $k ]; 'h' },
     'h;'   => sub { 'h' },

     #
     # Single values/scripts.
     #

     '()'   => sub { push @VStk, [ ]; 'v' },
     '(l)'  => sub { 'v' },
     '(v)'  => sub {                    push @VStk, [pop(@VStk)]; 'v' },
     '(n)'  => sub { my $v = pop @VStk; push @VStk, [ $v ]; push @NStk, [ $VStk[$#VStk], 0 ]; 'v' },
     'v,'   => sub {                    push @VStk, [pop(@VStk)]; 'l,' },
     'n,'   => sub { my $v = pop @VStk; push @VStk, [ $v ]; push @NStk, [ $VStk[$#VStk], 0 ]; 'l,' },
     'l,v'  => sub { my $v = pop @VStk; push @{$VStk[$#VStk]}, $v; 'l' }, # push to existing list
     'l,n'  => sub { my $v = pop @VStk; push @{$VStk[$#VStk]}, $v; push @NStk, [ $VStk[$#VStk], $#{$VStk[$#VStk]} ]; 'l' },

     #
     # Rules for syntax errors.  All rules containing '??' are error-recovery-rules.
     #

     '=??'  => sub { syntax_error("invalid value after '='", ';') },
     '??;'  => sub { syntax_error("invalid key/value before ';'", ';') },
     ',??'  => sub { push @VStk, ''; syntax_error("invalid value after ','", ',v') },
     '??'   => sub { '' },

     'vv'   => sub { my($k, $v) = splice @VStk, -2; syntax_error("missing ',' or ';'") },
     'v=v}' => sub { my($k, $v) = splice @VStk, -2; push @VStk, { $k => $v }; warning("unterminated pair: expected ';'", 'h}') },
     'v=v,' => sub { my($k, $v) = splice @VStk, -2; warning("pair terminated with ',': expected ';'", '??') },
     'v=;'  => sub { warning("missing value, or superfluous '='", 'v;') },
     'v=}'  => sub { warning("missing value: expected ';', not '}'", 'v;') },
     '(v}'  => sub { my $v = pop @VStk; syntax_error("expected ')' after value, not '}'") },
     '{v)'  => sub { my $v = pop @VStk; syntax_error("expected '(' before value, not '{'") },
     '{v}'  => sub { my $k = pop @VStk; push @VStk, { $k => '' }; warning("unterminated pair: expected ';'", 'h') },

     '(v,)' => sub { warning("superfluous ',' at end of list", '(v)') },
     '(l,)' => sub { warning("superfluous ',' at end of list",  'v') },

     '{{'   => sub { warning("non-scalar hash-key", '??') },
     '{('   => sub { warning("non-scalar hash-key", '??') },

     'n;'   => sub { warning("nanoscript ignored: shall be def'd as value, not key", 'v;') },
     'n=v;' => sub { warning("nanoscript ignored: shall be def'd as value, not key", 'v=v;') },
    );

    # True syntax errors, which cannot be converted into valid rules.  The error will be printed
    # and recorded in @Messages when '??' is actually reduced.

    foreach my $errrule ((',,', ',;', ';,', ';;',
                          '{=', '{,', '{;',
                          '(=', '(,', '(;',
                          '==',
                          '(v;', '(n;',
                          'v=,', 'v=)')) {
        die if exists $Rules{$errrule};
        $Rules{$errrule} = eval(<<___);
    sub { my \@r = map { s/\\s+/ /g; \$_ } map { if (/[vnhl]/) { pop(\@VStk) }; s/v/value/; s/n/nanoscript/; s/h/hash/; s/l/list/; \$_ }
                   split / */, '$errrule';
          return syntax_error("'".join(' ', \@r)."'"); }
___
    }

    my($rule_max, $rule_min) = (0, 9);
    foreach (keys %Rules) {
        $rule_min = length($_) if length($_) < $rule_min;
        $rule_max = length($_) if length($_) > $rule_max;
    }
    die $rule_min if $rule_min != 2;
    die $rule_max if $rule_max != 4;
}

sub pr1nt(@)
{
    # This function is used to write a new comment line (usually some sort of error message) into
    # the currently compiled file, and to STDERR (if $Data::Rlist::DEBUG).

    my $label = shift;
    my $msg = join(': ', grep { length }
                   ($label,
                    ((defined($Readstruct) &&
                      exists $Readstruct->{filename}) ? $Readstruct->{filename}."($.)" : ""),
                    grep { defined } @_))."\n";
    foreach my $fh (grep { defined } ($Fh, $EchoStderr ? *STDERR{IO} : undef)) {
        next unless defined $fh;
        print $fh map { $fh == defined($Fh) ? "# $_" : $_ } $msg;
    }
    push @Messages, $msg;
}

=head1 PACKAGE INTERFACE

The  core  functions to  cultivate  package objects  are  F<L</new>>,  F<L</dock>>, F<L</set>>  and
F<L</get>>.  When a regular package function is called in object context some omitted arguments are
read from object attributes.  This is  true for the following functions: F<L</read>>, F<L</write>>,
F<L</read_string>>,   F<L</write_string>>,  F<L</read_csv>>,   F<L</write_csv>>,  F<L</read_conf>>,
F<L</write_conf>> and F<L</keelhaul>>.

Unless called  in object  context the first  argument has  an indifferent meaning  (i.e., it  is no
F<Data::Rlist> reference).  Then F<L</read>> expects an  input file or string, F<L</write>> the data
to compile etc.

=head2 Construction

=over

=item F<new([ATTRIBUTES])>

Create a F<Data::Rlist> object from the hash ATTRIBUTES. For example,

    $self = Data::Rlist->new(-input => 'this.dat',
                             -data => $thing,
                             -output => 'that.dat');

For   this   object   the   call   F<L<$self-E<gt>read()|/read>>  reads   from   F<this.dat>,   and
F<L<$self-E<gt>write()|/write>> writes any Perl data F<$thing> to F<that.dat>.

B<REGULAR OBJECT ATTRIBUTES>

=over 8

=item C<-input =E<gt> INPUT>

=item C<-filter =E<gt> FILTER>

=item C<-filter_args =E<gt> FILTER-ARGS>

Defines what  Rlist text to  parse and  how to preprocess  an input file.   INPUT is a  filename or
string reference.  FILTER can be 1 to  select the standard C preprocessor F<cpp>.  These attributes
are applied by F<L</read>>, F<L</read_string>>, F<L</read_conf>> and F<L</read_csv>>.

=item C<-data =E<gt> DATA>

=item C<-options =E<gt> OPTIONS>

=item C<-output =E<gt> OUTPUT>

Defines  the Perl  data to  be L<compiled|/compile>  into  text (DATA),  how it  shall be  compiled
(OPTIONS) and  where to  store the  compiled text (OUTPUT).   When OUTPUT  is string  reference the
compiled text  will be stored  in that string.   When OUTPUT is F<undef>  a new string  is created.
When OUTPUT  is a string  value it is  a filename.  These  attributes are applied  by F<L</write>>,
F<L</write_string>>, F<L</write_conf>>, F<L</write_csv>> and F<L</keelhaul>>.

=item C<-header =E<gt> HEADER>

Defines an array  of text lines, each of which will  by prefixed by a F<#> and  then written at the
top of the output file.

=item C<-delimiter =E<gt> DELIMITER>

Defines the field delimiter for F<.csv>-files. Applied by F<L</read_csv>> and F<L</read_conf>>.

=item C<-columns =E<gt> STRINGS>

Defines the column names for F<.csv>-files to be written into the first line.

=back

B<ATTRIBUTES THAT MASQUERADE PACKAGE GLOBALS>

The attributes  listed below raise  new values for  package globals for  the time an  object method
runs.

=over

=item C<-InputRecordSeparator =E<gt> FLAG>

Masquerades F<$/>, which affects  how lines are read and written to  and from Rlist- and CSV-files.
You may also set F<$/> by yourself.  See L<perlport> and L<perlvar>.

=item C<-MaxDepth =E<gt> INTEGER>

=item C<-SafeCppMode =E<gt> FLAG>

=item C<-RoundScientific =E<gt> FLAG>

Masquerade  F<L<$Data::Rlist::MaxDepth|Debugging Data>>, F<L<$Data::Rlist::SafeCppMode|open_input>>
and F<L<$Data::Rlist::RoundScientific|round>>.

=item C<-EchoStderr =E<gt> FLAG>

Print read errors and warnings message on STDERR (default: off).

=item C<-DefaultCsvDelimiter =E<gt> REGEX>

=item C<-DefaultConfDelimiter =E<gt> REGEX>

Masquerades F<$Data::Rlist::DefaultCsvDelimiter>  and F<$Data::Rlist::DefaultConfDelimiter>.  These
globals define  the default regexes  to use  when the F<-options>  attribute does not  specifiy the
L<C<"delimiter">|/Compile Options> regex.  Applied by F<L</read_csv>> and F<L</read_conf>>.

=item C<-DefaultConfSeparator =E<gt> STRING>

Masquerades F<$Data::Rlist::DefaultConfSeparator>,  the default string to use  when the F<-options>
attribute   does  not  specifiy   the  L<C<"separator">|/Compile   Options>  string.    Applied  by
F<L</write_conf>>.

=back

=item F<dock(SELF, SUB)>

Localize object  SELF within the  package and run  SUB.  This means  that some of  SELF's attribute
masqquerade  few  package globals  for  the  time  SUB runs.   SELF  then  locks the  package,  and
F<$Data::Rlist::Locked> is greater than 0.

=back

=head2 Attribute Access

=over

=item F<set(SELF[, ATTRIBUTE]...)>

Reset or initialize object attributes, then return SELF.  Each ATTRIBUTE is a name/value-pair.  See
F<L</new>> for a list of valid names.  For example,

    $obj->set(-input => \$str, -output => 'temp.rls', -options => 'squeezed');

=item F<get(SELF, NAME[, DEFAULT])>

=item F<require(SELF[, NAME])>

=item F<has(SELF[, NAME])>

Get some  attribute NAME from object SELF.   Unless NAME exists returns  DEFAULT.  The F<require>
method has  no default value,  hence it dies  unless NAME exists.  F<has> returns true  when NAME
exists, false otherwise.  For NAME the leading hyphen is optional.  For example,

    $self->get('foo');          # returns $self->{-foo} or undef
    $self->get(-foo=>);         # dto.
    $self->get('foo', 42);      # returns $self->{-foo} or 42

=back

=cut

sub new {
    my($prototype, $k) = shift;
    carp <<___ if @_ & 1;
$prototype->Data::Rlist::new(${\(join(', ', @_))})
    odd number of arguments supplied, expecting key/value pairs
___
    my %args = @_;
    bless { map { $k = $_;
                  s/^_+//o;         # remove leading underscores
                  s/^([^\-])/-$1/o; # prepend missing '-'
                  $_ => $args{$k}
              } keys %args }, ref($prototype) || $prototype;
}

sub set {
    my($self) = shift;
    my %attr = @_;
    while(my($k, $v) = each %attr) {
        $self->{$k} = $v
    } $self
}

sub require($$) {               # get attribute or confess
    my($self, $attr) = @_;
    my $v = $self->get($attr);
    confess "$self->require(): missing '$attr' attribute:\n\t\t".join("\n\t\t", map { "$_ = $self->{$_}" } keys %$self) unless defined $v;
    return $v;
}

sub get($$;$) {                 # get attribute or return default value/undef
    my($self, $attr, $default) = @_;
    $attr = '-'.$attr unless $attr =~ /^-/;
    return $self->{$attr} if exists $self->{$attr};
    return $default;
}

sub has($$) {
    my($self, $attr) = @_;
    $attr = '-'.$attr unless $attr =~ /^-/;
    exists $self->{$attr};
}

sub dock($\&) {
    carp "package Data::Rlist locked" if $Locked++; # TODO: use critical sections and atomic increment
    my ($self, $block) = @_;
    local $MaxDepth = $self->get(-MaxDepth=>) if $self->has(-MaxDepth=>);
    local $SafeCppMode = $self->get(-SafeCppMode=>) if $self->has(-SafeCppMode=>);
    local $EchoStderr = $self->get(-EchoStderr=>) if $self->has(-EchoStderr=>);
    local $RoundScientific = $self->get(-RoundScientific=>) if $self->has(-RoundScientific=>);
    local $DefaultCsvDelimiter = $self->get(-DefaultCsvDelimiter=>) if $self->has(-DefaultCsvDelimiter=>);
    local $DefaultConfDelimiter = $self->get(-DefaultConfDelimiter=>) if $self->has(-DefaultConfDelimiter=>);
    local $DefaultConfSeparator = $self->get(-DefaultConfSeparator=>) if $self->has(-DefaultConfSeparator=>);
    local $DefaultNanoscriptToken = $self->get(-DefaultNanoscriptToken=>) if $self->has(-DefaultNanoscriptToken=>);
    local $DEBUG = $self->get(-DEBUG=>) if $self->has(-DEBUG=>);
    local $/ = $self->get(-InputRecordSeparator=>) if $self->has(-InputRecordSeparator=>);
    local $R;
    unless (defined wantarray) { # void context
        $block->(); --$Locked;
    } elsif (wantarray) {
        my @r = $block->(); --$Locked; return @r;
    } else {
        my $r = $block->(); --$Locked; return $r;
    }
}

=head2 Public Functions

=over

=item F<read(INPUT[, FILTER, FILTER-ARGS])>

Parse data from INPUT, which specifies some Rlist-text.  See also F<L</errors>>, F<L</write>>.

B<PARAMETERS>

INPUT shall be either

- some Rlist object created by F<L</new>>,

- a string reference, in which case F<read> and F<L</read_string>> parse Rlist text from it,

- a string scalar, in which case F<read> assumes a file to parse.

See F<L</open_input>>  for the FILTER and FILTER-ARGS  parameters, which are used  to preprocess an
input file.  When an input file cannot  be F<open>'d and F<flock>'d this function dies.  When INPUT
is  an  object,  arguments for  FILTER  and  FILTER-ARGS  eventually  override the  F<-filter>  and
F<-filter_args> attributes.

B<RESULT>

The parsed data as array- or hash-reference, or  F<undef> if there was no data. The latter may also
be the case when file consist only of comments/whitespace.

B<NOTES>

This function  may die.  Dying  is Perl's  mechanism to raise  exceptions, which eventually  can be
catched with F<eval>.  For example,

    my $host = eval { use Sys::Hostname; hostname; } || 'some unknown machine';

This code fragment  traps the F<die> exception, so  that F<eval> returns F<undef> or  the result of
calling F<hostname>. The following example uses F<eval> to trap exceptions thrown by F<read>:

    $object = new Data::Rlist(-input => $thingfile);
    $thing = eval { $object->read };

    unless (defined $thing) {
        if ($object->errors) {
            print STDERR "$thingfile has syntax errors"
        } else {
            print STDERR "$thingfile not found, is locked or empty"
        }
    } else {
        # Can use $thing
            .
            .
    }

=item F<read_csv(INPUT[, OPTIONS, FILTER, FILTER-ARGS])>

=item F<read_conf(INPUT[, OPTIONS, FILTER, FILTER-ARGS])>

Parse data from INPUT, which specifies some comma-separated-values (CSV) text.  Both functions

- read data from strings or files,

- use an optional delimiter,

- ignore delimiters in quoted strings,

- ignore empty lines,

- ignore lines begun with F<#>.

F<read_conf> is a variant of F<read_csv> dedicated to configuration files. Such files consist
of lines of the form

    key = value

B<PARAMETERS>

For INPUT see F<L</read>>.  For FILTER,  FILTER-ARGS see F<L</open_input>>.

OPTIONS  can be  used to  override the  L<C<"delimiter">|/Compile Options>  regex.  For  example, a
delimiter of C<'\s+'>  splits the line at horizontal whitespace into  multiple values (with respect
of quoted strings).   For F<read_csv> the delimiter defaults to  C<'\s*,\s*'>, and for F<read_conf>
to C<'\s*=\s*'>.  See also F<L</write_csv>> and F<L</write_conf>>.

B<RESULT>

Both functions return a list of lists.  Each embedded array defines the fields in a line.

B<EXAMPLES>

Un/quoting of values happens implicitly.  Given a file F<db.conf>

    # Comment
    SERVER      = hostname
    DATABASE    = database_name
    LOGIN       = "user,password"

the call F<$opts=ReadConf(C<"db.conf">)> assigns

    [ [ 'SERVER', 'hostname' ],
      [ 'DATABASE', 'database_name' ],
      [ 'LOGIN', 'user,password' ]
    ]

The F<L</WriteConf>> function can be used to create or update the configuration:

    push @$opts, [ 'MAGIC VALUE' => 3.14_15 ];

    WriteConf('db.conf', { precision => 2 });

This writes to F<db.conf>:

    SERVER = hostname
    DATABASE = database_name
    LOGIN = "user,password"
    "MAGIC VALUE" = 3.14

=item F<read_string(INPUT)>

Calls F<L</read>>  to parse Rlist language  productions from the string  or string-reference INPUT.
When INPUT is an object do this for its F<-input> attribute.

=item F<result([SELF])>

Return  the last  result  of  calling F<L</read>>,  which  is either  F<undef>  or  some array-  or
hash-reference.  When SELF is passed as object  reference, returns the result that occured the last
time SELF had called F<L</read>>.

=item F<nanoscripts([SELF])>

In list context return an array of nanoscripts  defined by the last call to F<L</read>>.  When SELF
is passed return this information for the last time SELF had called F<L</read>>. The result has the
form:

    ( [ $hash_or_array_ref, $key_or_index ], # 1st nanoscript
      [ $hash_or_array_ref, $key_or_index ], # 2nd nanoscript
        .
        .
        .
    )

In scalar context  return a reference to the  above.  This information defines the  location of all
embedded Perl  scripts within the result,  and can be  used to F<eval> them  programmatically.  See
also F<L</result>>, F<L</evaluate_nanoscripts>>.

=item F<evaluate_nanoscripts([SELF])>

Evaluates all nanoscripts defined by the last call to F<L</read>>.  When called as method evaluates
the  nanoscripts defined  by the  last time  SELF had  called F<L</read>>.   Returns the  number of
scripts or  0 if none  were available.  Each  script is replaced by  the result of  F<eval>'ing it.
(For details and examples see L</Embedded Perl Code (Nanoscripts)>.)

=item F<messages([SELF])>

In  list context  returns  a list  of  compile-time messages  that  occurred in  the  last call  to
F<L</read>>.  In scalar context returns an array  reference.  When an package object SELF is passed
returns the information for the last time SELF had called F<L</read>>.

=item F<errors([SELF])>

=item F<warnings([SELF])>

Returns the  number of syntax errors  and warnings that occurred  in the last  call to F<L</read>>.
When called as method returns the number that occured the last time SELF had called F<L</read>>.

Example:

    use Data::Rlist;

    our $data = ReadData 'things.rls';

    if (Data::Rlist::errors() || Data::Rlist::warnings()) {
        print join("\n", Data::Rlist::messages())
    } else {
        # Ok, $data is an array- or hash-reference.
        die unless $data;

    }

=item F<broken([SELF])>

Returns the  number of times the last  F<L</compile>> violated F<L<$Data::Rlist::MaxDepth|Debugging
Data>>.  When  called  as method  returns  the  information  for  the  last time  SELF  had  called
F<L</compile>>.

=item F<missing_input([SELF])>

Returns true  when the last  call to  F<L</parse>> yielded F<undef>,  because there was  nothing to
parse.   When  called  as method  returns  the  information  for  the  last time  SELF  had  called
F<L</parse>>.

=item F<write(DATA[, OUTPUT, OPTIONS, HEADER])>

Transliterates Perl data into  Rlist text and write the text to a  file or string buffer.  F<write>
is auto-exported as F<L</WriteData>>.

B<PARAMETERS>

DATA is either an object generated by F<L</new>>,  or any Perl data including F<undef>.  In case of
an object  the actual DATA  value is defined  by its F<-data>  attribute. (When F<-data>  refers to
another Rlist object, this other object is invoked.)

OUTPUT defines the  output location, as filename, string-reference or  F<undef>.  When F<undef> the
function allocates  a string  and returns  a reference to  it.  OUTPUT  defaults to  the F<-output>
attribute when DATA defines an object.

OPTIONS  define how  to compile  DATA: when  F<undef> or  C<"fast"> uses  F<L</compile_fast>>, when
C<"perl">  uses  F<L</compile_Perl>>,  otherwise   F<L</compile>>.   Defaults  to  the  F<-options>
attribute when DATA is an object.

HEADER is  a reference to  an array of  strings that shall  be printed literally  at the top  of an
output file. Defaults to the F<-header> attribute when DATA is an object.

B<RESULT>

When F<write> creates a  file it returns 0 for failure or 1 for  success.  Otherwise it returns a
string reference.

B<EXAMPLES>

    $self = new Data::Rlist(-data => $thing, -output => $output);

    $self->write;   # Compile $thing into a file ($output is a filename)
                    # or string ($output is a string reference).

    Data::Rlist::write($thing, $output);    # dto., but using the functional interface.

=item F<write_csv(DATA[, OUTPUT, OPTIONS, COLUMNS])>

=item F<write_conf(DATA[, OUTPUT, OPTIONS, HEADER])>

Write  DATA as  comma-separated-values  (CSV) to  file  or string  OUTPUT.  F<write_conf>  writes
configuration  files where  each  line contains  a  tagname, a  separator and  a  value.

B<PARAMETERS>

DATA is either  an object, or defines the data  to be compiled as reference to  an array of arrays.
F<write_conf> uses only the first and second fields. For example,

    [ [ a, b, c ],      # fields of line 1
      [ d, e, f, g ],   # fields line 2
        .
        .
    ]



OPTIONS  specifies  the  comma-separator  (C<"separator">),  how to  quote  (C<"auto_quote">),  the
linefeed (C<"eol_space">) and the numeric precision (C<"precision">).  COLUMNS specifies the column
names to be written to the first line.  Likewise  the text from the HEADER array is written in form
of F<#>-comments at the top of an output file.

B<RESULT>

When a  file was  created both function  return 0 for  failure, or  1 for success.   Otherwise they
return a reference to the compiled text.

B<EXAMPLES>

Functional interface:

    use Data::Rlist;            # imports WriteCSV

    WriteCSV($thing, "foo.dat");

    WriteCSV($thing, "foo.dat", { separator => '; ' }, [qw/GBKNR VBKNR EL LaD/]);

    WriteCSV($thing, \$target_string);

    $string_ref = WriteCSV($thing);

Object-oriented interface:

    $object = new Data::Rlist(-data => $thing, -output => "foo.dat",
                              -options => { separator => '; ' },
                              -columns => [qw/GBKNR VBKNR EL LaD LaD_V/]);

    $object->write_csv;         # write $thing as CSV to foo.dat
    $object->write;             # write $thing as Rlist to foo.dat

    $object->set(-output => \$target_string);

    $object->write_csv;         # write $thing as CSV to $target_string

See also F<L</write>> and F<L</read_csv>>.

=item F<write_string(DATA[, OPTIONS])>

Stringify any Perl data  and return a reference to the string.   Works like F<L</write>> but always
compiles  to a  new string  to which  it  returns a  reference.  The  default for  OPTIONS will  be
L<C<"string">|/Predefined Options>.

=item F<write_string_value(DATA[, OPTIONS])>

Stringify  any  Perl  dats  and  return  the  compiled  text  string  value.   OPTIONS  default  to
L<C<"default">|/Predefined Options>.  For example,

    print "\n\$thing dumped: ", Data::Rlist::write_string_value($thing);

    $self = new Data::Rlist(-data => $thing);

    print "\nsame \$thing dumped: ", $self->write_string_value;

=item F<keelhaul(DATA[, OPTIONS])>

Do a deep copy of DATA according  to L<OPTIONS|/Compile Options>.  First the function compiles DATA
to Rlist text, then restores the data  from exactly this text.  This process is called "keelhauling
data", and allows us to

- adjust the accuracy of numbers, 

- break circular-references,

- drop F<\*foo{THING}>s,

- bring multiple data sets to the same, common basis.

It is useful (e.g.)  when  DATA had been hatched by some other code, and  you don't know whether it
is hierachical, or if typeglob-refs nist inside.  Then  keelhaul it to clean it from its past.  For
example, to bring all numbers in

    $thing = { foo => [ [ .00057260 ], -1.6804e-4 ] };

to a certain accuracy, use

    $deep_copy_of_thing = Data::Rlist::keelhaul($thing, { precision => 4 });

All number scalars in  F<$thing> are rounded to 4 decimal places,  so they're finally comparable as
floating-point numbers.  To F<$deep_copy_of_thing> is assigned the hash-reference

    { foo => [ [ 0.0006 ], -0.0002 ] }

Likewise one can convert all floats to integers:

    $make_integers = new Data::Rlist(-data => $thing, -options => { precision => 0 });

    $thing_without_floats = $make_integers->keelhaul;

When F<L</keelhaul>> is called in an array context it also returns the text from which the copy had
been built.  For example,

    $deep_copy = Data::Rlist::keelhaul($thing);

    ($deep_copy, $rlist_text) = Data::Rlist::keelhaul($thing);

    $deep_copy = new Data::Rlist(-data => $thing)->keelhaul;

B<DETAILS>

F<L</keelhaul>> won't throw F<die> nor return an error, but be prepared for the following effects:

=over

=item *

F<ARRAY>, F<HASH>, F<SCALAR> and F<REF> references were compiled, whether blessed or not.  (Since
compiling does not store type information, F<keelhaul> will turn blessed references into barbars
again.)

=item *

F<IO>, F<GLOB> and F<FORMAT> references have been converted into strings.

=item *

Depending on the compile options, F<CODE> references are invoked, deparsed back into their function
bodies, or dropped.

=item *

Depending on the compile options floats are rounded, or are converted to integers.

=item *

F<undef>'d array elements are converted into the default scalar value C<"">.

=item *

Unless F<$Data::Rlist::MaxDepth> is 0, anything deeper than F<$Data::Rlist::MaxDepth> will be
thrown away.

=item *

When the data contains objects, no special methods are triggered to "freeze" and "thaw" the
objects.

=back

See also F<L</compile>> and F<L</deep_compare>>

=back

=head2 Static Functions

=over

=item F<predefined_options([PREDEF-NAME])>

Return   are   predefined   hash-reference    of   compile   otppns.    PREDEF-NAME   defaults   to
L<C<"default">|/Predefined Options>.

=item F<complete_options([OPTIONS[, BASICS]])>

Completes OPTIONS  with BASICS, so that  all pairs not already  in OPTIONS are  copied from BASICS.
Always returns a new hash-reference, i.e., neither OPTIONS nor BASICS are modified.  Both arguments
define  hashes  or  some  L<predefined  options  name|/Predefined  Options>.   BASICS  defaults  to
L<C<"default">|/Predefined Options>.  For example,

    $options = complete_options({ precision => 0 }, 'squeezed')

merges  the  predefined  options  for  L<C<"squeezed"> text|/Predefined  Options>  with  a  numeric
precision of 0  (converts all floats to  integers).

=back

=cut

sub is_integer(\$);
sub is_number(\$);
sub is_symbol(\$);
sub is_random_text(\$);

sub read($;$$);
sub read($;$$) {
    my($input, $fcmd, $fcmdargs) = @_;

    if (ref($input) eq __PACKAGE__) {
        $input->dock(sub {
                         unless ($fcmd) {
                             $fcmd = $input->get('-filter');
                             $fcmdargs = $input->get('-filter_args');
                         }
                         $R = Data::Rlist::read($input->require(-input=>), $fcmd, $fcmdargs); # returns a reference
                         $input->set(-read_result => [$Warnings, $Errors, $Broken, $MissingInput, \@Messages]);
                         $input->set(-nanoscripts => (@NStk ? [@NStk] : undef));
                         $input->set(-result => $R);
                         $R
                     }
                    )
    } else {
        # $input is either a string (filename) or reference.
        local $| = 1 if $DEBUG;
        if ($DEBUG) {
            print STDERR "Data::Rlist::open_input($input, $fcmd, $fcmdargs)\n" if $fcmd && $fcmdargs;
            print STDERR "Data::Rlist::open_input($input, $fcmd)\n" if $fcmd && !$fcmdargs;
            print STDERR "Data::Rlist::open_input($input)\n" unless $fcmd;
        }
        return undef unless open_input($input, $fcmd, $fcmdargs);
        confess unless defined $Readstruct;
        my $data = parse();
        print STDERR "Data::Rlist::close_input() parser result = ", (defined $data) ? $data : 'undef', "\n" if $DEBUG;
        close_input();
        return $data;
    }
}

sub read_csv($;$$$);
sub read_csv($;$$$) {
    my($input, $options, $fcmd, $fcmdargs) = @_;

    if (ref($input) eq __PACKAGE__) {
        $input->dock
        (sub {
             $options ||= $input->get('options');
             $fcmd ||= $input->get('filter');
             $fcmdargs ||= $input->get('filter_args');
             $input = $input->get('input');
             Data::Rlist::read_csv($input, $options, $fcmd, $fcmdargs);
         });
    } else {
		# Call open_input, let lexln read all lines, call close_input.  $input names a file or a
		# string-ref (buffer); from both we're reading linewise.  For strings open_input does not
		# call read_csv, but splits at LF or CR+LF.  Since lexln only chomps $/ we explicitly check
		# for a trailing \r here.

        return undef unless open_input($input, $fcmd, $fcmdargs);
        confess unless defined $Readstruct;
        my $delim = complete_options($options)->{delimiter} || $DefaultCsvDelimiter;
        my @L; push @L, $Ln while lexln();
        my @R; push @R, map { [ map { maybe_unquote7($_) } split_quoted($_, $delim) ] }
        grep { not /^\s*#|^\s*$/o } # throw away comment lines and blank lines
        #map { s/\r+$//o; $_ }		# strip trailing \r
		@L;
        close_input();
        return \@R;
    }
}

sub read_conf(@) { 
    my($input, $options, $fcmd, $fcmdargs) = @_;
    $options ||= $input->get('options') if ref($input) eq __PACKAGE__;
    $options = complete_options($options) unless ref $options; # expand using predef'd set "default"
    $options->{delimiter} ||= $DefaultConfDelimiter;           # ...where "delimiter" is undef
    return read_csv($input, $options, $fcmd, $fcmdargs);
}

sub read_string($);
sub read_string($) {
    my $r = shift;
    if (defined($r) and not defined reftype($r)) {
        return read_string(\$r);
    } elsif (reftype($r) ne 'SCALAR') {
        carp 'string or string-reference required';
    } Data::Rlist::read($r);
}

sub result(;$) {
    my $self = shift;
    return $self->get(-result=>) if $self;
    return $R;
}

sub nanoscripts(;$) {
    return unless defined wantarray;
    my $self = shift;
    my $ls = $self ? $self->get(-nanoscripts=>) : \@NStk;
    return wantarray ? @$ls : $ls;
}

sub evaluate_nanoscripts(;$)
{
    my($self) = @_;
    my @ns = nanoscripts($self);
    my $root = result($self);   # this is $Data::Rlist::R or $self->{'-result'}
    my($this, $where);

    foreach my $ns (@ns) {
        $this = $ns->[0];       # list in which the nanoscript occurs
        $where = $ns->[1];      # key or index into the list
        if (ref($this) =~ 'ARRAY') {
            my $i = int($where);
            my $code = $this->[$i];
            print "$root: evaluating nanoscript $this\->[$i]:\n\t${\(escape7($code))}\n" if $DEBUG;
            $this->[$i] = eval $code;
            print "\n\tresult: $this->[$i]\n" if $DEBUG;
        } else {
            die unless ref($this) =~ 'HASH';
            my $code = $this->{$where};
            print "$root: evaluating nanoscript $this\->{$where}:\n\t${\(escape7($code))}\n" if $DEBUG;
            $this->{$where} = eval $code;
            print "\n\tresult: $this->{$where}\n" if $DEBUG;
        }
    }
    return $#ns + 1;
}

sub warnings(;$) {
    my $self = shift;
    if ($self) {
        my $a = $self->get(-read_result=>);
        return $a->[0] if ref $a;
        return 0;
    } $Warnings
}

sub errors(;$) {
    my $self = shift;
    if ($self) {
        my $a = $self->get(-read_result=>);
        return $a->[1] if ref $a;
        return 0;
    } $Errors
}

sub broken(;$) {
    my $self = shift;
    if ($self) {
        my $a = $self->get(-read_result=>);
        return $a->[2] if ref $a;
        return 0;
    } $Broken
}

sub missing_input(;$) {
    my $self = shift;
    if ($self) {
        my $a = $self->get(-read_result=>);
        return $a->[3] if ref $a;
        return 0;
    } $MissingInput
}

sub messages(;$) {
    return unless defined wantarray; # void context
    my $self = shift;
    if ($self) {
        my $a = $self->get(-read_result=>);
        return @{$a->[4]} if ref $a;
    } return wantarray ? @Messages : \@Messages
}

sub predefined_options($) {
    my $name = shift || 'default';
    carp "\nunknown compile-options '$name'" unless exists $PredefinedOptions{$name};
    $PredefinedOptions{$name};
}

sub complete_options(;$$);
sub complete_options(;$$)
{
    my($opts, $base) = (shift||'default', shift||'default');
    my $using_default = ($base eq 'default');
    $opts = predefined_options($opts) unless ref $opts;
    $base = predefined_options($base) unless ref $base;

    # Make a new hash, copy all keys not already in $opts from $base.
    $opts = { %$opts };
    $opts->{_base} = ref($base) ? 'some hash' : $base;
    while (my($k, $v) = each %$base) {
        $opts->{$k} = $v unless exists $opts->{$k}
    }

    # Finally complete $opts with "default" and return the new hash.
    $opts = complete_options($opts) unless $using_default;
    $opts
}

sub write($;$$$);
sub write($;$$$)
{
    my($data, $output) = (shift, shift);
    my($options, $header) = @_;
    local $| = 1 if $DEBUG;

    if (ref($data) eq __PACKAGE__) {
        $data->dock(sub {
						$output ||= $data->get('-output');
						$options ||= $data->get('-options');
						$header ||= $data->get('-header');
						Data::Rlist::write($data->get('-data'), $output, $options, $header) });
    } else {
        # $data is any Perl data or undef.  Reset package globals, validate $options, then compile
        # $data.

        my $to_string = ref $output || not defined $output;
        my($result, $optname, $fast, $perl);
        $options ||= ($to_string ? 'string' : 'fast');
        unless (ref $options) {
            $fast = 1 if $options eq 'fast';
            $perl = 1 if $options eq 'perl';
            $optname = "'$options'";
            $options = predefined_options($options) unless $fast || $perl;
        } else {
            $optname = "custom, based on '${\($options->{_base} || 'default')}'";
        }
        unless ($fast || $perl) {
            $options->{auto_quote} = 1 unless defined $options->{auto_quote};
        }

        unless ($to_string) {
            # Compile $data into a file named $output.  Create a new file, exclusively lock it. It
            # is guaranteed that no other process will be able to run flock(FH,2) on the same file
            # while we hold the lock. (Because the OS suspends and blocks other processes.)

            confess $output if not defined $output or ref $output; # or not_valid_pathname($output)
            my($to_stdout, $fh) = $output eq '-';
            if ($to_stdout) {
                open($fh, ">$output") or confess("\nERROR: $!");
            } else {
                (open($fh, ">$output") and flock($fh, 2)) or
                confess("\nERROR: $output: can't create and lock Rlist-file: $!");
            }

            # Build file header.  Compile $data to file $fh, return undef.  

            my $host = eval { use Sys::Hostname; hostname; } || 'some unknown machine';
            my $uid = getlogin || getpwuid($<);
            my $tm = localtime;
            my $prec; $prec = $options->{precision} if ref $options and defined $options->{precision};
            my $eol = $/; $eol = $options->{eol_space} if ref $options and defined $options->{eol_space};
            my @header = 
            map { (length) ? "# $_\n" : "#\n" }
            (($to_stdout ? () : 
              ("-*-rlist-generic-*-", "", $output, "",
               "Created $tm on <$host> by user <$uid>.",
               "Random Lists (Rlist) file (see Data::Rlist on CPAN and <http://www.visualco.de>).")),
             ((defined $prec) ? 
              sprintf('Numerical precision: fixed-point, rounded to %d decimal places.', $prec) :
              sprintf('Numerical precision: floating-point.')),
             "Compile options: $optname.", 
             ($header ? ("", @$header) : ("")));
            print $fh @header, $eol;

            unless ($fast || $perl) {
                $result = 1 if compile($data, $options, $fh);
            } else {
                # Note that we return $Data::Rlist::R here.
                $result = 1;
                print $fh ${compile_fast($data)}.$eol if $fast;
                print $fh ${compile_Perl($data)}.$eol if $perl;
            } close $fh;
        } else {
            # Compile $data into string and return a reference.  Here $output has to be undef or a
            # string-ref (buffer).
            confess $output unless not defined $output or ref $output eq 'SCALAR';
            unless ($fast || $perl) {
                $result = compile($data, $options);
                $output = $result if ref $output;
            } else {
                $result = compile_fast($data) if $fast;
                $result = compile_Perl($data) if $perl;
                $$output = $$result if ref $output; # copy it -> $result is $Data::Rlist::R
            }
        } return $result;
    }
}

sub write_csv($;$$$$);
sub write_csv($;$$$$)
{
    my($data, $output) = (shift, shift);
    my($options, $columns, $header) = @_;
    return 0 unless defined $data;

    if (ref($data) eq __PACKAGE__) {
        $data->dock(sub {
						$output ||= $data->get('-output');
						$options ||= $data->get('-options');
						$columns ||= $data->get('-columns');
						$header ||= $data->get('-header');
						Data::Rlist::write_csv($data->get('-data'), $output, $options, $columns, $header) });
    } else {
        # $data is anything.  In case of undef returns 0.  When the file could not be created,
        # dies. Otherwise returns 1.
        #
        # Unless a value looks like a number the value is quoted (strings may have commas).
        # read_csv uses split_quoted which keeps quotes and backslashes, then maybe_unquote7()s
        # each value.

        $options = complete_options($options, 'default');
        my $to_string = ref $output || not defined $output;
        my($separator, $prec, $auto_quote) = map { $options->{$_} } qw/separator precision auto_quote/;
        my $eol = $/; $eol = $options->{eol_space} if ref $options and defined $options->{eol_space}; $eol ||= "\n";
        my $result = '';
        $auto_quote = 0 unless defined $auto_quote;
        $result.= join($separator, @$columns).$eol if $columns;
        $result.= join($eol, map {
            join($separator, map { is_number($_)
                                   ? (defined($prec) ? round($_, $prec) : $_)
                                   : ($auto_quote ? maybe_quote7($_) : $_)
                               } @$_) } @$data).$eol if @$data;

        if ($to_string) {
            if (ref $output) {
                $$output = $result; return $output
            } else {
                return \$result;
            }
        } else {
            my($to_stdout, $fh) = ($output eq '-');
            local $| = 1 if $DEBUG;
            if ($to_stdout) {
                open($fh, ">$output") or confess("\nERROR: $!");
            } else {
                (open($fh, ">$output") and flock($fh, 2)) or
                confess("\nERROR: $output: can't create and lock CSV-file: $!");
            }
            print $fh $result;
            close $fh; 1
        }
    }
}

sub write_conf($;$$$$)
{
    my($data, $output, $options, $header) = @_;
    $options ||= $data->get('options') if ref($data) eq __PACKAGE__;
    my $have_sep = ref($options) && defined $options->{separator};
    $options = complete_options($options) unless ref $options;
    $options->{separator} = $DefaultConfSeparator unless $have_sep;
    return write_csv($data, $output, $options, $header);
}

sub write_string($;$) {
    my($data, $options) = (shift, shift||'string');
    my $strref;
    if (ref($data) eq __PACKAGE__) {
        my $out = $data->get('output');
        $data->set(-output => undef);
        $strref = Data::Rlist::write($data, undef, $options);
        $data->set(-output => $out);
    } else {
        $strref = Data::Rlist::write($data, undef, $options);
    } return $strref;
}

sub write_string_value($;$) {
    my($data, $options) = (shift, shift||'default');
    local $MaxDepth = $DefaultMaxDepth if $MaxDepth == 0;
    return ${Data::Rlist::write_string($data, $options)};
}

sub keelhaul($;$) {
    my($data, $options) = (shift, shift);
    carp 'Cannot keelhaul Perl data' if defined $options and $options eq 'perl'; # TODO: eval back
    $options ||= complete_options({ precision => undef }, 'squeezed');
    my $strref = Data::Rlist::write_string($data, $options);
    local $MaxDepth = $DefaultMaxDepth if $MaxDepth == 0;
    my $deep_copy = read_string($strref);
    return wantarray ? ($deep_copy, $strref) : $deep_copy;
}

=head2 Implementation Functions

=over

=item F<open_input(INPUT[, FILTER, FILTER-ARGS])>

=item F<close_input>

Open/close  Rlist text  file  or string  INPUT for  parsing.   Used internally  by F<L</read>>  and
F<L</read_csv>>.

B<PREPROCESSING>

The function  can preprocess the INPUT  file using FILTER.  Use  the special value 1  to select the
default  C preprocessor  (F<gcc  -E -Wp,-C>).   FILTER-ARGS  is an  optional  string of  additional
command-line arguments to be appended to FILTER.  For example,

    my $foo = Data::Rlist::read("foo", 1, "-DEXTRA")

eventually does not parse F<foo>, but the output of the command

    gcc -E -Wp,-C -DEXTRA foo

Hence within F<foo> now C-preprocessor-statements are allowed. For example,

    {
    #ifdef EXTRA
    #include "extra.rlist"
    #endif

        123 = (1, 2, 3);
        foobar = {
            .
            .

B<SAFE CPP MODE>

This mode uses F<sed> and a  temporary file.  It is enabled by setting F<$Data::Rlist::SafeCppMode>
to 1  (the default is  0).  It  protects single-line F<#>-comments  when FILTER begins  with either
F<gcc>, F<g++>  or F<cpp>.  F<L</open_input>>  then additionally runs  F<sed> to convert  all input
lines beginning  with whitespace plus the  F<#> character.  Only the  following F<cpp>-commands are
excluded, and only when they appear in column 1:

- F<#include> and F<#pragma>

- F<#define> and F<#undef>

- F<#if>, F<#ifdef>, F<#else> and F<#endif>.

For  all other  lines F<sed>  converts F<#>  into  F<##>.  This  prevents the  C preprocessor  from
evaluating them.   Because of Perl's  limited F<open> function,  which isn't able to  dissolve long
pipes, the invocation  of F<sed> requires a temporary  file.  The temporary file is  created in the
same directory  as the input file.   When you only use  F<//> and F</* */>  comments, however, this
read mode is not required.

=cut

sub open_input($;$$)
{
    my($input, $fcmd, $fcmdargs) = @_;
    my($rls, $filename);
    my $rtp = reftype $input;

    carp "\n${\((caller(0))[3])}: filename or scalar-ref required as INPUT" if defined $rtp && $rtp ne 'SCALAR';
    carp "\n${\((caller(0))[3])}: package locked" if $Readstruct;
    $Readstruct = $ReadFh = undef;
    local $| = 1 if $DEBUG;

    if (defined $input) {
        $Readstruct = { };
        unless (ref $input) {
            $Readstruct->{filename} = $input;
            unless ($fcmd) {	# the file is read unfiltered
                unless (open($Readstruct->{fh}, "<$input") && flock($Readstruct->{fh}, 1)) {
                    $Readstruct = undef;
                    pr1nt('ERROR', "input file '$input'", $!);
                }
            } else {			# pipe it through $fcmt
                $fcmd = "gcc -E -Wp,-C -x c++" if $fcmd == 1;
                $fcmd = "$fcmd $fcmdargs" if $fcmdargs;

                if ($SafeCppMode) {
                    if ($fcmd =~ /^(gcc|g\+\+|cpp)/i) {
                        # Filter input with sed:
                        #
                        # (1) Because known #-commands must start at column 1 we first escape all
                        #     indented '#'s into '##'s:
                        #           "(^ +)#" -> '$1\#'
                        # (2) Next we prefix the known commands with a blank, e.g.
                        #           "#if 0" -> " #if 0"
                        # (3) Finally we escape all unknown #-commands at column 1:
                        #           "^#" -> "\#"
                        #
                        # lexln will then reverse the escaping.  Since the builtin open does not
                        # support true pipes, a temporary file receives the output of sed, which is
                        # then preprocessed. The temporary file will be removed in close_input.

                        my($sedfh, $tmpfh);
                        open($sedfh,
							 "sed '".
							 join('; ', ("s/^\\([ \t][ \t]*\\)#/\\1\\\\#/", # many seds don't know \t -> insert literally
										 "s/^#\\(include\\|pragma\\|if\\|ifdef\\|else\\|endif\\|define\\|undef\\)/ #\\1/",
										 "s/^#/\\\\#/")).";' <$input 2>nul |") ||
										 die "\nERROR: input file '$fcmd': $!";
                        my($tmpinput, $i) = (undef, 0);
                        do { $tmpinput = $input.'.tmp'.$i++ } while -e $tmpinput;
                        $Readstruct->{tmpfile} = $input = $tmpinput;
                        open ($tmpfh, ">$input") || die "\nERROR: temporary file '$input': $!";
                        print $tmpfh readline($sedfh);
                        close $tmpfh;
                        close $sedfh;
                    }
                }

                # Open the file $input (or the temporary sed'd file) for preprocessing.

                unless (open($Readstruct->{fh}, "$fcmd $input 2>nul |")) {
                    $Readstruct = undef;
                    pr1nt('ERROR', "preprocessed input '$fcmd $input': $!");
                }
            }

            if (defined $Readstruct) {
                $ReadFh = $Readstruct->{fh};
                $LnArray = undef;
                $Ln = '';
            }
        } else {
            # Input is a string-ref.  It will be split into lines at LF or CR+LF.  But when it has
            # no newlines it is read as one big line.

            carp "cannot preprocess strings" if $fcmd;
            $LnArray = [ split /\r*\n/, $$input ];
            $Ln = '';
        }
    } $Readstruct
}

sub close_input()
{
	close($Readstruct->{fh}) if $Readstruct->{fh};
    if ($Readstruct->{tmpfile}) {
        unlink($Readstruct->{tmpfile}) ||
        croak "\nERROR: could not temporary file '$Readstruct->{tmpfile}': $!";
    }
    $LnArray = $Ln = $Readstruct = undef;
}

=item F<lex()>

Lexical scanner.  Called by F<L</parse>> to split  the current line into tokens.  F<lex> reads F<#>
or F<//> single-line-comment and F</* */> multi-line-comment as regular white-spaces.  Otherwise it
returns tokens according to the following table:

    RESULT      MEANING
    ------      -------
    '{' '}'     Punctuation
    '(' ')'     Punctuation
    ','         Operator
    ';'         Punctuation
    '='         Operator
    'v'         Constant value as number, string, list or hash
    '??'        Error
    undef       EOF

F<lex> appends all here-doc-lines with a newline character. For example,

        <<test1
        a
        b
        test1

is effectively read as C<"a\nb\n">, which is the same value as the equivalent here-doc in Perl has.
So, not all  strings can be encoded as a  here-doc.  For example, it might not  be quite obvious to
many programmers that C<"foo\nbar"> cannot be expressed as here-doc.

=item F<lexln()>

Read the next line of text from the current input.  Return 0 if F<L</at_eof>>, otherwise return 1.

=item F<at_eof()>

Return true if current input file/string is exhausted, false otherwise.

=item F<parse()>

Read Rlist language productions from current input.  This is a fast, non-recursive parser driven by
the  parser  map  F<%Data::Rlist::Rules>, and  fed  by  F<L</lex>>.   It  is called  internally  by
F<L</read>>.   F<parse>  returns an  array-  or  hash-reference, or  F<undef>  in  case of  parsing
F<L</errors>>.

=cut

# Local variables for lex(). Note that since lexical variables are init'd at compile-time, they're
# available in BEGIN blocks.

my $RELexNumber = qr/^($REFloatHere)/;  # number constant
my $RELexSymbol = qr/^($RESymbolHere)/; # symbolic name without quotes
my $RELexQuotedString = qr/^\"((?:\\[nrbftv\"\'\\]|\\[0-7]{3}|[^\"])*)\"/; # quoted string constant
my $RELexQuotedSymbol = qr/^"($RESymbolHere)"/; # symbolic name in quotes
my $RELexPunctuation = qr/^[$REPunctuationCharacter]/;
my $C1;

BEGIN {
    $REIsPunct[$_] = 0 foreach  0..255;
    $REIsPunct[ 61] = 1;            # =
    $REIsPunct[ 44] = 1;            # ,
    $REIsPunct[ 59] = 1;            # ;
    $REIsPunct[123] = 1;            # {
    $REIsPunct[125] = 1;            # }
    $REIsPunct[ 40] = 1;            # (
    $REIsPunct[ 41] = 1;            # )

    $REIsDigit[$_] = 0 foreach  0..255;
    $REIsDigit[$_] = 1 foreach 48.. 57;
    $REIsDigit[43] = $REIsDigit[45] = $REIsDigit[46] = 1;
}

sub lex()
{
    # First reduce leading whitespace and empty lines. Set $C1 to the ASCII code of the first
    # character in the current line $Ln.
    #
    # The Perl \s regex matches  [ \t\n\r\f], but
    #   ($C1 <= 32 && ($C1 == 32 || $C1 == 9 || $C1 == 10 || $C1 == 13 || $C1 == 12))
    # is still more efficient.  However, to make it even faster we use
    #   ($C1 <= 32)

    unless (defined $Ln) {
        return undef unless lexln(); # fetch next $Ln or stop
    }
    NEXTC1:
    unless ($C1 = ord($Ln)) { # ord returns 0 on empty strings
        return undef unless lexln();
        goto NEXTC1;
    }
    if ($C1 <= 32) {
        $Ln =~ s/^\s+//o;
        goto NEXTC1 unless $C1 = ord($Ln);
    }

    # Puncutators = , ; { } ( )

    #if ($Ln =~ $RELexPunctuation) {
    #if ($C1 == 61 || $C1 == 44 || $C1 == 59 || $C1 == 123 || $C1 == 125 || $C1 == 40 || $C1 == 41) {
    if ($REIsPunct[$C1]) {
        $Ln = substr($Ln, 1);
        return chr($C1);
    }

    # Number scalars. C language single/double-precision numbers.  Test if $C1 is a digit, '.', '-'
    # or '+'.

    #if (($C1 >= 48 && $C1 <= 57) || $C1 == 43 || $C1 == 45 || $C1 == 46) {
    if ($REIsDigit[$C1]) {
        if ($Ln =~ s/$RELexNumber//o) {
            push @VStk, $1;
            return 'v';
        } elsif (($C1 == 45 || $C1 == 46) && $Ln =~ s/$RELexSymbol//o) {
            # Symbolic name (unquoted string) beginning with '-' or '.'.
            push @VStk, $1;
            return 'v';
        } else {
            return syntax_error(qq'unrecognized number "$Ln"');
        }
    }

    # String scalars, un/quoted, here-docs.

    if ($C1 == 34) {            # "
        # String scalar, quoted. Removes the quotes and unesacpes the strings (compile adds
        # quotes).

        #if (0) {
            # BUG: the regex engine of perl 5.8.7 (Cygwin) unconditionally exits when it tried to
            # match a large quoted string, e.g. >8000 characters.  perldb provides no hint
            # why. This problem once occurred during intensive testing of this package.

            #if (length($Ln) > 1000) {
                #print STDERR "string len=".length($Ln)." val = \n\n$Ln\n\n" if $DEBUG;

                # TODO: take a precautionary approach because of bug/misbehaviors in Perl's regex
                # engine now (see above). 
            #}
        #}

#         if ($Ln =~ s/$RELexQuotedSymbol//o) { # no escape sequences
#             push @VStk, $1;
#             return 'v';
#         }

        if ($Ln =~ s/$RELexQuotedString//o) { # maybe has escape sequences
            push @VStk, unescape7($1);
            return 'v';
        } else {
            # There was no closing '"' found on this line. To recover from this error (which is
            # hard) we simply continue to fetch lines until EOF, or $RELexQuotedString happens to
            # match.  Then we return '??' instead of 'v'.

            my $Lnprev;
            syntax_error("unterminated quoted string '$Ln'");
            while (1) {
                $Lnprev = $Ln;
                unless (lexln()) {
                    syntax_error("EOF in quoted string"); last;
                }
                $Ln = $Lnprev.$Ln;
                last if $Ln =~ s/$RELexQuotedString//o;
            } return '??';
        }
    } elsif ($C1 == 60) {       # <<HERE
        if ($Ln =~ s/<<([_\w]+)//io) {
            # Fetch lines until $tok appears at top of a line.  Then continues at $rest of original
            # line. If not EOF the next call to lexln() will return the next line after the line
            # that had closed the here-doc.

            my($tok, $rest, @ln, $ok) = ($1, $Ln);
            my $nanoscript = ($tok eq $DefaultNanoscriptToken);
            while ($ok = lexln()) {
                if ($Ln =~ /^$tok\s*$/m) {
                    $Ln = $rest; last;
                } else {
                    push @ln, unescape7($Ln)
                }
            }
            unless ($ok) {
                confess unless at_eof();
                return syntax_error(qq(EOF while reading here-document '$tok'));
            } else {
                push @VStk, join("\n", @ln)."\n"; # add newline to all lines
                return $nanoscript ? 'n' : 'v';
            }
        }
    }

    # Jump over comments. '//' or '#' single-line-comment, '/*' multi-line-comment.

    if ($C1 == 35) {            # '#'
        $Ln = ''; goto NEXTC1;
    } elsif ($C1 == 47) {       # '/'
        if ($Ln =~ /^\/[\*\/]/o) {
            goto NEXTC1 if $Ln =~ s/^\/\*.*\*\/\s*//x;
            if ($Ln =~ /^\/\//o) {
                $Ln = ''; goto NEXTC1;
            }
            while (lexln()) {
                if ($Ln =~ /\*\/(.*)/) {
                    $Ln = $1; goto NEXTC1;
                }
            } return syntax_error(qq(unterminated comment));
        }
    }

    # Must be a symbolic name (unquoted string). Names are printable and hence have no \NNN
    # sequences.  (Finally applies a regex.)

    if ($Ln =~ s/$RELexSymbol//o) {
        push @VStk, $1;
        return 'v';
    }

    # Unrecognized character, e.g. '*', single '<', '\''.

    die "\n".syntax_error(qq(unrecognized character-code $C1).' '.chr($C1));
}

sub at_eof() {
    if ($ReadFh) {
        return eof($ReadFh);
    } elsif (defined $LnArray && $#$LnArray != -1) {
        return 0
    } else {
        return 1                # $LnArray undef'd or empty
    }
}

sub lexln() {
    # Called from lex to parse Rlist files, and from read_csv.

    if ($ReadFh && !eof($ReadFh)) { # eof(undef) and eof(0) are 1
        $Ln = readline($ReadFh); chomp $Ln; # strips $/
        $Ln =~ s/^([ \t]*)\\#/$1#/o if $SafeCppMode;
        #print "$Ln\n";
        return 1;
    } elsif (defined $LnArray && $#$LnArray != -1) {
        # Read from string.
        $Ln = shift @$LnArray;
        return 1;
    }
    $Ln = undef;
    return 0;
}

sub parse()
{
    my($q, $t, $m, $r, $l) = ('');
    $Warnings = $Errors = $MissingInput = $Broken = 0;
    @Messages = @VStk = @NStk = ();

    while (defined($t = lex())) {
        # Push new token, then reduce as many rules as possible from the tail of the queue before
        # fetching more tokens.  Longer rules are matched first.  The constants 2 and 4 are the
        # min./max. lengths of rules in %Rules. When $l (the current length of $m) is <2 no rule
        # can be matched.

        if (1) {
            $q .= $t;
            while (($l = length($q)) >= 2) {
                if ($r = $Rules{substr($q, -4)}) {
                    substr($q, -4) = $r->();
                } elsif ($r = $Rules{substr($q, -3)}) {
                    substr($q, -3) = $r->();
                } elsif ($r = $Rules{substr($q, -2)}) {
                    substr($q, -2) = $r->();
                } else { last }   # fetch another token
            }                     # match another rule
        } else {
            # The above loop is ca. 10% faster than the second, so this one is disabled (although
            # working).  We expect the if(1/0) blocks to be neutralized by the byte-compiler.

            $l = length($q .= $t);
            while ($l >= 2) {
                $l = 4 if $l > 4;
                $m = substr($q, -$l);

                while (1) {			  # TODO: last if $m begins with [=,;})]
                    if ($Rules{$m}) { # can reduce a rule $m
                        printf STDERR "%20s\treducing  $m\n", $q if $DEBUG;
                        substr($q, -$l) = $Rules{$m}->();
                        $l = length $q; last;
                    } else {
                        # $m is not a matching rule.  Cut the first character from $m and try
                        # matching it.
                        #
                        # Note that to uickly remove the first character from a string is
                        # surprisingly hard in Perl. All of the following work:
                        #
                        #   $m = unpack('x1A'.$l, $m)
                        #   $m = substr($m, 1)      # fastest
                        #   substr($m, 0, 1) = ''

                        printf STDERR "%20s\tno rule   $m\n", $q if $DEBUG && $l > 1;
                        last if --$l < 2;
                        $m = substr($m, 1);
                    }
                } last if $Errors;
            }
        }
    }

    # Parser has finished, EOF has been reached (lex had returned undef). The token queue has now
	# been reduced to one token and @VStk only contains its value. The token 'h' (hash) or 'l'
	# (list). Because of the parser map nature it could also be 'v' (value), in which case it shall
	# decay into a hash or list.

    return undef if $Errors;

	print STDERR qq'Data::Rlist::parse() reached EOF with "$q"\n' if $DEBUG;
	if (@VStk == 0) {
		croak STDERR "unexpected, supernumeray tokens after parsing:\n\t$q\n" if $DEBUG && $q;
		$MissingInput = 1;		# empty input or non-existing file
		return undef;
	} else {
		if (@VStk > 1) {
			pr1nt('ERROR', qq'broken input', qq'expected "l" (list) or "h" (hash), not "$q"');
			my @overproduced = map { ref($_) ? $_ : Data::Rlist::quote7($_) } @VStk;
			for (my $i = 0; $i <= $#overproduced; ++$i) {
				warning(sprintf("cancelling overbilled value [%u] %s", $i, $overproduced[$i]));
			}
			print STDERR qq'Data::Rlist::parse() returns undef\n' if $DEBUG;
			return undef;
		} elsif (not defined $VStk[0]) {
			confess				# dto.
		} elsif ($q eq 'v') {
			my $rtp = reftype $VStk[0]; # result type
			unless (defined $rtp) {
				$VStk[0] = { $VStk[0] => undef } # not a reference -> the input is just one scalar
			} elsif ($rtp !~ /(?:HASH|ARRAY)/) {
				confess quote7($VStk[0]) # shall be an array/hash-reference
			}
		}
	}

	print STDERR "Data::Rlist::parse() returns $VStk[0]\n" if $DEBUG;
	return pop @VStk;
}

=item F<compile(DATA[, OPTIONS, FH])>

Build  Rlist text  from DATA:

=over

=item *

Reference-types F<SCALAR>, F<HASH>, F<ARRAY> and F<REF> are compiled into text, whether blessed or
not.

=item *

Reference-types F<CODE> are compiled depending on the L<C<"code_refs">|/Compile Options> setting in
OPTIONS.

=item *

Reference-types F<GLOB> (L<typeglob-refs|/A Short Story  of Typeglobs>), F<IO> and F<FORMAT> (file-
and  directory  handles) cannot  be  dissolved,  and are  compiled  into  the strings  C<"?GLOB?">,
C<"?IO?"> and C<"?FORMAT?">.

=item *

F<undef>'d values in arrays are compiled into the default Rlist C<"">.

=back

When FH is defined compile directly to this file and return 1.  Otherwise build a string and return
a reference  to it.  This is  the compilation function called  when the OPTIONS  argument passed to
F<L</write>> is not omitted, and is not C<"fast"> or C<"perl">.

=item F<compile_fast(DATA)>

Build Rlist  text from  DATA, as  fast as actually possible  with pure Perl:

=over

=item *

Reference-types F<SCALAR>, F<HASH>, F<ARRAY> and F<REF> are compiled into text, whether blessed or
not.

=item *

F<CODE>, F<GLOB>, F<IO> and F<FORMAT> are compiled into the strings C<"?CODE?">, C<"?IO?">,
C<"?GLOB?"> and C<"?FORMAT?">.  

=item *

F<undef>'d values in arrays are compiled into the default Rlist C<"">.

=back

F<L</compile_fast>> is  the default compilation  function. It is  called when you pass  F<undef> or
C<"fast">  in  place of  the  OPTIONS  parameter  (see F<L</write>>,  F<L</write_string>>).   Since
F<L</compile_fast>>  considers no  compile options  it will  not call  code, round  numbers, detect
self-referential data etc.  Also F<L</compile_fast>> always compiles into a unique package variable
to which it returns a reference.

=item F<compile_Perl(DATA)>

Like F<L</compile_fast>>,  but do not compile  Rlist text - compile  DATA into Perl  syntax. It can
then  be F<eval>'d.   This renders  more compact,  and more  exact output  as  L<Data::Dumper>. For
example, only  strings are quoted.  To  enable this compilation  function pass C<"perl"> to  as the
OPTIONS argument, or set the F<-options> attribute of package objects to this string.

=back

=cut

our($Datatype, $K, $V);
our($Outline_data, $Outline_hashes, $Code_refs, $Here_docs, $Auto_quote, $Precision);
our($Eol_space, $Paren_space, $Bol_tabs, $Comma_punct, $Semicolon_punct, $Assign_punct);

sub compile($;$$)
{
    my($data, $result) = shift;
    my $options = complete_options(shift);

    local($Fh, $Depth, $Broken) = (shift, -1, 0);
    local $RoundScientific = 1 if $options->{scientific};
    local($Eol_space, $Paren_space, $Bol_tabs, 
          $Comma_punct, $Semicolon_punct, $Assign_punct) = map { $options->{$_} }
          qw/eol_space paren_space bol_tabs 
             comma_punct semicolon_punct assign_punct/;

    local($Outline_data, $Outline_hashes,
          $Code_refs, $Here_docs, $Auto_quote, $Precision) = map { $options->{$_} }
          qw/outline_data outline_hashes
             code_refs here_docs auto_quote precision/;

    $Eol_space = $/ unless defined $Eol_space;

    return compile1($data) unless $Fh; # return string-reference
    return compile2($data);     # return 1
}

sub comptab($) {
    return '' if $Bol_tabs == 0; # no indentation
    return chr(9) x ($Bol_tabs * ($Depth + $_[0])); # use physical TABs
}

sub compval($) {
    # Compile a scalar value (number or string, but not a reference).
    #
    # TODO: to gain more speed, in compile create a specialized sub depending on globals
    # $Precision, $Here_docs.

    my $v = shift;
    if (defined $v) {
        if ($v !~ $REValue) {
            # Not an identifier, number or quoted string.  Hence $v will be quoted, and maybe as
            # here-doc.
            if ($Here_docs) {
                if ($v =~ /\n.*\n\z/os) {
                    # Here-docs enabled and $v qualifies.  We can write only strings with at least
                    # two LFs as here-docs (although a final LF would be sufficient).  Now find a
                    # token that doesn't interfere with the text: "___", "HERE", "HERE0", "HERE1"
                    # etc.

                    my @ln = split /\n/, $v;
                    my $tok = '___';
                    while (1) {
                        last unless grep { /^$tok/ } @ln;
                        if ($tok =~ /\d\z/) {
                            $tok++
                        } else {
                            $tok = $tok !~ 'HERE' ? 'HERE' : 'HERE0'
                        }
                    } $v = join('', map { "$_\n" } ("<<$tok", (map { escape7($_) } @ln), $tok));
                } else {
                    $v = quote7($v)
                }
            } else {
                $v = quote7($v)
            }
        } elsif (ord($v) != 34) {
            # Not already quoted.  Either $v is a number or a symbolic name.
            if ($Auto_quote) {
                if ($v =~ $REFloat) {
                    $v = round($v, $Precision) if defined $Precision;
                } else {
                    die $v unless $v =~ $RESymbol;
                    $v = qq("$v");
                }
            } elsif (defined $Precision) {
                $v = round($v, $Precision) if $v =~ $REFloat;
            }
        }
    } $v
}

sub compile1($);
sub compile1($)
{
    # Compile Perl data structure $data into some Rlist and return a string reference.

    my $data = shift;
    my($r, $inl, $k, $v);

    if (ref $data) {
        $Datatype = ord reftype $data;
        $Depth++;
        if ($MaxDepth >= 1 && $MaxDepth < $Depth) {
            pr1nt('ERROR', "compile1() broken in deep $data (max-depth = $MaxDepth)") unless $Broken++;
            $r = DEFAULT_VALUE
        } elsif ($Datatype == 65) { # 65 => 'A' => 'ARRAY'
            my $cnt = @$data;
            unless ($cnt) {
                $r = '('.$Paren_space.')';
            } elsif ($Outline_data > 0 && $Outline_data <= $cnt) {
                # List has more than $Outline_data number of configured elements; print each
                # element on a separate line.

                my($pref0, $pref) = (comptab(0), comptab(1));
                $r.= $Eol_space.$pref0.'('.$Eol_space.$pref;

                # BUG: for some strange reason it destroys $data if assigning the result of the
                # recursive compile1() call to $v again.  Perl 5.8.6,
                # cygwin-thread-multi-64int. Solution: assign temporarily to $w.

                my $w;
                foreach $v (@$data) {
                    $w = ${compile1($v)};
                    $r.= $Comma_punct.$Eol_space.$pref if $inl; $inl = 1;
                    $r.= $w;
                }
                $r.= $Eol_space.$pref0.')';
            } else {
                # Print all entries to one line.

                my $w;
                $r.= '('.$Paren_space;
                foreach $v (@$data) {
                    $w = ${compile1($v)};
                    $r.= $Comma_punct if $inl; $inl = 1;
                    $r.= $w;
                }
                $r.= $Paren_space if $inl;
                $r.= ')';
            }
        } elsif ($Datatype == 72) { # 72 => 'H' => 'HASH'
            my @keys = sort keys %$data;
            unless (@keys) {
                $r = '{'.$Paren_space.'}';
            } else {
                my $manykeys = $Outline_data && @keys;
                my($pref0, $pref) = (comptab(0), comptab(1));
                foreach $k (@keys) {
                    $v = $data->{$k};
                    unless ($inl) { # prepare first pair
                        $r.= $Eol_space.$pref0 if $Outline_hashes && $manykeys;
                        $r.= '{'.$Paren_space;
                        $r.= $Eol_space if $manykeys; $inl = 1;
                    }
                    $k = $pref.(($k !~ $REValue) ? quote7($k) : $k);
                    unless (defined($v)) {
                        $r.= $k.$Semicolon_punct.$Eol_space; # value is undef
                    } else {
                        $v = ${compile1($v)};
                        $r.= $k.$Assign_punct.$v.$Semicolon_punct.$Eol_space;
                    }
                }
                $r.= $pref0 if $manykeys;
                $r.= '}';
                $r.= $Eol_space unless $Depth;
            }
        } elsif ($Datatype == 82) { # 82 => 'R' => 'REF'
            $r.= ${compile1($$data)}
        } elsif ($Datatype == 83) { # 83 => 'S' => 'SCALAR'
            $r.= compval($$data);
        } elsif ($Datatype == 67) { # 67 => 'C' => 'CODE'
            $r.= $Code_refs ? ${compile1($data->())} :  '"?CODE?"'
        } else {                # other reference: 'IO', 'GLOB' or 'FORMAT'
            $r.= compval('?'.reftype($data).'?')
        }
        $Depth--;
    } elsif (defined $data) {   # $data is some scalar (not a ref)
        $r = compval($data);
    } else {                    # $data is undefined
        $r = DEFAULT_VALUE
    } \$r;
}

sub compile2($);
sub compile2($)
{
    # Compile Perl data structure $data into some Rlist and directly print into file handle $Fh (do
    # not compile a big string such as compile1() does).
    #
    # WARNING: this must be merely a copy of the compile1() code.

    my $data = shift;
    my($inl, $k, $v);

    if (ref $data) {
        $Datatype = ord reftype $data;
        $Depth++;
        if ($MaxDepth >= 1 && $MaxDepth < $Depth) {
            pr1nt('ERROR', "compile2() broken in deep $data (depth = $Depth, max-depth = $MaxDepth)") unless $Broken++;
            print $Fh "\n", DEFAULT_VALUE;
        } elsif ($Datatype == 65) { # 65 => 'A' => 'ARRAY'
            my $cnt = 1 + $#$data;
            unless ($cnt) {
                print $Fh '('.$Paren_space.')';
            } elsif ($Outline_data > 0 && $Outline_data <= $cnt) {
                # List has more than the number of configured elements; print each element on a
                # separate line.

                my($pref0, $pref) = (comptab(0), comptab(1));
                print $Fh $Eol_space.$pref0.'('.$Eol_space.$pref;
                foreach $v (@$data) {
                    print $Fh $Comma_punct.$Eol_space.$pref if $inl; $inl = 1;
                    compile2($v);
                }
                print $Fh $Eol_space.$pref0.')';
                print $Fh $Eol_space unless $Depth;
            } else {			# print all entries to one line
                print $Fh '('.$Paren_space;
                foreach $v (@$data) {
                    print $Fh $Comma_punct if $inl; $inl = 1;
                    compile2($v);
                }
                print $Fh $Paren_space if $inl;
                print $Fh ')';
            }
        } elsif ($Datatype == 72) { # 72 => 'H' => 'HASH'
            my @keys = sort keys %$data;
            unless( @keys ) {
                print $Fh '{'.$Paren_space.'}';
            } else {
                my $manykeys = $Outline_data && @keys;
                my($pref0, $pref) = (comptab(0), comptab(1));
                foreach $k (@keys) {
                    $v = $data->{$k};
                    unless ($inl) {
                        print $Fh $Eol_space.$pref0 if $Outline_hashes && $manykeys;
                        print $Fh '{'.$Paren_space;
                        print $Fh $Eol_space if $manykeys; $inl = 1;
                    }
                    $k = $pref.(($k !~ $REValue) ? quote7($k) : $k);
                    unless (defined($v)) {
                        print $Fh $k.$Semicolon_punct.$Eol_space; # value is undef
                    } else {
                        print $Fh $k.$Assign_punct;
                        compile2($v);
                        print $Fh $Semicolon_punct.$Eol_space;
                    }
                }
                print $Fh $pref0 if $manykeys;
                print $Fh '}';
                print $Fh $Eol_space unless $Depth;
            }
        } elsif ($Datatype == 82) { # 82 => 'R' => 'REF'
            compile2($$data)
        } elsif ($Datatype == 83) { # 83 => 'S' => 'SCALAR'
            print $Fh compval($$data);
        } elsif ($Datatype == 67) { # 67 => 'C' => 'CODE'
            if ($Code_refs) {
                compile2($data->())
            } else {
                print $Fh '"?CODE?"'
            }
        } else {                # other reference: 'IO', 'GLOB' or 'FORMAT'
            print $Fh compval('?'.reftype($data).'?')
        }
        $Depth--;
    } elsif (defined $data) {   # $data is some scalar (not a ref)
        print $Fh compval($data);
    } else {                    # $data is undefined
        print $Fh DEFAULT_VALUE;
    } 1
}

sub compile_fast($)
{
    my $data = shift;
    $R = ''; $Depth = -1;       # reset result string
    compile_fast1($data); # return a string reference
    return \$R; # reference to the package-variable $Data::Rlist::R
}

sub compile_fast1($);
sub compile_fast1($)
{
    # Undefined values always are compiled into the default Rlist, the empty string.
    #
    # ord() returns 0 when reftype is undef, which it is for scalars.  For any reference, blessed
    # or not, reftype returns "HASH", "ARRAY", "CODE" or "SCALAR".  The $Datatype approach is
    # significantly faster than testing whether ref($data)=~'ARRAY' etc.

    my $data = $_[0];

    if (ref $data) {
        $Datatype = ord reftype $data;
        $Depth++;
        if ($Datatype == 65) {  # 65 => 'A' => 'ARRAY'
            # Open arrays in lines of their own, like we do also with hashes. The approach is fast
            # and compiles legible text.  Lists of lists (matrices) then look nice.

            if (@$data) {
                $R.= chr(10).(chr(9) x $Depth).'(';
                my $in = 0;
                foreach (@$data) {
                    unless ($in) { $in = 1 } else { $R.= ', ' }
                    if (defined) {
                        if (ref) {
                            compile_fast1($_)
                        } else {
                            $R.= $_ !~ $REValue ? quote7($_): $_
                        }
                    } else { $R.= DEFAULT_VALUE }
                } $R.= ')';
            } else { $R .= '()' }
        } elsif ($Datatype == 72) {   # 72 => 'H' => 'HASH'
            if (%$data) {
                my $pref = chr(9) x $Depth;

                # Sorting is slightly slower than
                #       while (($K, $V) = each %$data)
                # but produces nicer results.  Note also that calling is_random_text is generally
                # faster than to always quote.

                $R.= "{\n";
                foreach $K (sort keys %$data) {
                    $V = $data->{$K};
                    $K = quote7($K) if $K !~ $REValue;
                    $R.= $pref.chr(9).$K;
                    if (defined $V) {
                        $R.= ' = ';
                        if (ref $V) {
                            compile_fast1($V);
                        } else {
                            $V = quote7($V) if $V !~ $REValue;
                            $R.= $V;
                        }
                    } $R.= ";\n";
                } $R.= $pref.'}';
            } else {
                $R.= '{}'
            }
        } elsif ($Datatype == 82) { # 82 => 'R' => 'REF'
            compile_fast1($$data)
        } elsif ($Datatype == 83) { # 83 => 'S' => 'SCALAR'
            $R.= ($$data !~ $REValue) ? quote7($$data) : $$data;
        } else {                # other reference: 'CODE', 'IO', 'GLOB' or 'FORMAT'
            $R.= '"?'.reftype($data).'?"'
        }
        $Depth--;
    } elsif (defined $data) {   # number or string
        $R.= ($data !~ $REValue) ? quote7($data) : $data;
    } else {                    # undef
        $R.= DEFAULT_VALUE;
    }
}

sub compile_Perl($)
{
    my $data = shift;
    $R = ''; $Depth = -1;       # reset result string
    compile_Perl1($data);
    return \$R;
}

sub compile_Perl1($);
sub compile_Perl1($)
{
    my $data = $_[0];
    sub __quote7($) {
        my $s = shift;
        return $s if $s =~ /^["']/;
        return quote7($s);
    }

    if (ref $data) {
        $Datatype = ord reftype $data;
        $Depth++;
        if ($Datatype == 65) {
            if (@$data) {
                $R.= chr(10).(chr(9) x $Depth).'[';
                my $in = 0;
                foreach (@$data) {
                    unless ($in) { $in = 1 } else { $R.= ', ' }
                    if (defined) {
                        if (ref) {
                            compile_Perl1($_)
                        } else {
                            $R.= is_number($_) ? $_ : __quote7($_)
                        }
                    } else { $R.= DEFAULT_VALUE }
                } $R.= ']';
            } else { $R .= '[]' }
        } elsif ($Datatype == 72) {
            if (%$data) {
                my $pref = chr(9) x $Depth;
                $R.= "{\n";
                foreach $K (sort keys %$data) {
                    $V = $data->{$K};
                    $K = __quote7($K) unless is_number($K);
                    $R.= $pref.chr(9).$K;
                    if (defined $V) {
                        $R.= ' => ';
                        if (ref $V) {
                            compile_Perl1($V);
                        } else {
                            $V = __quote7($V) unless is_number($V);
                            $R.= $V;
                        }
                    } $R.= ",\n";
                } $R.= $pref.'}';
            } else {
                $R.= '{}'
            }
        } elsif ($Datatype == 82) {
            compile_Perl1($$data)
        } elsif ($Datatype == 83) {
            $R.= is_number($data) ? $$data : __quote7($$data);
        } else {
            $R.= '"?'.reftype($data).'?"'
        }
        $Depth--;
    } elsif (defined $data) {   # number or string
        $R.= is_number($data) ? $data : __quote7($data);
    } else {                    # undef
        $R.= DEFAULT_VALUE;
    }
}

=head2 Auxiliary Functions

The  utility  functions in  this  section  are generally  useful  when  handling stringified  data.
Internally  F<L</quote7>>, F<L</escape7>>,  F<L</is_integer>>  etc. apply  precompiled regexes  and
precomputed    ASCII     tables.     F<L</split_quoted>>    and     F<L</parse_quoted>>    simplify
L</Text::ParseWords>.   F<L</round>>  and F<L</equal>>  are  working  solutions for  floating-point
numbers.   F<L</deep_compare>>  is a  smart  function  to "diff"  two  Perl  variables.  All  these
functions are very fast and mature.

=over

=item F<is_integer(SCALAR-REF)>

Returns  true when  a scalar  looks like  a positive  or negative  integer constant.   The function
applies the compiled regex F<$Data::Rlist::REInteger>.

=item F<is_number(SCALAR-REF)>

Test for strings  that look like numbers. F<is_number>  can be used to test whether  a scalar looks
like  a  integer/float  constant  (numeric  literal).   The function  applies  the  compiled  regex
F<$Data::Rlist::REFloat>.  Note that it doesn't match

- leading or trailing whitespace,

- lexical conventions such as the C<"0b"> (binary), C<"0"> (octal), C<"0x"> (hex) prefix to denote
  a number-base other than decimal, and

- Perls' legible numbers, e.g. F<3.14_15_92>,

- the IEEE 754 notations of Infinite and NaN.

See also

    $ perldoc -q "whether a scalar is a number"

=item F<is_symbol(SCALAR-REF)>

Test for symbolic names.   F<is_symbol> can be used to test whether a  scalar looks like a symbolic
name.   Such strings  need not  to be  quoted.  Rlist  defines symbolic  names as  a superset  of C
identifier names:

    [a-zA-Z_0-9]                    # C/C++ character set for identifiers
    [a-zA-Z_0-9\-/\~:\.@]           # Rlist character set for symbolic names

    [a-zA-Z_][a-zA-Z_0-9]*                  # match C/C++ identifier
    [a-zA-Z_\-/\~:@][a-zA-Z_0-9\-/\~:\.@]*  # match Rlist symbolic name

For example, names such as F<std::foo>, F<msg.warnings>, F<--verbose>, F<calculation-info> need not
be quoted.

=item F<is_value(SCALAR-REF)>

Returns true when a scalar is an integer, a number, a symbolic name or some quoted string.

=item F<is_random_text(SCALAR-REF)>

The opposite of F<L</is_value>>.  Such scalars will be turned into quoted strings by F<L</compile>>
and F<L</compile_fast>>.

=cut

sub is_integer(\$) { ${$_[0]} =~ $REInteger ? 1 : 0 }
sub is_number(\$) { ${$_[0]} =~ $REFloat ? 1 : 0 }
sub is_symbol(\$) { ${$_[0]} =~ $RESymbol ? 1 : 0 }
sub is_value(\$) { ${$_[0]} =~ $REValue ? 1 : 0 }
sub is_random_text(\$) { ${$_[0]} =~ $REValue ? 0 : 1 }

=item F<quote7(TEXT)>

=item F<escape7(TEXT)>

Converts TEXT into 7-bit-ASCII.  All characters not in the set of the 95 printable ASCII characters
are escaped.  The following  ASCII codes will be converted to escaped  octal numbers, i.e. 3 digits
prefixed by a slash:

    0x00 to 0x1F
    0x80 to 0xFF
    " ' \

The  difference  between  the  two  functions  is that  F<quote7>  additionally  places  TEXT  into
double-quotes.    For  example,   F<quote7(qq'"FrE<uuml>her  Mittag\n"')>   returns  C<"\"Fr\374her
Mittag\n\"">, while F<escape7> returns C<\"Fr\374her Mittag\n\">

=item F<maybe_quote7(TEXT)>

Return F<quote7(TEXT)> if  F<L</is_random_text>(TEXT)>; otherwise (TEXT defines a  symbolic name or
number) return TEXT.

=item F<maybe_unquote7(TEXT)>

Return F<unquote7(TEXT)> when TEXT is enclosed by double-quotes; otherwise returns TEXT.

=item F<unquote7(TEXT)>

=item F<unescape7(TEXT)>

Reverses what F<L</quote7>> and F<L</escape7>> did with TEXT.

=item F<unhere(HERE-DOC-STRING[, COLUMNS, FIRSTTAB, DEFAULTTAB])>

Combines  recipes   1.11  and   1.12  from   the  Perl  Cookbook.    HERE-DOC-STRING  shall   be  a
L<here-document|/Numbers,  Strings and  Here-Documents>.   The function  checks  whether each  line
begins with  a common prefix,  and if so,  strips that off.   If no prefix  it takes the  amount of
leading whitespace found the first line and removes that much off each subsequent line.

Unless  COLUMNS  is defined  returns  the  new here-doc-string.  Otherwise,  takes  the string  and
reformats it into  a paragraph having no line  more than COLUMNS characters long.  FIRSTTAB will be
the indent  for the first  line, DEFAULTTAB  the indent for  every subsequent line.  Unless passed,
FIRSTTAB and DEFAULTTAB default to the empty string C<"">.

=cut

our(%g_nonprintables_escaped,   # keys are non-printable ASCII chars, values are escape sequences
    %g_escaped_nonprintables,   # keys are escaped sequences, values are the non-printables
    $REnonprintable,
    $REescape_seq);

BEGIN {
    # Perl should not use/require the same module twice. However, the die below may throw when
    # Rlist.pm is symlinked. (This is a mature package, and we experienced many scenarios with it
    # so far.)  For example, when Rlist.pm is installed locally to ~/bin and ~/bin is in @INC, one
    # can say
    #       use Rlist;
    # to read the package Data::Rlist.  But in order to
    #       use Data::Rlist;
    # as with the regularily installed version (from CPAN), one must create ~/bin/Data/Rlist.pm.
    # If this is a symlink to ~/bin/Rlist.pm the same file might be used twice by perl.

    croak "${\(__FILE__)} used/required twice" if %g_escaped_nonprintables;

    # Tabulate octalization. In previous versions escape7() was implemented so
    #
    #   sub _octl {
    #       $n = ord($1);
    #       '\\'.($n >> 6).(($n >> 3) & 7).($n & 7);
    #   }
    #   s/([\x00-\x1F\x80-\xFF])/_octl()/ge # non-printables => \NNN
    #
    # which has now been optimized into
    #
    #   s/$REnonprintable/$g_nonprintables_escaped{$1}/go

    sub escape_char($) {
        my $c = ord($_[0]);						 # get number code, eg. 'ü' => 252
        '\\'.($c >> 6).(($c >> 3) & 7).($c & 7); # eg. 252 => \374
    }

    sub unescape_char($) {      # w/o leading backslash
        pack('C', oct($_[0]));  # deoctalize eg. 11 => 9 => \t
    }

    $REescape_seq = qr/\\([0-7]{1,3}|[nrt"'\\])/;
    $REnonprintable = qr/([\x00-\x1F\x80-\xFF"'])/;

    # Build tables for non-printable ASCII chararacters.

    %g_nonprintables_escaped = map { chr($_) => escape_char(chr($_)) } (0x00..0x1F, 0x80..0xFF);
    my @v = values %g_nonprintables_escaped;
    foreach (@v) {
        s/^\\// or die;
        croak $_ if exists $g_escaped_nonprintables{$_};
        $g_escaped_nonprintables{$_} = unescape_char($_)
    }

    croak unless keys(%g_nonprintables_escaped) == (255 - 95);
    croak join("  ", keys %g_escaped_nonprintables) unless keys(%g_escaped_nonprintables) == (255 - 95);
    #croak sort keys %g_escaped_nonprintables;

    # Add \ " ' into the tables, which spares another s// call in escape and unescape for them. The
    # leading \ is alredy matched by $REescape_seq.

    $g_nonprintables_escaped{chr(34)} = qq(\\"); # " => \"
    $g_nonprintables_escaped{chr(39)} = qq(\\'); # ' => \'

    $g_escaped_nonprintables{chr(34)} = chr(34);
    $g_escaped_nonprintables{chr(39)} = chr(39);
    $g_escaped_nonprintables{chr(92)} = chr(92);

    # Add \r, \n and \t.

    if (1) {
        $g_nonprintables_escaped{chr( 9)} = qq(\\t); # \t => \\t
        $g_nonprintables_escaped{chr(10)} = qq(\\n); # \n => \\n
        $g_nonprintables_escaped{chr(13)} = qq(\\r); # \r => \\r

        $g_escaped_nonprintables{'t'} = chr( 9);
        $g_escaped_nonprintables{'n'} = chr(10);
        $g_escaped_nonprintables{'r'} = chr(13);
    }
}

sub maybe_quote7($) { is_random_text($_[0]) ? quote7($_[0]) : $_[0] }
sub maybe_unquote7($) { ord($_[0]) == 34 ? unquote7($_[0]) : $_[0] }
sub quote7($) {
    # Escape, then add quotes. Note that the below expression is faster than qq.
    '"'.escape7($_[0]).'"'
}

sub unquote7($) {
    # First remove quotes, then unescape. The below expression might look complicated; but it is
    # faster than to shift the string and apply s/^\"// and s/\"$// on it.
    unescape7(ord($_[0]) == 34 ? substr($_[0], 1, length($_[0]) - 2) : $_[0])
}

sub escape7($) {
    my $s = shift; return '' unless defined $s;
    $s =~ s/\\/\\\\/g;                                        # has to happen first, because...
    $s =~ s/$REnonprintable/$g_nonprintables_escaped{$1}/gos; # ...this will intersperse more backslashes
    $s
}

sub unescape7($) {
    my $s = shift;
    $s =~ s/$REescape_seq/$g_escaped_nonprintables{$1}/gos;
    $s
}

sub unhere($;$$$) {
    # Combines recipes 1.11 and 1.12.
    local $_ = shift;
    my($white, $leader);        # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;

    # This is recipe 1.12
    my($columns, $firsttab, $deftab) = (shift, shift||'', shift||'');
    if ($columns) {
        use Text::Wrap;
        $Text::Wrap::columns = $columns;
        return wrap($firsttab, $deftab, $_);
    } else {
        return $_;
    }
}

=item F<split_quoted(INPUT[, DELIMITER])>

=item F<parse_quoted(INPUT[, DELIMITER])>

Divide the string INPUT into a list of strings.  DELIMITER is a regular expression specifying where
to split (default: C<'\s+'>).  The functions won't  split at DELIMITERs inside quotes, or which are
backslashed.

F<parse_quoted> works like F<split_quoted> but  additionally removes all quotes and backslashes
from   the   splitted   fields.    Both   functions   effectively   simplify   the   interface   of
F<Text::ParseWords>.  In an array context they return  a list of substrings, otherwise the count of
substrings.    An  empty   array   is  returned   in   case  of   unbalanced  double-quotes,   e.g.
F<split_quoted(C<'foo,"bar'>)>.

B<EXAMPLES>

    sub split_and_list($) {
        print ($i++, " '$_'\n") foreach split_quoted(shift)
    }

    split_and_list(q("fee foo" bar))

        0 '"fee foo"'
        1 'bar'

    split_and_list(q("fee foo"\ bar))

        0 '"fee foo"\ bar'

The  default   DELIMITER  C<'\s+'>  handles   newlines.   F<split_quoted(C<"foo\nbar\n">)>  returns
S<F<('foo', 'bar',  '')>> and hence can  be used to to  split a large string  of unF<chomp>'d input
lines into words:

    split_and_list("foo  \r\n bar\n")

        0 'foo'
        1 'bar'
        2 ''

The DELIMITER matches everywhere outside of quoted constructs, so in case of the default C<'\s+'>
you may want to remove heading/trailing whitespace. Consider

    split_and_list("\nfoo")
    split_and_list("\tfoo")

        0 ''
        1 'foo'

and

    split_and_list(" foo ")

        0 ''
        1 'foo'
        2 ''

F<parse_quoted> additionally removes all quotes and backslashes from the splitted fields:

    sub parse_and_list($) {
        print ($i++, " '$_'\n") foreach parse_quoted(shift)
    }

    parse_and_list(q("fee foo" bar))

        0 'fee foo'
        1 'bar'

    parse_and_list(q("fee foo"\ bar))

        0 'fee foo bar'

B<MORE EXAMPLES>

String C<'field\ one  "field\ two"'>:

    ('field\ one', '"field\ two"')  # split_quoted
    ('field one', 'field two')      # parse_quoted

String C<'field\,one, field", two"'> with a DELIMITER of C<'\s*,\s*'>:

    ('field\,one', 'field", two"')  # split_quoted
    ('field,one', 'field, two')     # parse_quoted

Split a large string F<$soup> (mnemonic: slurped from a file) into lines, at LF or CR+LF:

    @lines = split_quoted($soup, '\r*\n');

Then transform all F<@lines> by correctly splitting each line into "naked" values:

    @table = map { [ parse_quoted($_, '\s*,\s') ] } @lines

Here is some more complete code to parse a F<.csv>-file with quoted fields, escaped commas:

    open my $fh, "foo.csv" or die $!;
    local $/;                   # enable localized slurp mode
    my $content = <$fh>;        # slurp whole file at once
    close $fh;
    my @lines = split_quoted($content, '\r*\n');
    die q(unbalanced " in input) unless @lines;
    my @table = map { [ map { parse_quoted($_, '\s*,\s') } ] } @lines

In  core  this  is  what  F<L</read_csv>>  does.   F<L</deep_compare>>  allows  you  to  test  what
F<L</split_quoted>> and  F<L</parse_quoted>> return.  For  example, the following code  shall never
die:

    croak if deep_compare([split_quoted("fee fie foo")], ['fee', 'fie', 'foo']);
    croak if deep_compare( parse_quoted('"fee fie foo"'), 1);

=cut

sub split_quoted($;$) {
    # Split [0] at delimiter [1], returning a list of words/tokens.  Delimiter defaults to '\s+'.
    #
    # We've to map the result of parse_line again to build the result. For "foo\nbar\n" parse_line
    # returns ('foo','bar',undef), not ('foo','bar',''). This may cause hard to track "Use of
    # uninitialized value..."  warnings.

    use Text::ParseWords;
    return map { (defined) ? $_ : '' } parse_line($_[1]||'[\s]+', 1, $_[0])
}

sub parse_quoted($;$) {
    use Text::ParseWords;
    return map { (defined) ? $_ : '' } parse_line($_[1]||'[\s]+', 0, $_[0])
}

=item F<equal(NUM1, NUM2[, PRECISION])>

F<L</equal>>  returns true  if  NUM1 and  NUM2  are equal  to PRECISION  number  of decimal  places
(default: 6).  For details see F<L</round>>.

=item F<round(NUM1[, PRECISION])>

Compare and round floating-point numbers NUM1 and NUM2 (as string- or number scalars).

When the C<"precision"> compile option is defined, F<L</round>> is called during compilation on all
numbers.

Normally  F<round>  will  return  a  number  in  fixed-point  notation.   When  the  package-global
F<$Data::Rlist::RoundScientific> is true, however, F<round>  formats the number in either normal or
exponential (scientific) notation,  whichever is more appropriate for  its magnitude.  This differs
slightly from fixed-point notation  in that insignificant zeroes to the right  of the decimal point
are  not included.  Also,  the  decimal point  is  not included  on  whole  numbers.  For  example,
F<L</round>(42)> does not return 42.000000, and F<round(0.12)> returns 0.12, not 0.120000.

B<MACHINE ACCURACY>

One needs a function like F<equal> to compare floats, because IEEE 754 single- and double precision
implementations are  not absolute -  in contrast  to the numbers  they actually represent.   In all
machines  non-integer numbers are  only an  approximation to  the numeric  truth.  In  other words,
they're not commutative.  For  example, given two floats F<a> and F<b>,  the result of F<a+b> might
be different than that of F<b+a>.  For another example, it is a mathematical truth that F<a * b = b
* a>, but not necessarily in a computer.

Each machine has its own accuracy, called the F<machine epsilon>, which is the difference between 1
and the smallest exactly representable number greater than one. Most of the time only floats can be
compared that have been carried out to a  certain number of decimal places.  In general this is the
case when  two floats that result  from a numeric operation  are compared - but  not two constants.
(Constants are accurate through to lexical conventions of the language. The Perl and C syntaxes for
numbers simply won't allow you to write down inaccurate numbers.)

See also recipes 2.2 and 2.3 in the Perl Cookbook.

B<EXAMPLES>

    CALL                    RETURNS NUMBER
    ----                    --------------
    round('0.9957', 3)       0.996
    round(42, 2)             42
    round(0.12)              0.120000
    round(0.99, 2)           0.99
    round(0.991, 2)          0.99
    round(0.99, 1)           1.0
    round(1.096, 2)          1.10
    round(+.99950678)        0.999510
    round(-.00057260)       -0.000573
    round(-1.6804e-6)       -0.000002

=cut

sub equal($$;$) {
    my($a, $b, $prec) = @_;
    $prec = 6 unless defined $prec;
    sprintf("%.${prec}g", $a) eq sprintf("%.${prec}g", $b)
}

sub round($;$) {
    # Note that sprintf("%.6g\n", 2006073104) yields 2.00607e+09, which looses digits.
    my $a = shift; return $a if is_integer($a);
    my $prec = shift; $prec = 6 unless defined $prec;
    return sprintf("%.${prec}g", $a) if $RoundScientific;
    return sprintf("%.${prec}f", $a);
}

=item F<deep_compare(A, B[, PRECISION, TRACE_FLAG])>

Compare and  analyze two numbers, strings or  references.  Generates a list  of messages describing
exactly all unequal data.  Hence, for any Perl data F<$a> and F<$b> one can assert:

    croak "$a differs from $b" if deep_compare($a, $b);

When PRECISION is defined all numbers in A and B are F<L</round>>'d before actually comparing them.
When TRACE_FLAG is true traces progress.

B<RESULT>

Returns an array of messages, each describing unequal data, or data that cannot be compared because
of type- or value-mismatching.  The array is empty when deep comparison of A and B found no unequal
numbers or strings, and only indifferent types.

B<EXAMPLES>

The  result is line-oriented,  and for  each mismatch  it returns  a single  message. For  a simple
example,

    Data::Rlist::deep_compare(undef, 1)

yields

    <<undef>> cmp <<1>>   stop! 1st undefined, 2nd defined (1)

=cut

sub deep_compare($$;$$$);
sub deep_compare($$;$$$)
{
    use Scalar::Util qw/reftype blessed looks_like_number/;

    sub prind($@) { my $ind = shift||0; print STDERR chr(9) x $ind, join(' ', grep { defined } @_), chr(10) }
    #sub quot($) { my $s = shift; $s =~ s/([\n\r\t])/\\&ord($1)/ge; "'$s'" }
    sub quot($) { my $s = shift; defined($s) ? "'$s'" : 'undef' }

    my(@R);
    my($a, $b, $prec, $dump, $ind) = @_;
    my($atp, $btp) = (reftype($a), reftype($b)); # undef, SCALAR, ARRAY or HASH
    my($anm, $bnm, $refs) = (0, 0, defined($atp));
    my $prefix = sub { quot($a).($anm ? ' == ' : ' cmp ').quot($b) };
    my($mismatch, $match) = sub { # use "lazy instantiation", so that this sub isn't compiled for
                                  # the majority of cases (when two values are equal)
        my $s = shift; eval 'push @R, $prefix->()."\tStop! ".$s; prind($ind, $R[$#R]) if $dump;'
    };
    $match = sub { my $s = shift; eval 'prind($ind, $prefix->(), $s)' } if $dump;
    $ind ||= 0;

    unless ($refs) {            # unless $a is a reference
        unless (defined $a) {
            $atp = 'undef';
            if (defined $b) {
                $mismatch->('only 2nd defined');
            } else {
                $match->() if $dump; # both undef'd
            } return @R;
        } else {
            unless (defined $b) {
                $mismatch->('only 1st defined');
                return @R;
            }
            $atp = ($anm = is_number($a)) ? 'number' : 'string';
            $a = round($a, $prec) if $anm and defined $prec;
        }
    }
    unless (defined $btp) {
        unless (defined $b) {
            $btp = 'undef';
            if (defined $a) {
                $mismatch->('only 1st defined');
            } else {
                $match->() if $dump; # both undef'd
            } return @R;
        } else {
            unless (defined $a) {
                $mismatch->('only 2nd defined');
                return @R;
            }
            $btp = ($bnm = is_number($b)) ? 'number' : 'string';
            $b = round($b, $prec) if $bnm and defined $prec;
        }
    }
    #die unless defined $a && defined $b;
    if ($atp ne $btp) {
        $mismatch->("type-mismatch, $atp vs. $btp");
        return @R;
    }

    # At this point $a and $b have equal types.
    unless ($refs) {			# compare numbers/strings
        if ($anm) {
            $prec = (defined $prec) ? " precision=$prec" : '';
            unless (equal($a, $b)) {
                $mismatch->($prec)
            } elsif ($dump) {
                $match->($prec)
            }
        } elsif ($a ne $b) {
            $mismatch->('unequal strings')
        } elsif ($dump) {
            $match->()
        } return @R
    } else {					# deep-compare two references
        my $recurse = sub($$) { deep_compare($_[0], $_[1], $prec, $dump, $ind + 1) };
        prind($ind, $prefix->()) if $dump;
        if ($atp eq 'SCALAR') {	# two scalars refs
            push @R, $recurse->($$a, $$b);
            return @R
        } elsif ($atp eq 'HASH') { # two hashes
            my $acnt = keys %$a;
            my $bcnt = keys %$b;
            unless ($acnt == $bcnt) {
                $mismatch->("different number of keys ($acnt, $bcnt)");
                return @R;
            } return @R if $acnt == 0; # no keys

            # Although both hashes have an equal number of keys, make sure that the keys themselves
            # are equal, and only then compare values.
            my @a_keys_missing = grep { not exists $b->{$_} } keys %$a;
            my @b_keys_missing = grep { not exists $a->{$_} } keys %$b;

            if (@a_keys_missing || @b_keys_missing) {
                $mismatch->('1st hash misses keys ('.join(', ', map { quote7($_) } @a_keys_missing).")") if @a_keys_missing;
                $mismatch->('2nd hash misses keys ('.join(', ', map { quote7($_) } @b_keys_missing).")") if @b_keys_missing;
                return @R;
            }

            foreach (keys %$a) {
                prind($ind, "key '$_'") if $dump;
                push @R, $recurse->($a->{$_}, $b->{$_});
            }
        } elsif ($atp eq 'ARRAY') {	# two arrays
            if ($#$a != $#$b) {
                $mismatch->("different array sizes: ${\(1+$#$a)} vs. ${\(1+$#$b)}")
            } else {
                for (0 .. $#$a) {
                    prind($ind, "index [$_]") if $dump;
                    push (@R, $recurse->($a->[$_], $b->[$_]))
                }
            }
        } elsif ($atp eq 'REF') {
            # Reference to reference.
            $recurse->($$a, $$b)
        } else {
            $mismatch->("cannot compare types $atp");
        }
    } return @R;
}

=item F<fork_and_wait(PROGRAM[, ARGS...])>

Forks a process  and waits for completion.   The function will extract the  exit-code, test whether
the  process died  and prints  status messages  on F<STDERR>.   F<fork_and_wait> hence  is  a handy
wrapper around the built-in F<system> and F<exec> functions.  Returns an array of three values:

    ($exit_code, $failed, $coredump)

F<$exit_code> is -1  when the program failed to  execute (e.g. it wasn't found or  the current user
has insufficient rights).  Otherwise F<$exit_code> is between  0 and 255.  When the program died on
receipt of a signal (like F<SIGINT> or  F<SIGQUIT>) then F<$signal> stores it. When F<$coredump> is
true the program died and a F<core>-file was written.

=item F<synthesize_pathname(TEXT...)>

Concatenates and  forms all  TEXT strings  into a  symbolic name that  can be  used as  a pathname.
F<synthesize_pathname>  is a  useful  function to  concatenate  strings and  nearby converting  all
characters that do  not qualify as filename-characters, into C<"_"> and  C<"-">.  The result cannot
only be used as file- or URL name, but also (coinstantaneously) as hash key, database name etc.

=back

=cut

sub fork_and_wait(@)
{
    my $prog = shift;
    my($exit_code, $signal, $coredump);
    local $| = 1;
    system($prog, @_);          # == 0 or die "\n\tfailed: $?";
    if ($? == -1) {             # not found
        $exit_code = -1;
        print STDERR "\n\tfailed to execute program: $!\n";
    } elsif ($? & 127) {        # died
        $exit_code = -1;
        $signal = ($? & 127);
        $coredump = ($? & 128);
        print STDERR "\n\tchild died with signal %d, %s core-dump\n", $signal, $coredump ? 'with' : 'without';
    } else {                    # ok
        $exit_code = $? >> 8;
        printf STDERR "\n\tchild exited with value %d\n", $exit_code, "\n" if $DEBUG;
    }
    return ($exit_code, $signal, $coredump)
}

sub synthesize_pathname(@)
{
    my @s = @_;
    my($dch1, $dch2) = ('-', '_');
    join('_',
         map {
             # Unquote.
             s/^"(.+)"\z/$1/;
             # Escape all non-printables.
             $_ = escape7($_);
             # Undo \" \'
             s/\\(["'])/$1/go;
             s/[']/_/g;
             s/"(.+)"/$dch2$dch2$1$dch2$dch2/o; # "xxx" within string => __xxx__
             # Handle \NNN
             s/[\\]/0/g; # eg. \347 => 0347
             # Filename
             s/[\(\|\)\/:;]/$dch1/go;            # ( | ) / : ; ==> -
             s/[\^<>:,;\"\$\s\?!\&\%\*]/$dch2/go; # ^ < > " $ ? ! & % * , ; : wsp => _
             s/^[\-\s]+|[\-\s]+\z//o;
             $_
         } @s
        )
}


=head2 Compile Options

The format of the compiled text and the behavior of F<L</compile>> can be controlled by the OPTIONS
parameter of F<L</write>>, F<L</write_string>> etc.  The  argument is a hash defining how the Rlist
text shall be formatted. The following pairs are recognized:

=over

=item 'precision' =E<gt> PLACES

Make F<L</compile>>  round all numbers  to PLACES decimal  places, by calling F<L</round>>  on each
scalar that L<looks  like a number|/is_number>.  By default PLACES is  F<undef>, which means floats
are not rounded.

=item 'scientific' =E<gt> FLAG

Causes F<L</compile>>  to masquerade  F<$Data::Rlist::RoundScientific>.  See F<L</round>>.

=item 'code_refs' =E<gt> TOKEN

Defines  how F<L</compile>>  shall treat  F<CODE> reference.   Legal values  for TOKEN  are  0 (the
default), C<"call"> and C<"deparse">.

- 0 compiles subroutine references into the string C<"?CODE?">.

- C<"call"> calls the code, then compiles the return value.

- C<"deparse"> serializes the code using F<B::Deparse> (reproducing the Perl source).

=item 'threads' =E<gt> COUNT

If enabled F<L</compile>> internally use multiple  threads.  Note that can speedup compilation only
on machines with at least COUNT CPUs.

=item 'here_docs' =E<gt> FLAG

If enabled strings with at least two newlines in them are written as
L<here-document|/Here-Documents>, when possible.  To qualify as here-document a string has to have
at least two LFs (C<"\n">), one of which must terminate it.

=item 'auto_quote' =E<gt> FLAG

When true (default)  do not quote strings that look like  identifiers (see F<L</is_symbol>>).  When
false quote F<all> strings.  Hash keys are not affected.

F<L</write_csv>> and F<L</write_conf>> interpret this flag differently: false means not to quote at
all; true quotes only strings that don't look like numbers and that aren't yet quoted.

=item 'outline_data' =E<gt> NUMBER

When NUMBER is  greater than 0 use C<"eol_space">  (linefeed) to split data to many  lines. It will
insert a linefeed after every NUMBERth array value.

=item 'outline_hashes' =E<gt> FLAG

If enabled, and C<"outline_data"> is also enabled, prints F<{> and F<}> on distinct lines when
compiling Perl hashes with at least one pair.

=item 'separator' =E<gt> STRING

The comma-separator string to be used by F<L</write_csv>>.  The default is C<','>.

=item 'delimiter' =E<gt> REGEX

Field-delimiter for F<L</read_csv>>.  There is no  default value.  To read configuration files, for
example, you may use C<'\s*=\s*'> or C<'\s+'>. To read CSV-files use e.g. C<'\s*[,;]\s*'>.

=back

The following options format the generated Rlist; normally you don't want to modify them:

=over

=item 'bol_tabs' =E<gt> COUNT

Count of physical, horizontal TAB characters to use at the begin-of-line per indentation
level. Defaults to 1. Note that we don't use blanks, because they blow up the size of generated
text without measure.

=item 'eol_space' =E<gt> STRING

End-of-line string to  use (the linefeed).  For  example, legal values are C<"">,  C<" ">, C<"\n">,
C<"\r\n"> etc. The default  is F<undef>, which means to use the current  value of F<$/>.  Note that
this  is  a compile-option  that  only  affects F<L</compile>>.   When  parsing  files the  builtin
F<readline> function is called, which uses F<$/>.

=item 'paren_space' =E<gt> STRING

String to write after F<(> and F<{>, and before F<}> and F<)> when compiling arrays and hashes.

=item 'comma_punct' =E<gt> STRING

=item 'semicolon_punct' =E<gt> STRING

Comma and semicolon strings, which shall be at least C<","> and C<";">.  No matter what,
F<L</compile>> will always print the C<"eol_space"> string after the C<"semicolon_punct"> string.

=item 'assign_punct' =E<gt> STRING

String to make up key/value-pairs. Defaults to C<" = ">.

=back

=head2 Predefined Options

The L<OPTIONS|/Compile Options> parameter accepted by some package functions is either a hash-ref
or the name of a predefined set:

=over

=item 'default'

Default if writing to a file.

=item 'string'

Compact, no newlines/here-docs. Renders a "string of data".

=item 'outlined'

Optimize the compiled Rlist for maximum readability.

=item 'squeezed'

Very compact, no whitespace at all. For very large Rlists.

=item 'perl'

Compile data in Perl syntax, using F<L</compile_Perl>>, not F<L</compile>>.  The output then
can be F<eval>'d, but it cannot be F<L</read>> back.

=item 'fast' or F<undef>

Compile data as fast as possible, using F<L</compile_fast>>, not F<L</compile>>.

=back

All  functions   that  define   an  L<OPTIONS|/Compile  Options>   parameter  do   implicitly  call
F<L</complete_options>> to complete the argument from  one of the predefined sets, and additionally
from C<"default">.   Therefore you can always  define nothing, or  a "lazy subset of  options". For
example,

    my $obj = new Data::Rlist(-data => $thing);

    $obj->write('thing.rls', { scientific => 1, precision => 8 });

=head2 Exports

Example:

    use Data::Rlist qw/:floats :strings/;

=head3 Exporter Tags

=over

=item F<:floats>

Imports F<L</equal>>, F<L</round>> and F<L</is_number>>.

=item F<:strings>

Imports  F<L</maybe_quote7>>,  F<L</quote7>>,  F<L</escape7>>,  F<L</unquote7>>,  F<L</unescape7>>,
F<L</unhere>>, F<L</is_random_text>>,  F<L</is_number>>, F<L</is_symbol>>, F<L</split_quoted>>, and
F<L</parse_quoted>>.

=item F<:options>

Imports F<L</predefined_options>> and F<L</complete_options>>.

=item F<:aux>

Imports F<L</deep_compare>>, F<L</fork_and_wait>> and F<L</synthesize_pathname>>.

=back

=head3 Auto-Exported Functions

The following functions are implicitly imported into the callers symbol table.  (But you may say
F<require Data::Rlist> instead of F<use Data::Rlist> to prohibit auto-import.  See also
L<perlmod>.)

=over

=item F<ReadData(INPUT[, FILTER, FILTER-ARGS])>

=item F<ReadCSV(INPUT[, OPTIONS, FILTER, FILTER-ARGS])>

=item F<ReadConf(INPUT[, OPTIONS, FILTER, FILTER-ARGS])>

These    are   aliases    for   F<Data::Rlist::L</read>>,    F<Data::Rlist::L</read_csv>>   and
F<Data::Rlist::L</read_conf>>.

=item F<EvaluateData(INPUT[, FILTER, FILTER-ARGS])>

Like F<L</ReadData>>  but implicitly call F<Data::Rlist::L</evaluate_nanoscripts>>  in case parsing
was successful.

=item F<WriteData(DATA[, OUTPUT, OPTIONS, HEADER])>

=item F<WriteCSV(DATA[, OUTPUT, OPTIONS, COLUMNS, HEADER])>

=item F<WriteConf(DATA[, OUTPUT, OPTIONS, HEADER])>

These     are    aliases     for     F<Data::Rlist::L</write>>,    F<Data::Rlist::L</write_string>>
F<Data::Rlist::L</write_csv>> and F<Data::Rlist::L</write_conf>>.  OPTIONS default to C<"default">.

=item F<OutlineData(DATA[, OPTIONS])>

=item F<StringizeData(DATA[, OPTIONS])>

=item F<SqueezeData(DATA[, OPTIONS])>

These   are  aliases  for   F<Data::Rlist::L</write_string_value>>.   F<OutlineData>   applies  the
predefined   L<C<"outlined">|/Predefined   Options>   options,   while   F<StringizeData>   applies
L<C<"string">|/Predefined Options> and F<SqueezeData>() L<C<"squeezed">|/Predefined Options>.  When
specified, OPTIONS are merged into the.  For example,

    print "\n\$thing: ", OutlineData($thing, { precision => 12 });

F<L<rounds|/round>> all numbers in F<$thing> to 12 digits.

=item F<PrintData(DATA[, OPTIONS])>

An alias for

    print OutlineData(DATA, OPTIONS);

=item F<KeelhaulData(DATA[, OPTIONS])>

=item F<CompareData(A, B[, PRECISION, TRACE_FLAG])>

These are  aliases for F<L</keelhaul>> and F<L</deep_compare>>. For example,

    use Data::Rlist;
        .
        .
    my($copy, $as_text) = KeelhaulData($thing);

=back

=cut

sub ReadCSV($;$$$) {
    my($input, $options, $fcmd, $fcmdargs) = @_;
    return Data::Rlist::read_csv($input, $options, $fcmd, $fcmdargs);
}

sub ReadConf($;$$$) {
    my($input, $options, $fcmd, $fcmdargs) = @_;
    return Data::Rlist::read_conf($input, $options, $fcmd, $fcmdargs);
}

sub ReadData($;$$) {
    my($input, $fcmd, $fcmdargs) = @_;
    return Data::Rlist::read($input, $fcmd, $fcmdargs);
}

sub EvaluateData($;$$) {
    my($input, $fcmd, $fcmdargs) = @_;
    my $result = ReadData($input, $fcmd, $fcmdargs);
    my $count = Data::Rlist::evaluate_nanoscripts();
    return $result;
}


sub WriteCSV($;$$$$) {
    my($data, $output, $options, $columns, $header) = @_;
    $options ||= 'default';
    Data::Rlist::write_csv($data, $output, $options, $columns, $header);
}

sub WriteConf($;$$$) {
    my($data, $output, $options, $header) = @_;
    $options ||= 'default';
    Data::Rlist::write_conf($data, $output, $options, $header);
}

sub WriteData($;$$$) {
    my($data, $output, $options, $header) = @_;
    $options ||= 'default';     # when undef uses 'default'
    Data::Rlist::write($data, $output, $options, $header);
}

sub PrintData($;$) {            # return outlined data as string-value
    my($data, $options) = @_;
    print OutlineData($data, $options);
}

sub OutlineData($;$) {          # return outlined data as string-ref
    my($data, $options) = @_;
    return Data::Rlist::write_string_value($data, complete_options($options, 'outlined'));
}

sub StringizeData($;$) {        # return data as compact string-ref (no newlines)
    my($data, $options) = @_;
    return Data::Rlist::write_string_value($data, complete_options($options, 'string'));
}

sub SqueezeData($;$) {          # return data as super-compact string-ref (no whitespace at all)
    my($data, $options) = @_;
    return Data::Rlist::write_string_value($data, complete_options($options, 'squeezed'));
}

sub KeelhaulData($;$) {         # recursively copy data
    my($data, $options) = @_;
    return Data::Rlist::keelhaul($data, $options);
}

sub CompareData($$;$$) {        # recursively compare data
    my($a, $b, $prec, $dump) = @_;
    return Data::Rlist::deep_compare($a, $b, $prec, $dump);
}

=head1 EXAMPLES

String- and number values:

    "Hello, World!"
    foo                         # compiles to { 'foo' => undef }
    3.1415                      # compiles to { 3.1415 => undef }

Array values:

    (1, a, 4, "b u z")          # list of numbers/strings

    ((1, 2),
     (3, 4))                    # list of list (4x4 matrix)

    ((1, a, 3, "foo bar"),
     (7, c, 0, ""))             # another list of lists

Here-document strings:

        $hello = ReadData(\<<HELLO)
        ( <<DEUTSCH, <<ENGLISH, <<FRANCAIS, <<CASTELLANO, <<KLINGON, <<BRAINF_CK )
    Hallo Welt!
    DEUTSCH
    Hello World!
    ENGLISH
    Bonjour le monde!
    FRANCAIS
    Ola mundo!
    CASTELLANO
    ~ nuqneH { ~ 'u' ~ nuqneH disp disp } name
    nuqneH
    KLINGON
    ++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++
    ..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.
    BRAINF_CK
    HELLO

Compiles F<$hello> as 

    [ "Hallo Welt!\n", "Hello World!\n", "Bonjour le monde!\n", "Ola mundo!\n",
      "~ nuqneH { ~ 'u' ~ nuqneH disp disp } name\n",
      "++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++\n..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.\n" ]

Configuration object as hash:

    {
        contribution_quantile = 0.99;
        default_only_mode = Y;
        number_of_runs = 10000;
        number_of_threads = 10;
        # etc.
    }

Altogether:

    Metaphysic-terms =
    {
        Numbers =
        {
            3.141592653589793 = "The ratio of a circle's circumference to its diameter.";
            2.718281828459045 = <<___;
The mathematical constant "e" is the unique real number such that the value of
the derivative (slope of the tangent line) of f(x) = e^x at the point x = 0 is
exactly 1.
___
            42 = "The Answer to Life, the Universe, and Everything.";
        };

        Words =
        {
            ACME = <<Value;
A fancy-free Company [that] Makes Everything: Wile E. Coyote's supplier of equipment and gadgets.
Value
            <<Key = <<Value;
foo bar foobar
Key
[JARGON] A widely used meta-syntactic variable; see foo for etymology.  Probably
originally propagated through DECsystem manuals [...] in 1960s and early 1970s;
confirmed sightings go back to 1972. [...]
Value
        };
    };

=head1 NOTES

The  F<Random Lists> (Rlist)  syntax is  inspired by  NeXTSTEP's F<Property  Lists>.  But  Rlist is
simpler, more readable  and more portable.  The  Perl and C++ implementations are  fast, stable and
free.  Markus Felten,  with whom I worked a few  month in a project at  Deutsche Bank, Frankfurt in
summer 1998,  arrested my attention  on Property lists.   He had implemented  a Perl variant  of it
(F<L<http://search.cpan.org/search?dist=Data-PropertyList>>).

The term "Random" underlines the fact that the language

=over

=item *

has four primitive/anonymuous types;

=item *

the basic building block is a list, which is combined at random with other lists.

=back

Hence the term F<Random> does not mean F<aimless> or F<accidental>.  F<Random Lists> are
F<arbitrary> lists.

=head1 F<Data::Dumper>

The main  difference between F<Data::Dumper>  and F<Data::Rlist> is  that scalars will  be properly
encoded as number or string.  F<Data::Dumper> writes numbers always as quoted strings, for example

    $VAR1 = {
                'configuration' => {
                                    'verbose' => 'Y',
                                    'importance_sampling_loss_quantile' => '0.04',
                                    'distribution_loss_unit' => '100',
                                    'default_only' => 'Y',
                                    'num_threads' => '5',
                                            .
                                            .
                                   }
            };

where F<Data::Rlist> writes

    {
        configuration = {
            verbose = Y;
            importance_sampling_loss_quantile = 0.04;
            distribution_loss_unit = 100;
            default_only = Y;
            num_threads = 5;
                .
                .
        };
    }

As one can  see F<Data::Dumper> writes the data  right in Perl syntax, which means  the dumped text
can be simply F<eval>'d, and the data can  be restored very fast. Rlists are not quite Perl-syntax:
a dedicated parser  is required.  But therefore Rlist  text is portable and can be  read from other
programming languages such as L</C++>.

With  F<$Data::Dumper::Useqq>   enabled  it  was  observed  that   F<Data::Dumper>  renders  output
significantly slower  than F<L</compile>>. This  is actually suprising, since  F<Data::Rlist> tests
for each scalar  whether it is numeric, and truely  quotes/escapes strings.  F<Data::Dumper> quotes
all scalars (including numbers), and it does not  escape strings.  This may also result in some odd
behaviors.  For example,

    use Data::Dumper;
    print Dumper "foo\n";

yields

    $VAR1 = 'foo
    ';

while

    use Data::Rlist;
    PrintData "foo\n"

yields

    { "foo\n"; }

Finally, F<Data::Rlist>  generates smaller files.   With the default F<$Data::Dumper::Indent>  of 2
F<Data::Dumper>'s output  is 4-5  times that of  F<Data::Rlist>'s. This is  because F<Data::Dumper>
recklessly  uses blanks,  instead  of horizontal  tabulators,  which blows  up  file sizes  without
measure.

=head2 Rlist vs. Perl Syntax

Rlists are not Perl syntax:

    RLIST    PERL
    -----    ----
     5;       { 5 => undef }
     "5";     { "5" => undef }
     5=1;     { 5 => 1 }
     {5=1;}   { 5 => 1 }
     (5)      [ 5 ]
     {}       { }
     ;        { }
     ()       [ ]

=head2 Debugging Data

To  reduce recursive data  structures (into  true hierachies)  set F<$Data::Rlist::MaxDepth>  to an
integer above 0.  It then defines the  depth under which F<L</compile>> shall not venture deeper.
The compilation of Perl data (into Rlist text)  then continues, but on F<STDERR> a message like the
following is printed:

    ERROR: compile2() broken in deep ARRAY(0x101aaeec) (depth = 101, max-depth = 100)

This  message will  also be  repeated as  comment when  the compiled  Rlist is  written to  a file.
Furthermore  F<$Data::Rlist::Broken>  is  incremented  by  one. While  the  compilation  continues,
effectively  any  attempt to  venture  deeper as  suggested  by  F<$Data::Rlist::MaxDepth> will  be
blocked.

See F<L</broken>>.

=head2 Speeding up Compilation (Explicit Quoting)

Much work  has been spent to  optimize F<Data::Rlist> for speed.   Still it is  implemented in pure
Perl (no XS).  A rough estimation for Perl 5.8 is "each MB takes one second per GHz".  For example,
when the resulting  Rlist file has a size of 13  MB, compiling it from a Perl  script on a 3-GHz-PC
requires  about 5-7  seconds.   Compiling  the same  data  under Solaris,  on  a sparcv9  processor
operating at 750 MHz, takes about 18-22 seconds.

The process of compiling can be speed up by calling F<L</quote7>> explicitly on scalars. That is,
before calling F<L</write>> or F<L</write_string>>.  Big data sets may compile faster when for
scalars, that certainly not qualify as symbolic name, F<L</quote7>> is called in advance:

    use Data::Rlist qw/:strings/;

    $data{quote7($key)} = $value;
        .
        .
    Data::Rlist::write("data.rlist", \%data);

instead of

    $data{$key} = $value;
        .
        .
    Data::Rlist::write("data.rlist", \%data);

It depends on the case whether  the first variant is faster: F<L</compile>> and F<L</compile_fast>>
both have to call  F<L</is_random_text>> on each scalar.  When the scalar  is already quoted, i.e.,
its first character is C<">, this test ought to run faster.

Internally F<L</is_random_text>> applies the precompiled regex F<$Data::Rlist::REValue>.  Note that
the  expression S<F<($s!~$Data::Rlist::REValue)>>  can  be up  to  20% faster  than the  equivalent
F<is_random_text($s)>.

=head2 Quoting strings that look like numbers

Normally  you  don't  have to  care  about  strings,  since  un/quoting  happens as  required  when
reading/compiling Rlist or CSV  text.  A common problem, however, occurs when  some string uses the
same lexicography than numbers do.

Perl defines  the string as the  basic building block for  all program data, then  lets the program
decide F<what strings mean>.   Analogical, in a printed book the reader  has to decipher the glyphs
and  decide  what evidence  they  hide.   Printed text  uses  well-defined  glyphs and  typographic
conventions, and finally the competence of the reader, to recognize numbers.  But computers need to
know the exact number type and  format.  Integer?  Float?  Hexadecimal?  Scientific?  Klingon?  The
Perl Cookbook recommends the use of a  regular expression to distinguish number from string scalars
(recipe 2.1).

In Rlist,  string scalars  that look  like numbers need  to be  quoted explicitly.   Otherwise, for
example, the  string scalar C<"-3.14"> appears as  F<-3.14> in the output,  C<"007324"> is compiled
into 7324 etc. Such text is lost and read back  as a number.  Of course, in most cases this is just
what you want. For hash keys, however, it might be a problem.  One solution is to prefix the string
with C<"_">:

    my $s = '-9'; $s = "_$s";

Such strings do not qualify as a number anymore.  In the C++ implementation it will then become
some F<std::string>, not a F<double>.  But the leading C<"_"> has to be removed by the reading
program.  Perhaps a better solution is to explicitly call F<L</quote7>>:

    use Data::Rlist qw/:strings/;

    $k = -9;
    $k = quote7($k);            # returns qq'"-9"'

    $k = 3.14_15_92;
    $k = quote7($k);            # returns qq'"3.141592"'

Again, the  need to quote  strings that  look like numbers  is a problem  evident only in  the Perl
implementation of Rlist, since Perl is a  language with weak types.  With the C++ implementation of
Rlist  there's  no  need  to  quote  strings  that  look  like  numbers.

See    also   F<L</write>>,    F<L</is_number>>,   F<L</is_symbol>>,    F<L</is_random_text>>   and
F<L<http://en.wikipedia.org/wiki/American_Standard_Code_for_Information_Interchange>>.

=head2 Installing F<Rlist.pm> locally

Installing CPAN  packages usually requires  administrator privileges.  Another  way is to  copy the
F<Rlist.pm> file  into a directory  of your choice.   Instead of F<use Data::Rlist;>,  however, you
then use the following code.  It will find  F<Rlist.pm> also in F<.> and F<~/bin>, and it calls the
F<Exporter> explicitly:

    BEGIN {
        $0 =~ /[^\/]+$/;
        push @INC, $`||'.', "$ENV{HOME}/bin";
        require Rlist;
        Data::Rlist->import();
        Data::Rlist->import(qw/:floats :strings/);
    }

=head2 An Rlist-Mode for Emacs

    (define-generic-mode 'rlist-generic-mode
       (list "//" ?#)
       nil
       '(;; Punctuators
         ("\\([(){},;?=]\\)" 1 'cperl-array-face)
         ;; Numbers
         ("\\([-+]?[0-9]+\\(\\.[0-9]+\\)?[dDlL]?\\)" 1 'font-lock-constant-face)
         ;; Identifier names
         ("\\([-~A-Za-z_][-~A-Za-z0-9_]+\\)" 1 'font-lock-variable-name-face))
       (list "\\.[rR][lL][iI]?[sS]$")
       ;; Extra functions to setup mode.
       (list 'generic-bracket-support
             '(lambda()
               (require 'cperl-mode)
               ;;(hl-line-mode t)                      ; highlight cursor-line
               (local-set-key [?\t] (lambda()(interactive)(cperl-indent-command)))
               (local-set-key [?\M-q] 'fill-paragraph)
               (set-fill-column 100)))
       "Generic mode for Random Lists (Rlist) files.")

=head2 Implementation Details

=head3 Perl

=head4 Package Dependencies

F<Data::Rlist> depends only on few other packages:

    Exporter
    Carp
    strict
    integer
    Sys::Hostname
    Scalar::Util        # deep_compare() only
    Text::Wrap          # unhere() only
    Text::ParseWords    # split_quoted(), parse_quoted() only

F<Data::Rlist> is free of F<$&>, F<$`> or F<$'>.  Reason: once Perl sees that you need one of these
meta-variables anywhere in the  program, it has to provide them for  every pattern match.  This may
substantially slow your program (see also L<perlre>).

=head4 A Short Story of Typeglobs

This is supplement  information for F<L</compile>>, the function  internally called by F<L</write>>
and   F<L</write_string>>.    We  will   discuss   why   F<L</compile>>,  F<L</compile_fast>>   and
F<L</compile_Perl>>  transliterate  typeglobs  and  typeglob-refs  into C<"?GLOB?">.   This  is  an
attempted explanation.

B<TYPEGLOBS ARE A PERL IDIOSYNCRACY>

Perl uses a symbol table per package to map symbolic names like F<x> to Perl values.  Typeglob (aka
glob) objects are complete symbol table entries,  as hash values.  The symbol table hash (F<stash>)
is an  ordinary hash, named like  the package with two  colons appended.  In the  package stash the
symbol name is mapped to a memory address which  holds the actual data of your program.  In Perl we
do not  have real  global values, only  package globals.  Any  Perl code  is always running  in one
package or another.

The  main symbol  table's name  is F<%main::>,  or F<%::>.   In the  C implementation  of  the Perl
interpreter, the main  symbol is simply a global variable, called  the F<defstash> (default stash).
The  symbol F<Data::>  in stash  F<%::> addresses  the  stash of  package F<Data>,  and the  symbol
F<Rlist::> in the stash F<%::Data::> addresses the stash of package F<Data::Rlist>.

Typeglobs are  an idiosyncracy  of Perl: different  types need  only one stash  entry, so  that one
symbol can name all  types of Perl data (scalars, arrays, hashes)  and nondata (functions, formats,
I/O handles).  The symbol F<x> is mapped to the typeglob F<*x>.  In the typeglob coexist the scalar
F<$x>, the list F<@x>, the hash F<%x>, the code F<&x> and the I/O-handle or format specifieer F<x>.

Most  of the  time only  one glob  slot is  used.  Do  typeglobs waste  space then?   Probably not.
(Although some  authors believe that.)  Other script  languages like (e.g.)  Python  is not forcing
decoration characters  -- the  interpreter already  knows the type.   In terms  of C,  symbol table
entries are then  struct/union-combinations with a type field, a F<double>  field, a F<char*> field
and so forth.   Perl symbols follow a contrary  design: globs are really pointer  sets to low-level
structs that hold numbers, strings etc.  Naturally pointers to non-existing values are NULL, and so
no type  field is required.   Perl interpreters can  now implement fine-grained  smart-pointers for
reference-counting and copy-on-write, and must  not necessarily handle abstract unions.  In theory,
the garbage-collector  should have "increased recycling  opportunities."  We do  know, for example,
that F<perl> is very greedy with RAM: it almost never returns any memory to the operating system.

Modifying F<$x>  in a Perl  program won't  change F<%x>, because  the typeglob F<*x>  is interposed
between the stash and  the program's actual values for F<$x>, F<@x> etc.   The sigil F<*> serves as
wildcard for the other sigils F<%>, F<@>, F<$> and F<&>. (Hint: a F<sigil> is a symbol "created for
a specific magical purpose"; the name derives  from the latin F<sigilum> = seal.)

Typeglobs cannot be dissolved  by F<L</compile>>, because when (e.g.)  F<$x> and  F<%x> are in use,
the glob F<*x> does not return some useful value like

    (SCALAR => \$x, HASH => \@x)

Typeglobs  are  also  not  interpolated  in  strings.   F<perl> always  plays  the  ball  back.   A
typeglob-value is simply a string:

    $ perl -e '$x=1; @x=(1); print *x'
    *main::x

    $ perl -e 'print "*x is not interpolated"'
    *x is not interpolated

    $ perl -e '$x = "this"; print "although ".*x." could be a string"'
    although *main::x could be a string

As one  can see, even when only  F<$x> is defined the  F<*x> does not return  its value.  Typeglobs
(stash entries) are arranged by F<perl> on the fly, even with the F<use strict> pragma in effect:

    $ perl -e 'package nirvana; use strict; print *x'
    *nirvana::x

Each  typeglob is  a full  path into  the F<perl>  stashes, down  from the  F<defstash>:

    $ perl -e 'print "*x is \"*main::x\"" if *x eq "*main::x"'
    *x is "*main::x"

    $ perl -e 'package nirvana; sub f { local *g=shift; print *g."=$g" }; package main; $x=42; nirvana::f(*x)'
    *main::x=42

B<GLOB-REFS>

In the C implementation of Perl, typeglobs have the struct-type F<GV> for "Glob value".  Each F<GV>
is merely a  set of pointers to sub-objects  for scalars, arrays, hashes etc.  In  Perl the special
syntax F<*x{ARRAY}>  accesses the  array-sub-object, and is  another way  to say F<\@x>.   But when
applied to  a typeglob as F<\*foo>  it returns a typeglob-ref,  or globref.  So  the Perl backslash
operator C<\> works like the address-of operator C<&> in C.

    $ perl -e 'print *::'
    *main::main::               # ???

    $ perl -e '$x = 42; print $::{x}'
    *main::x                    # typeglob-value 'x' in the stash

    $ perl -e 'print \*::'
    GLOB(0x10010f08)            # some globref

Little do we know what happens inside F<perl>, when we assign REFs to typeglobs:

    $ perl -e '$x = 42; *x = \$x; print $x'
    42
    $ perl -e '$y = 42; *x = \$y; print $x'
    42

In Perl4 you had to pass typeglob-refs  to call functions by references (the backslash-operator was
not  yet "invented").   Since  Perl5 saw  the  light of  day, typeglob-refs  can  be considered  as
artefacts.  Note, however, that these veterans  are still faster than true references, because true
references  are themselves stored  in a  typeglob (as  REF type)  and so  need to  be dereferenced.
Globrefs can be used directly (as F<GV*>'s) by F<perl>.  For example,

    void f1 { my $bar = shift; ++$$bar }
    void f2 { local *bar = shift; ++$bar }

    f1(\$x);                  # increments $x
    f1(*x);                   # dto., but faster

B<GLOB-ALIASES>

Typeglob-aliases offer  another interesting application for typeglobs.   For example, S<F<*bar=*x>>
aliases the symbol F<bar> in the current stash, so that F<x> and F<bar> point to the same typeglob.
This means  that when  you declare S<F<sub  x {}>> after  casting the  alias, F<bar> is  F<x>.

This smells like  a free lunch.  The penalty,  however, is that the F<bar> symbol  cannot be easily
removed from the stash.   One way is to say F<local *bar>, wich  temporarily assigns a new typeglob
to F<bar> with all pointers zeroized:

    package nirvana;

    sub f { print $bar; }
    sub g { local *bar; $bar = 42; f(); }

    package main;

    nirvana::g();

Running this code as  Perl script prints the number assigned in F<g>.  F<f>  acts as a closure. The
F<local>-statement will  put the  F<bar> symbol temporarily  into the package  stash F<%::nirvana>,
i.e., the same stash in which F<f> and F<g> exist.  It will remove F<bar> when F<g> returns.

B<*foo{THINGS}s>

The F<*x{NAME}> expression family is fondly called "the F<*foo{THING}> syntax":

    $scalarref = *x{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handlers{CODE};

    $ioref     = *STDIN{IO};
    $ioref     = *STDIN{FILEHANDLE};    # same as *STDIN{IO}

    $globref   = *x{GLOB};
    $globref   = \*x;                   # same as *x{GLOB}
    $undef     = *x{THIS_NAME_IS_NOT_SUPPORTED} # yields undef

    die unless defined *x{SCALAR};      # ok -> will not die
    die unless defined *x{GLOB};        # ok
    die unless defined *x{HASH};        # error -> will die

When THINGs are accessed this way few rules apply.  Firstofall, F<*foo{THING}s> are not hashes. The
syntax is a stopgap:

    $ perl -e 'print \*x, *x{GLOB}, \*x{GLOB}'
    GLOB(0x100110b8)GLOB(0x100110b8)REF(0x1002e944)

    $ perl -e '$x=1; exists *x{GLOB}'
    exists argument is not a HASH or ARRAY element at -e line 1.

Some F<*foo{THING}> is F<undef> if the  requested THING hasn't been used yet.  Only F<*foo{SCALAR}>
returns an anonymous scalar-reference:

    $ perl -e 'print "nope" unless defined *foo{HASH}'
    nope
    $ perl -e 'print *foo{SCALAR}'
    SCALAR(0x1002e94c)

In Perl5 it is still not possible to  get a reference to an I/O-handle (file-, directory- or socket
handle) using  the backslash operator.  When a  function requires an I/O-handle  you must therefore
pass a globref.  More precisely, it is possible to pass an F<IO::Handle>-reference, a typeglob or a
typeglob-ref as the filehandle.  This is obscure bot only for new Perl programmers.

    sub logprint($@) {
        my $fh = shift;
        print $fh map { "$_\n" } @_;
    }

    logprint(*STDOUT{IO}, 'foo');   # pass IO-handle -> IO::Handle=IO(0x10011b44)
    logprint(*STDOUT, 'bar');       # ok, pass typeglob-value -> '*main::STDOUT'
    logprint(\*STDOUT, 'bar');      # ok, pass typeglob-ref -> 'GLOB(0x10011b2c)'
    logprint(\*STDOUT{IO}, 'nope'); # ERROR -> won't accept 'REF(0x10010fe0)'

It is very amusing that Perl, although refactoring  UNIX in form of a language, does not make clear
what a file-  or socket-handle is.  The  global symbol STDOUT is actually  an F<IO::Handle> object,
which F<perl>  had silently  instantiated.  To functions  like F<print>,  however, you may  pass an
F<IO::Handle>, globname or globref.

B<VIOLATING STASHES>

As we saw we can access the Perl guts without using a scalpel.  Suprisingly, it is also possible to
touch the stashes themselves:

    $ perl -e '$x = 42; *x = $x; print *x'
    *main::42

    $ perl -e '$x = 42; *x = $x; print *42'
    *main::42

By assigning the scalar value F<$x> to F<*x> we have demolished the stash (at least, logically):
neither F<$42> nor F<$main::42> are accessible.  Symbols like F<42> are invalid, because 42 is a
numeric literal, not a string literal.

    $ perl -e '$x = 42; *x = $x; print $main::42'

Nevertheless it is easy to confuse F<perl> this way:

    $ perl -e 'print *main::42'
    *main::42

    $ perl -e 'print 1*9'
    9

    $ perl -e 'print *9'
    *main::9

    $ perl -e 'print *42{GLOB}'
    GLOB(0x100110b8)

    $ perl -e '*x = 42; print $::{42}, *x'
    *main::42*main::42

    $ perl -v
    This is perl, v5.8.8 built for cygwin-thread-multi-64int
    (with 8 registered patches, see perl -V for more detail)

Of course these  behaviors are not reliable, and  may disappear in future versions  of F<perl>.  In
German  you  say   "Schmutzeffekt"  (dirt  effect)  for  certain   mechanical  effects  that  occur
non-intendedly,  because machines  and electrical  circuits are  not perfect,  and so  is software.
However, "Schmutzeffekts" are neither bugs nor features; these are phenomenons.

B<LEXICAL VARIABLES>

Lexical variables (F<my> variables) are not stored in stashes, and do not require typeglobs.  These
variables are stored in a special array, the F<scratchpad>, assigned to each block, subroutine, and
thread. These are really private variables, and they cannot be F<local>ized.  Each lexical variable
occupies a  slot in the scratchpad;  hence is addressed by  an integer index, not  a symbol.  F<my>
variables are like F<auto> variables in C.  They're also faster than F<local>s, because they can be
allocated at compile time, not runtime. Therefore you cannot declare F<*x> lexically:

    $ perl -e 'my(*x)'
    Can't declare ref-to-glob cast in "my" at -e line 1, near ");"

Seel also the Perl man-pages L<perlguts>, L<perlref>, L<perldsc> and L<perllol>.

=head3 C++

In C++  we use a  F<flex>/F<bison> scanner/parser combination  to read Rlist  language productions.
The  C++  parser  generates  an   F<Abstract  Syntax  Tree>  (AST)  of  F<double>,  F<std::string>,
F<std::vector> and F<std::map> values.   Since each value is put into the  AST, as separate object,
we use a free store management that allows the allocation of huge amounts of tiny objects.

We also use reference-counted smart-pointers, which allocate themselves on our fast free store.  So
RAM will not be fragmented, and the allocation of RAM is significantly faster than with the default
process heap.   Like with Perl,  Rlist files can  have hundreds of megabytes  of data (!),  and are
processable in constant time, with constant  memory requirements.  For example, a 300 MB Rlist-file
can be read from a C++ process which will not peak over 400-500 MB of process RAM.

=head1 BUGS

There are no known bugs, this package is stable.  Deficiencies and TODOs:

=over

=item *

The C<"deparse"> functionality for the C<"code_refs"> L<compile option|/Compile Options> has not
yet been implemented.

=item *

The C<"threads"> L<compile option|/Compile Options> has not yet been implemented.

=item *

IEEE 754 notations of Infinite and NaN not yet implemented.

=item *

F<L</compile_Perl>> is experimental.

=back

=head1 COPYRIGHT/LICENSE

Copyright 1998-2008 Andreas Spindler

Maintained   at  CPAN   (F<L<http://search.cpan.org/dist/Data-Rlist/>>)  and   the   author's  site
(F<L<http://www.visualco.de>>). Please send mail to F<rlist@visualco.de>.

This library  is free software; you  can redistribute it and/or  modify it under the  same terms as
Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have
available.

Contact the author for the C++ library at F<rlist@visualco.de>.

Thank you for your attention.

=cut

1;

### Local Variables:
### buffer-file-coding-system: iso-latin-1
### fill-column: 99
### End: