NAME

Bio::Gonzales::Matrix::IO - Library for simple matrix IO

SYNOPSIS

    use Bio::Gonzales::Matrix::IO qw(lspew mslurp lslurp mspew);

DESCRIPTION

Provides functions for common matrix/list IO.

SUBROUTINES

mspew($filename, \@matrix, \%options)
mspew($filehandle, \@matrix, \%options)

Save the values in @matrix to a $filename or $filehandle. @matrix is an array of arrayrefs:

    @matrix = (
        [ l11, l12, l13 ],
        [ l21, l22, l23 ],
        [ l31, l32, l33 ]
    );

Options:

header / ids

Supply a header. Same as

    mspew($file, [ \@header, @matrix ])

row_names

Supply row names as an array reference. If the value is true but not an array reference, the header is used as row names.

    mspew( $file, $matrix, { row_names => 1 } );                            #use header
    mspew( $file, $matrix, { row_names => [ 'row1', '...', 'rown' ] } );    #use supplied row names

fill_missing_cols

If a row has fewer columns than the longest row of the matrix, fill it up with empty strings.

na_value

Use this value in case undefined values are found. Default is 'NA'.

sep

Set the column separator for the output file.

square (default 1)

Add empty columns to fill up to a square.
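
The effect of header, na_value and fill_missing_cols can be illustrated with a minimal core-Perl sketch. It is not the module's implementation: format_matrix is a hypothetical helper, and the tab default for sep is an assumption, since the documentation above does not state it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Gonzales::Matrix::IO): renders a
# matrix as delimited text, mimicking some of the documented mspew options.
sub format_matrix {
    my ( $matrix, $opt ) = @_;
    $opt //= {};
    my $sep = $opt->{sep}      // "\t";    # assumed default separator
    my $na  = $opt->{na_value} // 'NA';    # documented default

    my @rows = @$matrix;
    unshift @rows, $opt->{header} if $opt->{header};

    # width of the longest row, needed for fill_missing_cols
    my $max = 0;
    for my $row (@rows) {
        $max = @$row if @$row > $max;
    }

    my @lines;
    for my $row (@rows) {
        my @cells = map { $_ // $na } @$row;    # undefined cells become $na
        push @cells, ('') x ( $max - @cells ) if $opt->{fill_missing_cols};
        push @lines, join( $sep, @cells );
    }
    return join( "\n", @lines ) . "\n";
}

my $text = format_matrix(
    [ [ 1, undef, 3 ], [ 4, 5 ] ],
    { header => [qw(a b c)], fill_missing_cols => 1 },
);
print $text;
```

In this sketch the undefined cell is written as NA and the short second row receives one trailing empty column, so every output line has three fields.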

$matrix_ref = mslurp($file, \%config)
($matrix_ref, $header_ref, $row_names_ref) = mslurp($file, \%config)

Reads the contents of $file into an array of arrayrefs.

You can set the delimiter by supplying, for example, { sep => qr/\t/ } as the config hash.

Further options with defaults:

    %config = (
        sep => qr/\t/, # set column separator
        header => 0, # parse header
        skip => 0, # skip the first N lines (without header)
        row_names => 0, # parse row names
        comment => qr/^#/ # the comment character
    );
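
How these options interact can be sketched in core Perl. This is an illustration under stated assumptions, not the module's code: parse_matrix is a hypothetical helper that works on lines already read into memory, and the order of processing (drop comment lines, read the header, then skip N lines) is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): parses delimited lines using
# the option names and defaults documented for mslurp.
sub parse_matrix {
    my ( $lines, $user ) = @_;
    my %cfg = (
        sep       => qr/\t/,
        header    => 0,
        skip      => 0,
        row_names => 0,
        comment   => qr/^#/,
        %{ $user // {} },
    );

    # Assumption: comment lines are dropped first, then the header is read,
    # then `skip` further lines are discarded.
    my @data = grep { $_ !~ $cfg{comment} } @$lines;
    my $header;
    $header = [ split $cfg{sep}, shift(@data), -1 ] if $cfg{header} && @data;
    splice @data, 0, $cfg{skip} if $cfg{skip};

    my ( @matrix, @row_names );
    for my $line (@data) {
        my @fields = split $cfg{sep}, $line, -1;    # -1 keeps trailing empty fields
        push @row_names, shift @fields if $cfg{row_names};
        push @matrix, \@fields;
    }

    # mirror the two documented call forms: scalar and list context
    return wantarray ? ( \@matrix, $header, \@row_names ) : \@matrix;
}

my @lines = ( "# comment", "id\tx\ty", "r1\t1\t2", "r2\t3\t4" );
my ( $m, $h, $rn ) = parse_matrix( \@lines, { header => 1, row_names => 1 } );
```

Here $m holds the two numeric rows, $h the three header fields and $rn the row names r1 and r2. Whether the real header keeps the cell above the row-name column is not stated in the documentation; the sketch leaves the header untouched.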
    
lspew($fh_or_filename, $list, $config_options)

Writes a list of values to a file. Both filenames and filehandles are accepted, but if you supply a handle, you have to close it yourself. $list can be one of the following:

hash ref of array refs

results in

    keya    avalue0    avalue1
    keyb    bvalue0    bvalue1
    ...

hash ref

results in

    keya    valuea
    keyb    valueb
    ...

array ref

results in

    value0
    value1
    ...

$config_options is a hash ref. It can take the options:

    $config_options = {
        delim => "\t",
    };
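
The three accepted shapes of $list can be illustrated with a short core-Perl sketch. It is not the module's implementation: format_list is a hypothetical helper, one record per output line is an assumption, and the keys are sorted here only to make the output reproducible.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): turns the three documented
# $list shapes into lines of delimited text.
sub format_list {
    my ( $list, $opt ) = @_;
    my $delim = ( $opt // {} )->{delim} // "\t";

    my @lines;
    if ( ref $list eq 'HASH' ) {
        for my $key ( sort keys %$list ) {    # sorting is a choice of this sketch
            my $value  = $list->{$key};
            my @values = ref $value eq 'ARRAY' ? @$value : ($value);
            push @lines, join( $delim, $key, @values );
        }
    }
    else {
        @lines = @$list;                      # plain array ref: one value per line
    }
    return join '', map { $_ . "\n" } @lines;
}

my $out = format_list( { keya => [ 'a0', 'a1' ], keyb => [ 'b0', 'b1' ] } );
print $out;
```

A plain hash ref such as { keya => 'valuea' } goes through the same branch and yields the key followed by its single value.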

\%data = hslurp($file, \%config)

Reads the contents of a file, splits the input lines according to mslurp rules and takes the first element as hash key. The remaining part of the line is stored as an array reference under that key. The configuration options are the same as in mslurp.

The default behaviour can be changed with:

  %config = (
    idx_col => 0, # set a different index column as hash key, 0-based
    has_duplicates => 0, # allow duplicate entries in the index column.
    ...
  );

If duplicates are allowed, the result data structure has the form

  %data = (
    "key1" => [
      [ elem_x, elem_y, elem_z ], 
      [ elem_x, elem_y, elem_z ], 
      ...
    ],
    "key2" => [
      [ elem_x, elem_y, elem_z ], 
      [ elem_x, elem_y, elem_z ], 
      ...
    ],
  );

If no duplicates are allowed, %data has the structure

  %data = (
    "key1" => [ elem_x, elem_y, elem_z ], 
    "key2" => [ elem_x, elem_y, elem_z ], 
  );

If duplicates are not allowed, hslurp dies with a stack trace when it hits a duplicate key in the index column.
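
The two result structures can be reproduced with a minimal core-Perl sketch. It is not the module's implementation: index_rows is a hypothetical helper that operates on rows that are already split, and it assumes the key column is removed from the stored row.

```perl
use strict;
use warnings;
use Carp qw(confess);

# Hypothetical helper (not the module's code): indexes already-split rows by
# one column, following the idx_col / has_duplicates description.
sub index_rows {
    my ( $rows, $opt ) = @_;
    my %cfg = ( idx_col => 0, has_duplicates => 0, %{ $opt // {} } );

    my %data;
    for my $row (@$rows) {
        my @rest = @$row;
        my ($key) = splice @rest, $cfg{idx_col}, 1;    # take out the key column

        if ( $cfg{has_duplicates} ) {
            push @{ $data{$key} }, \@rest;             # key => [ [row], [row], ... ]
        }
        else {
            confess "duplicate key '$key' in index column"
                if exists $data{$key};
            $data{$key} = \@rest;                      # key => [row]
        }
    }
    return \%data;
}

my @rows = ( [ 'g1', 'x', 1 ], [ 'g2', 'y', 2 ], [ 'g1', 'z', 3 ] );

my $multi = index_rows( \@rows, { has_duplicates => 1 } );

# without has_duplicates the repeated 'g1' is fatal, so the eval fails
my $ok = eval { index_rows( \@rows ); 1 };
```

With has_duplicates set, $multi->{g1} holds both rows for g1. Without it, the second g1 row triggers confess, which dies with a stack trace.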

SEE ALSO

AUTHOR

jw bargsten, <joachim.bargsten at wur.nl>