Edward Baudrez > PDL-NDBin-0.008-TRIAL > PDL::NDBin::Actions_PP

Download:
PDL-NDBin-0.008-TRIAL.tar.gz

Annotate this POD

CPAN RT

Open  1
View/Report Bugs
Source   Latest Release: PDL-NDBin-0.015

NAME ^

PDL::NDBin::Actions_PP - XS actions for PDL::NDBin

DESCRIPTION ^

This module contains the implementation of the core loop of the action classes in the PDL::NDBin::Action namespace. The subroutines are not intended to be called by the user.

  1. It's numerically well-behaved: out() is always of the order of magnitude of the values themselves, unlike the sum of the values, which grows very large as the number of elements grows large.
  2. The subtraction in() - out() guarantees that the computation will be done in the correct type (i.e., double instead of the type of the input).
  3. Requires only one temporary.
  4. Requires only one pass over the data.

I used to give the output array type float+, but that creates more problems than it solves. So now, averages are always computed in type double. Besides being the default type in PDL and the 'natural' floating-point type in C, it also makes the implementation easier.], Code => ' loop(n) %{ register int j = $idx(); $out(m => j) += ( $in() - $out(m => j) ) / ++( $count(m => j) ); %} ', BadCode => ' loop(n) %{ if( ($ISBAD(idx())) || ($ISBAD(in())) ) continue; register int j = $idx(); $out(m => j) += ( $in() - $out(m => j) ) / ++( $count(m => j) ); %} ', );

# _istddev_loop # pp_def ( '_istddev_loop', Pars => 'in(n); int idx(n); double [o] out(m); int [t] count(m); double [t] avg(m)', OtherPars => 'int msize => m', HandleBad => 1, Doc => q[Compute the standard deviation of the elements in each bin.

Note that we compute the sample standard deviation, not an estimate of the population standard deviation (which differs by a factor).

Credit for the algorithm goes to ashawley on commandlinefu.com;

        awk '{ delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg) } END { print sqrt(mean2 / NR) }'

This is a wonderful solution solving many of the problems with more naive implementations:

  1. It's numerically well-behaved.
  2. The subtractions guarantee that the computations will be done in the correct type (i.e., double instead of the type of the input).
  3. Requires only two temporaries (!).
  4. Requires only one pass over the data.

I used to give the output array type float+, but that creates more problems than it solves. So now, standard deviations are always computed in type double. Besides being the default type in PDL and the 'natural' floating-point type in C, it also makes the implementation easier.], Code => ' loop(n) %{ register int j = $idx(); double delta = $in() - $avg(m => j); $avg(m => j) += delta / ++( $count(m => j) ); $out(m => j) += delta * ( $in() - $avg(m => j) ); %} ', BadCode => ' loop(n) %{ if( ($ISBAD(idx())) || ($ISBAD(in())) ) continue; register int j = $idx(); double delta = $in() - $avg(m => j); $avg(m => j) += delta / ++( $count(m => j) ); $out(m => j) += delta * ( $in() - $avg(m => j) ); %} ', );

# _istddev_post # pp_def( '_istddev_post', Pars => 'count(n); [o] out(n)', HandleBad => 1, Doc => 'Finalization for _istddev_loop().', Code => ' int flag = 0; loop(n) %{ if( ! $count() ) { $SETBAD(out()); flag = 1; } else { $out() = sqrt( $out() / $count() ); } %} if( flag ) { $PDLSTATESETBAD(out); } ', BadCode => ' int flag = 0; loop(n) %{ if( ! $count() ) { $SETBAD(out()); flag = 1; } else { $out() = sqrt( $out() / $count() ); } %} if( flag ) { $PDLSTATESETBAD(out); } ', );

# TODO # complete with the other functions that xyz2grd provides # indirect maximum # indirect minimum

pp_addpm( <<'EOD' );

AUTHOR ^

Edward Baudrez <ebaudrez@cpan.org>

COPYRIGHT AND LICENSE ^

This software is copyright (c) 2013 by Edward Baudrez.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

syntax highlighting: