Statistics::R - Perl interface with the R statistical program
Statistics::R is a module that controls the R interpreter (R project for statistical computing: http://www.r-project.org/). It lets you start R, pass commands to it and retrieve the output. A shared mode allows several instances of Statistics::R to talk to the same R process.
The current Statistics::R implementation uses pipes (for stdin, stdout and stderr) to communicate with R. This implementation should be more efficient and reliable than that of previous versions, which relied on reading and writing files. As before, this module works on GNU/Linux, MS Windows and probably many more systems.
use Statistics::R;

# Create a communication bridge with R and start R
my $R = Statistics::R->new();

# Run simple R commands
my $output_file = "file.ps";
$R->run(qq`postscript("$output_file" , horizontal=FALSE , width=500 , height=500 , pointsize=1)`);
$R->run(q`plot(c(1, 5, 10), type = "l")`);
$R->run(q`dev.off()`);

# Pass and retrieve data (scalars or arrays)
my $input_value = 1;
$R->set('x', $input_value);
$R->run(q`y <- x^2`);
my $output_value = $R->get('y');
print "y = $output_value\n";

$R->stop();
Build a Statistics::R bridge object between Perl and R. Available options are:
r_bin: Specify the full path to R if it is not automatically found. See INSTALLATION.
shared: Start a shared bridge. When using a shared bridge, several instances of Statistics::R can communicate with the same unique R instance. Example:
use Statistics::R;

my $R1 = Statistics::R->new( shared => 1);
my $R2 = Statistics::R->new( shared => 1);

$R1->set( 'x', 'pear' );
my $x = $R2->get( 'x' );
print "x = $x\n";

$R1->stop; # or $R2->stop
Note that in shared mode, you are responsible for having one of your Statistics::R instances call the stop() method when you are finished with R. But be careful not to call stop() while other processes still need to interact with R!
First, start() R if it is not yet running. Then, execute R commands passed as a string and return the output as a string. If your command fails to run in R, an error message will be displayed.
my $out = $R->run( q`print( 1 + 2 )` );
If you intend to run many R commands, it may be convenient to pass an array of commands or to put multiple commands in a here-doc:
# Array of R commands:
my $out1 = $R->run(
    q`a <- 2`,
    q`b <- 5`,
    q`c <- a * b`,
    q`print("ok")`
);

# Here-doc with multiple R commands:
my $cmds = <<EOF;
a <- 2
b <- 5
c <- a * b
print('ok')
EOF
my $out2 = $R->run($cmds);
To run commands from a file, see the run_from_file() method.
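The output you get from run() is the combination of what R would display on the standard output and the standard error, but the order may differ. When loading R packages, some may write numerous messages on standard error. You can suppress these messages by wrapping the library() call in the R function suppressPackageStartupMessages(), for example suppressPackageStartupMessages(library(library_to_load)), where library_to_load stands for the name of the package.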
Note that R imposes an upper limit on the length of a single line: about 4076 bytes. You will be warned if a line exceeds this limit. Commands containing lines exceeding the limit may fail with an error message stating:
'\0' is an unrecognized escape in character string starting "...
If possible, break down your R code into several smaller, more manageable statements. Alternatively, adding newline characters "\n" at strategic places in the R statements will work around the issue.
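To find out which lines of a command string need to be broken up before handing it to run(), a small pre-check can help. The sketch below is only an illustration: long_r_lines() is a hypothetical helper that is not part of Statistics::R, and its default limit of 4076 bytes is simply the approximate figure quoted above.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling byte semantics

# Hypothetical helper, not part of Statistics::R: return the 1-based numbers
# of the lines in an R command string that are longer than $limit bytes.
# The default limit is the approximate value mentioned in the text (assumption).
sub long_r_lines {
    my ($code, $limit) = @_;
    $limit = 4076 unless defined $limit;
    my @too_long;
    my $line_no = 0;
    for my $line (split /\n/, $code, -1) {
        $line_no++;
        push @too_long, $line_no if bytes::length($line) > $limit;
    }
    return @too_long;
}

my $short = "a <- 2\nb <- 5";
my $long  = 'x <- c(' . join(', ', (1) x 3000) . ')';    # one very long line
my @bad   = long_r_lines("$short\n$long");
print "Lines over the limit: @bad\n";    # prints "Lines over the limit: 3"
```

Any line reported this way is a candidate for splitting into smaller statements or for extra newline characters, as suggested above.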
Similar to run() but reads the R commands from the specified file. Internally, this method converts the filename to a format compatible with R and then passes it to the R source() command to read the file and execute the commands.
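As a rough illustration of what such a filename conversion could involve, the sketch below builds an R source() command from a Perl file name. It is not the module's actual implementation: r_source_command() is a hypothetical helper, and the escaping rules it applies (forward slashes on Windows, escaped backslashes and double quotes elsewhere) are assumptions about what R needs in order to parse the path as a string literal.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not part of the Statistics::R API: turn a file name
# into an R source() command with a path that R can parse.
sub r_source_command {
    my ($file) = @_;
    my $path = File::Spec->rel2abs($file);    # make the path absolute
    if ($^O eq 'MSWin32') {
        $path =~ tr{\\}{/};                   # R accepts forward slashes on Windows
    } else {
        $path =~ s{\\}{\\\\}g;                # escape literal backslashes for R
    }
    $path =~ s{"}{\\"}g;                      # escape double quotes inside the R string
    return qq{source("$path")};
}

# The printed path depends on the current working directory
print r_source_command('analysis.R'), "\n";
```

In normal use there is no need for such a helper, since run_from_file() takes care of the conversion itself.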
Set the value of an R variable (scalar or vector). Example:
# Create an R scalar
$R->set( 'x', 'pear' );

# Create an R vector
$R->set( 'y', [1, 2, 3] );
Get the value of an R variable (scalar or vector). Example:
# Retrieve an R scalar. $x is a Perl scalar.
my $x = $R->get( 'x' );

# Retrieve an R vector. $y is a Perl arrayref.
my $y = $R->get( 'y' );
Explicitly start R. Most of the time, you do not need to do this because the first call to run() or set() will automatically call start().
Stop a running instance of R. Usually, you do not need to do this because stop() is automatically called when the Statistics::R object goes out of scope.
stop() and start() R.
Get or set the path to the R executable.
Was R started in shared mode?
Is R running?
Return the PID of the running R process.
Since Statistics::R relies on R to work, you need to install R first. Downloads are available from http://www.r-project.org/. If R is in your PATH environment variable, then it should be available from a terminal and be detected automatically by Statistics::R. This means that you usually don't have to do anything on Linux systems to get Statistics::R working. On Windows systems, in addition to the folders listed in PATH, the usual locations will be checked for the presence of the R binary, e.g. C:\Program Files\R. If Statistics::R does not find your R installation, your last recourse is to specify its full path when calling new():
my $R = Statistics::R->new( r_bin => $fullpath );
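To check beforehand whether R is reachable through PATH, a lookup along the following lines may be useful. This is a sketch rather than the detection logic used by Statistics::R: find_in_path() is a hypothetical helper, and the list of Windows extensions it tries is an assumption.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not part of Statistics::R: return the first executable
# file called $name found in the PATH directories, or nothing if none is found.
sub find_in_path {
    my ($name) = @_;
    # On Windows, executables usually carry an extension (assumed list)
    my @exts = $^O eq 'MSWin32' ? ('', '.exe', '.bat') : ('');
    for my $dir (File::Spec->path) {
        for my $ext (@exts) {
            my $candidate = File::Spec->catfile($dir, $name . $ext);
            return $candidate if -f $candidate && -x _;
        }
    }
    return;
}

my $r_path = find_in_path('R');
print defined $r_path ? "R found at $r_path\n" : "R not found in PATH\n";
```

If the lookup succeeds but automatic detection still fails, the returned path could be passed to new() through the r_bin option shown above.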
You also need to have the following CPAN Perl modules installed:
Florent Angly <email@example.com> (2011 rewrite)
Graciliano M. P. <firstname.lastname@example.org> (original code)
Brian Cassidy <email@example.com>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
All complex software has bugs lurking in it, and this program is no exception. If you find a bug, please report it on the CPAN Tracker of Statistics::R: http://rt.cpan.org/Dist/Display.html?Name=Statistics-R
Bug reports, suggestions and patches are welcome. The Statistics::R code is developed on Github (http://github.com/bricas/statistics-r) and is under Git revision control. To get the latest revision, run:
git clone git://github.com/bricas/statistics-r.git