Casiano Rodriguez-Leon > GRID-Machine-0.127 > GRID::Machine::Process

Download:
GRID-Machine-0.127.tar.gz

Dependencies

Annotate this POD

CPAN RT

Open  0
View/Report Bugs
Source  

NAME ^

GRID::Machine::Process - Forking via SSH

SYNOPSIS ^

     use GRID::Machine;
     
     my $host = $ENV{GRID_REMOTE_MACHINE};
     my $machine = GRID::Machine->new( host => $host );
     
     my ($N, $np, $pi)  = (1000, 4, 0);
     for (0..$np-1) {
        $machine->fork( q{
            my ($id, $N, $np) = @_;
              
            my $sum = 0;
            for (my $i = $id; $i < $N; $i += $np) {
                my $x = ($i + 0.5) / $N;
                $sum += 4 / (1 + $x * $x);
            }
            $sum /= $N; 
         },
         args => [ $_, $N, $np ],
       );
     }
     
     $pi += $machine->waitall()->result for 1..$np;
     
     print "pi = $pi\n";

DESCRIPTION ^

The fork method

The fork method of GRID::Machine objects can be used to fork a process in the remote machine, as shown in the following example:

  $ cat -n fork5.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7  my $machine = GRID::Machine->new( host => $host );
     8  
     9  my $p = $machine->fork( q{
    10  
    11     print "stdout: Hello from process $$. args = (@_)\n";
    12     print STDERR "stderr: Hello from process $$\n";
    13  
    14     use List::Util qw{sum};
    15     return { s => sum(@_), args => [ @_ ] };
    16   },
    17   args => [ 1..4 ],
    18  );
    19  
    20  # GRID::Machine::Process objects are overloaded
    21  print "Doing something while $p is still alive  ...\n" if $p; 
    22  
    23  my $r = $machine->waitpid($p);
    24  
    25  print "Result from process '$p': ",Dumper($r),"\n";
    26  print "GRID::Machine::Process::Result objects are overloaded in a string context:\n$r\n";

When executed, the former program produces an output similar to this:

  $ perl fork5.pl 
  Doing something while 5220:5230:some.machine:5234:5237 is still alive  ...
  Result from process '5220:5230:some.machine:5234:5237': $VAR1 = bless( {
                   'machineID' => 0,
                   'stderr' => 'stderr: Hello from process 5237
  ',
                   'descriptor' => 'some.machine:5234:5237',
                   'status' => 0,
                   'waitpid' => 5237,
                   'errmsg' => '',
                   'stdout' => 'stdout: Hello from process 5237. args = (1 2 3 4)
  ',
                   'results' => [ { 'args' => [ 1, 2, 3, 4 ], 's' => 10 } ]
                 }, 'GRID::Machine::Process::Result' );

  GRID::Machine::Process::Result objects are overloaded in a string context:
  stdout: Hello from process 5237. args = (1 2 3 4)
  stderr: Hello from process 5237

The fork method returns a GRID::Machine::Process object. The first argument must be a string containing the code that will be executed by the forked process in the remote machine. Such code is always called in a list context. The fork method admits the following arguments:

GRID::Machine::Process objects

GRID::Machine::Process objects have been overloaded. In a string context a GRID::Machine::Process object produces the concatenation hostname:clientPID:remotePID. In a boolean context it returns true if the process is alive and false otherwise. This way, the execution of line 21 in the program above:

    21  print "Doing something while $p is still alive  ...\n" if $p; 

produces an output like:

  Doing something while 5220:5230:some.machine:5234:5237 is still alive  ...

if the remote process is still alive. The descriptor of the process 5220:5230:some.machine:5234:5237 is a colon separated sequence of five components:

1 - The PID of the local process executing GRID::Machine
2 - The PID of the local process in charge of the connection with the remote machine
3 - The name of the remote machine
4 - The PID of the remote process executing GRID::Machine::REMOTE
5 - The PID of the child process created by fork

When evaluated in a boolean context, a GRID::Machine::Process returns 1 if it is alive and 0 otherwise.

The waitpid method

The waitpid method waits for the GRID::Machine::Process received as first argument to terminate. Additional FLAGS as in perl waitpid can be passed as arguments. It returns a GRID::Machine::Process::Result object, whose attributes contain:

The waitall method

It is similar to waitpid but instead waits for any child process.

Behaves like the wait(2) system call on your system: it waits for a child process to terminate and returns

See an example:

  $ cat -n wait1.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7  my $machine = GRID::Machine->new( host => $host );
     8  
     9  my $p = $machine->fork( q{
    10  
    11     print "stdout: Hello from process $$. args = (@_)\n";
    12     print STDERR "stderr: Hello from process $$\n";
    13  
    14     use List::Util qw{sum};
    15     return { s => sum(@_), args => [ @_ ] };
    16   },
    17   args => [ 1..4 ],
    18  );
    19  
    20  # GRID::Machine::Process objects are overloaded
    21  print "Doing something while $p is still alive  ...\n" if $p; 
    22  
    23  my $r = $machine->waitall();
    24  
    25  print "Result from process '$p': ",Dumper($r),"\n";
    26  print "GRID::Machine::Process::Result objects are overloaded in a string context:\n$r\n";

When executed produces:

  $ perl wait1.pl 
  Doing something while 1271:1280:local:1284:1287 is still alive  ...
  Result from process '1271:1280:local:1284:1287': $VAR1 = bless( {
                   'machineID' => 0,
                   'stderr' => 'stderr: Hello from process 1287
  ',
                   'descriptor' => 'local:1284:1287',
                   'status' => 0,
                   'waitpid' => 1287,
                   'errmsg' => '',
                   'stdout' => 'stdout: Hello from process 1287. args = (1 2 3 4)
  ',
                   'results' => [ { 'args' => [ 1, 2, 3, 4 ], 's' => 10 } ]
                 }, 'GRID::Machine::Process::Result' );

  GRID::Machine::Process::Result objects are overloaded in a string context:
  stdout: Hello from process 1287. args = (1 2 3 4)
  stderr: Hello from process 1287

The following example uses the fork method and waitall to compute in parallel a numerical approach to the value of the number pi:

  $ cat -n waitpi.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = $ENV{GRID_REMOTE_MACHINE};
     6  my $machine = GRID::Machine->new( host => $host );
     7  
     8  my ($N, $np, $pi)  = (1000, 4, 0);
     9  for (0..$np-1) {
    10     $machine->fork( q{
    11         my ($id, $N, $np) = @_;
    12           
    13         my $sum = 0;
    14         for (my $i = $id; $i < $N; $i += $np) {
    15             my $x = ($i + 0.5) / $N;
    16             $sum += 4 / (1 + $x * $x);
    17         }
    18         $sum /= $N; 
    19      },
    20      args => [ $_, $N, $np ],
    21    );
    22  }
    23  
    24  $pi += $machine->waitall()->result for 1..$np;
    25  
    26  print "pi = $pi\n";

The async method

The async method it is quite similar to the fork method but receives as arguments the name of a GRID::Machine method and the arguments for this method. It executes asynchronously the method. It returns a GRID::Machine::Process object. Basically, the call

            $m->async($subname => @args) 

is equivalent to:

            $m->fork($subname.'(@_)' args => [ @args ] ) 

The following example uses async to compute in parallel an approximation to the value of pi:

  $ cat -n async.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = $ENV{GRID_REMOTE_MACHINE};
     6  my $machine = GRID::Machine->new( host => $host );
     7  
     8  $machine->sub(sumareas => q{
     9         my ($id, $N, $np) = @_;
    10           
    11         my $sum = 0;
    12         for (my $i = $id; $i < $N; $i += $np) {
    13             my $x = ($i + 0.5) / $N;
    14             $sum += 4 / (1 + $x * $x);
    15         }
    16         $sum /= $N; 
    17  });
    18  
    19  my ($N, $np, $pi)  = (1000, 4, 0);
    20  
    21  $machine->async( sumareas =>  $_, $N, $np ) for (0..$np-1);
    22  $pi += $machine->waitall()->result for 1..$np;
    23  
    24  print "pi = $pi\n";

GRID::Machine::Process::Result objects

In a string context a GRID::Machine::Process::Result object produces the concatenation of its output to STDOUT followed by its output to STDERR. In a boolean context it evaluates according to its result attribute. It evaluates to true if it is an array reference with more than one element or if the only element is true. Otherwise it is false.

About the pi Example Used in this Manual

In this manual we have used several times as an example the computation of an approach to number Pi (3.14159...) using numerical integration. To understand it, take into account that the area under the curve 1/(1+x**2) between 0 and 1 is Pi/4 = (3.1415...)/4 as it shows the following debugger session:

  pp2@nereida:~/public_html/cgi-bin$ perl -wde 0
  main::(-e:1):   0
    DB<1>  use Math::Integral::Romberg 'integral'
    DB<2> p integral(sub { my $x = shift; 4/(1+$x*$x) }, 0, 1);
  3.14159265358972

The module Math::Integral::Romberg provides the function integral that allow us to compute the area of a given function in some interval. In fact - if you remember your high school days - it is easy to see the reason: the integral of 4/(1+$x*$x) is 4*arctg($x) and so its area between 0 and 1 is given by:

        4*(arctg(1) - arctg(0)) = 4 * arctg(1) = 4 * Pi / 4 = Pi

This is not, in fact, a good way to compute Pi, but makes a good example of how to exploit several machines to fulfill a task.

To compute the area under 4/(1+$x*$x) we have divided up the interval [0,1] into sub-intervals of size 1/N and add up the areas of the small rectangles with base 1/N and height the value of the curve 4/(1+$x*$x) in the middle of the interval. The following debugger session illustrates the idea:

  pp2@nereida:~$ perl -wde 0
  main::(-e:1):   0
  DB<1> use List::Util qw(sum)
  DB<2> $N = 6
  DB<3> @divisions = map { $_/$N } 0..($N-1)
  DB<4> sub f { my $x = shift; 4/(1+$x*$x) }
  DB<5> @halves = map { $_+0.5/$N } @divisions
  DB<6> $area = sum(map { f($_)/$N } @halves)
  DB<7> p $area
  3.14390742722244

To optimize the execution time we distribute the sum in line 6 $area = sum(map { f($_)/$N } @halves) among several processes. The processes are numbered from 0 to np-1. Each process sums up the areas of roughly N/np intervals. We can spedup the computation if each process is allocated to a different processor (or core).

AUTHOR ^

Casiano Rodriguez Leon <casiano@ull.es>

ACKNOWLEDGMENTS ^

This work has been supported by CEE (FEDER) and the Spanish Ministry of Educacion y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.

I wish to thank Paul Evans for his IPC::PerlSSH module: it was the source of inspiration for this module. To Alex White, Dmitri Kargapolov, Eric Busto and Erik Welch for their contributions. To the Perl Monks, and the Perl Community for generously sharing their knowledge. Finally, thanks to Juana, Coro and my students at La Laguna.

LICENCE AND COPYRIGHT ^

Copyright (c) 2007 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.

These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

syntax highlighting: