Marty O'Brien > Forks-Super > Forks::Super

NAME

Forks::Super - extensions and convenience methods to manage background processes

VERSION

Version 0.73

SYNOPSIS

    use Forks::Super;
    use Forks::Super MAX_PROC => 5, DEBUG => 1;

    # --- familiar use - parent returns PID>0, child returns zero
    $pid = fork();
    die "fork failed" unless defined $pid;
    if ($pid > 0) {
        # parent code
    } else {
        # child code
    }

    # --- wait for a child process to finish
    $w = wait;                  # blocking wait on any child, $? holds child exit status
    $w = waitpid $pid, 0;       # blocking wait on specific child
    $w = waitpid $pid, WNOHANG; # non-blocking wait, use with POSIX ':sys_wait_h'
    $w = waitpid 0, $flag;      # wait on any process in current process group
    waitall;                    # block until all children are finished

    # -------------- helpful extensions ---------------------
    # fork directly to a shell command. Child doesn't return.
    $pid = fork { cmd => "./myScript 17 24 $n" };
    $pid = fork { exec => [ "/bin/prog" , $file, "-x", 13 ] };
    $pid = fork [ "./myScript", 17, 24, $n ];    # new syntax in v0.72

    # --- fork directly to a Perl subroutine. Child doesn't return.
    $pid = fork { sub => $methodNameOrRef , args => [ @methodArguments ] };
    $pid = fork { sub => \&subroutine, args => [ @args ] };
    $pid = fork { sub => sub { "anonymous sub" }, args => [ @args ] };
    $pid = fork sub { CODE }, %other_options;    # new syntax in v0.72

    # --- impose a time limit on the child process
    $pid = fork { cmd => $command,
                  timeout => 30 };            # kill child if not done in 30s
    $pid = fork { sub => $subRef , args => [ @args ],
                  expiration => 1260000000 }; # complete by 8AM Dec 5, 2009 UTC
    # --- wait and waitpid support timeouts, too
    $pid = wait 3.0;
    print "No child reaped in 5s" if waitpid(0, 0, 5.0) == &Forks::Super::Wait::TIMEOUT;

    # --- run a child process starting from a different directory
    $pid = fork { dir => "some/other/directory",
                  cmd => ["command", "--that", "--runs=somewhere", "else"] };

    # --- obtain standard filehandles for the child process
    $pid = fork { child_fh => "in,out,err,:utf8" };
    if ($pid == 0) {      # child process
       sleep 1;
       $x = <STDIN>; # read from parent's $pid->{child_stdin} (output handle)
       print rand() > 0.5 ? "Yes\n" : "No\n" if $x eq "Clean your room\n";
       sleep 2;
       $i_can_haz_ice_cream = <STDIN>;
       if ($i_can_haz_ice_cream !~ /you can have ice cream/ && rand() < 0.5) {
          print STDERR '@#$&#$*&#$*&',"\n";
       }
       exit 0;
    } # else parent process
    $child_stdin = $pid->{child_stdin};
    $child_stdin = $Forks::Super::CHILD_STDIN{$pid}; # alternate, deprecated
    print $child_stdin "Clean your room\n";
    sleep 2;
    $child_stdout = $pid->{child_stdout};
    # -or- $child_stdout = $Forks::Super::CHILD_STDOUT{$pid}; # deprecated
    $child_response = <$child_stdout>; # -or-: Forks::Super::read_stdout($pid);
    if ($child_response eq "Yes\n") {
       print $child_stdin "Good boy. You can have ice cream.\n";
    } else {
       print $child_stdin "Bad boy. No ice cream for you.\n";
       sleep 2;
       $child_err = Forks::Super::read_stderr($pid);
       # -or-  $child_err = $pid->read_stderr();
       # -or-  $child_err = readline($pid->{child_stderr});
       print $child_stdin "And no back talking!\n" if $child_err;
    }

    # --- retrieve variable values from a child process
    $pid1 = fork { share => [ \$scalar, \@list ], sub => \&method };
    $pid2 = fork { share => [ \@list, \%hash ], sub => \&someOtherMethod };
    waitpid $pid1, 0;
    waitpid $pid2, 0;
    # now $scalar is set to value in 1st job, @list has values from both jobs, %hash has values from 2nd job

    # ---------- manage jobs and system resources ---------------
    # --- runs 100 tasks but fork call blocks while there are already 5 active jobs
    $Forks::Super::MAX_PROC = 5;
    $Forks::Super::ON_BUSY = 'block';
    for ($i=0; $i<100; $i++) {
       $pid = fork { cmd => $task[$i] };
    }

    # --- jobs fail (without blocking) if the system is too busy
    $Forks::Super::MAX_LOAD = 2.0;
    $Forks::Super::ON_BUSY = 'fail';
    $pid = fork { cmd => $task };
    if    ($pid > 0) { print "'$task' is running\n" }
    elsif ($pid < 0) { print "current CPU load > 2.0: didn't start '$task'\n" }

    # $Forks::Super::MAX_PROC setting can be overridden.
    # Start job immediately if < 3 jobs running
    $pid = fork { sub => 'MyModule::MyMethod', args => [ @b ], max_proc => 3 };

    # --- try to fork no matter how busy the system is
    $pid = fork { sub => \&MyMethod, force => 1 };

    # when system is busy, queue jobs. When system is not busy,
    #     some jobs on the queue will start.
    # if job is queued, return value from fork() is a very negative number
    $Forks::Super::ON_BUSY = 'queue';
    $pid = fork { cmd => $command };
    $pid = fork { cmd => $useless_command, queue_priority => -5 };
    $pid = fork { cmd => $important_command, queue_priority => 5 };
    $pid = fork { cmd => $future_job, delay => 20 };  # queue job for at least 20s

    # --- assign descriptive names to tasks
    $pid1 = fork { cmd => $command, name => "my task" };
    $pid2 = waitpid "my task", 0;
    $num_signalled = Forks::Super::kill 'TERM', "my task";

    # --- run callbacks at various points of job life-cycle
    $pid = fork { cmd => $command, callback => \&on_complete };
    $pid = fork { sub => $sub, args => [ @args ],
                  callback => { start => 'on_start', finish => \&on_complete,
                                queue => sub { print "Job $_[1] queued\n" } } };

    # --- set up dependency relationships
    $pid1 = fork { cmd => $job1 };
    $pid2 = fork { cmd => $job2,
                   depend_on => $pid1 };            # put on queue until job 1 is complete
    $pid4 = fork { cmd => $job4,
                   depend_start => [$pid2,$pid3] }; # put on queue until jobs 2,3 have started

    $pid5 = fork { cmd => $job5, name => "group C" };
    $pid6 = fork { cmd => $job6, name => "group C" };
    $pid7 = fork { cmd => $job7,
                   depend_on => "group C" };        # wait for jobs 5 & 6 to complete

    # --- manage OS settings on jobs -- may not be available on all systems
    $pid1 = fork { os_priority => 10 };   # like nice(1) on Un*x
    $pid2 = fork { cpu_affinity => 0x5 }; # background task to prefer CPUs #0,2

    # --- job information
    $state = Forks::Super::state($pid);   # one of qw(ACTIVE DEFERRED COMPLETE REAPED)
    $status = Forks::Super::status($pid); # exit status ($?) for completed jobs

    # --- evaluate long running expressions in the background
    $result = bg_eval { a_long_running_calculation() };
    # sometime later ...
    print "Result was $result\n";

    $result = bg_qx( "./long_running_command" );
    # ... do something else for a while and when you need the output ...
    print "output of long running command was: $result\n";

    # if you need bg_eval or bg_qx functionality in list context ...
    tie @result, BG_EVAL, sub { long_running_calc() };
    tie @output, BG_QX, "./long_running_cmd";


    # --- convenience methods, compare to IPC::Open2, IPC::Open3
    my ($fh_in, $fh_out, $pid, $job) = Forks::Super::open2(@command);
    my ($fh_in, $fh_out, $fh_err, $pid, $job)
            = Forks::Super::open3(@command, { timeout => 60 });

    # --- run a background process as a *daemon*
    $job = fork { cmd => $cmd, daemon => 1 };

DESCRIPTION

This package provides new definitions for the Perl functions fork, wait, and waitpid with richer functionality. The new features are designed to make it more convenient to spawn background processes and more convenient to manage them to get the most out of your system's resources.

$pid = fork( \%options )

Attempts to spawn a new process. On success, it returns to the calling process a Forks::Super::Job object with information about the background task. This object is overloaded so that in any numeric or string context it behaves like the process ID of the new process, which lets Forks::Super::fork be used as a drop-in replacement for the builtin Perl fork call.

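With no arguments, it behaves the same as the Perl fork() system call: it returns the child's process ID to the parent, returns zero to the child, and returns undef if the fork fails.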
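For reference, here is the builtin pattern that a no-argument call is described as mirroring. This is a minimal sketch using only core Perl on a system with a working fork; the exit code 7 is arbitrary and chosen only so the decoded status is visible.

```perl
use strict;
use warnings;

# Builtin fork/waitpid. A no-argument Forks::Super fork is documented
# to follow the same parent/child return-value convention.
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # child: do some work, then exit with an arbitrary status
    exit 7;
}

# parent: block until this child finishes, then decode $?
my $reaped    = waitpid $pid, 0;
my $exit_code = $? >> 8;
print "child $reaped exited with status $exit_code\n";
```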

Options for instructing the child process

The fork call supports three options, "cmd", "exec", and "sub" (or sub/args) that will instruct the child process to carry out a specific task. Using any of these options causes the child process not to return from the fork call.

cmd

$child_pid = fork { cmd => $shell_command }
$child_pid = fork { cmd => \@shell_command }

On successful launch of the child process, runs the specified shell command in the child process with the Perl system() function. When the system call is complete, the child process exits with the same exit status that was returned by the system call.

Returns the PID of the child process to the parent process. Does not return from the child process, so you do not need to check the fork() return value to determine whether code is executing in the parent or child process.

See "Alternate fork syntax", below, for an alternate way of specifying a command to run in a background process.

exec

$child_pid = fork { exec => $shell_command }
$child_pid = fork { exec => \@shell_command }

Like the "cmd" option, but the background process launches the shell command with exec instead of with system.

Using exec instead of cmd will usually spawn one fewer process. Prior to v0.55, the "timeout" and "expiration" options (see "Options for simple job management") could not be used with the exec option, but that incompatibility has been fixed.

sub

$child_pid = fork { sub => $subroutineName [, args => \@args ] }
$child_pid = fork { sub => \&subroutineReference [, args => \@args ] }
$child_pid = fork { sub => sub { ... code ... } [, args => \@args ] }

On successful launch of the child process, fork invokes the specified Perl subroutine with the specified set of method arguments (if provided) in the child process. If the subroutine completes normally, the child process exits with a status of zero. If the subroutine exits abnormally (i.e., if it dies, or if the subroutine invokes exit with a non-zero argument), the child process exits with non-zero status.

Returns the PID of the child process to the parent process. Does not return from the child process, so you do not need to check the fork() return value to determine whether code is running in the parent or child process.

If none of the "cmd", "exec", or "sub" options is provided to the fork call, then the fork() call behaves like a standard Perl fork() call, returning the child PID to the parent and also returning zero to a new child process.

See "Additional fork syntax", below, for an alternate way of specifying a subroutine to run in the child process.

Options for simple job management

timeout

expiration

fork { timeout => $delay_in_seconds }
fork { expiration => $timestamp_in_seconds_since_epoch_time }

Puts a deadline on the child process and causes the child to die if it has not completed by the deadline. With the timeout option, you specify that the child process should not survive longer than the specified number of seconds. With expiration, you are specifying an epoch time (like the one returned by the time function) as the child process's deadline.

If the setpgrp() system call is implemented on your system, then this module will try to reset the process group ID of the child process. On timeout, the module will attempt to kill off all subprocesses of the expiring child process.

If the deadline is some time in the past (if the timeout is not positive, or the expiration is earlier than the current time), then the child process will die immediately after it is created.
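The relationship between the two options can be sketched in plain Perl. The helpers below (deadline_for, already_expired) are hypothetical names invented for this illustration and are not part of Forks::Super; they only model the rules stated above, using a fixed "current time" so the result is repeatable.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: turn either option into an absolute epoch deadline.
sub deadline_for {
    my ($now, %opt) = @_;
    return $opt{expiration}     if defined $opt{expiration};
    return $now + $opt{timeout} if defined $opt{timeout};
    return;    # no deadline requested
}

# True for the cases described above as "some time in the past":
# a non-positive timeout, or an expiration earlier than the current time.
sub already_expired {
    my ($now, %opt) = @_;
    return 1 if defined $opt{timeout}    && $opt{timeout} <= 0;
    return 1 if defined $opt{expiration} && $opt{expiration} < $now;
    return 0;
}

my $now      = 1259999970;    # fixed stand-in for time()
my $deadline = deadline_for($now, timeout => 30);
print strftime("%Y-%m-%d %H:%M:%S", gmtime($deadline)), " UTC\n";
print already_expired($now, timeout => 0)             ? "expired\n" : "pending\n";
print already_expired($now, expiration => 1260000000) ? "expired\n" : "pending\n";
```

With these inputs, timeout => 30 and expiration => 1260000000 describe the same deadline.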

This feature usually uses Perl's alarm call and installs its own handler for SIGALRM, but an alternate "poor man's alarm" is available. If you wish to use the timeout or expiration feature with a child sub that also uses alarm/SIGALRM, or on a system that has issues with alarm, you can also pass the option use_alternate_alarm => 1 to force Forks::Super to use the alternate alarm.

If you have installed the DateTime::Format::Natural module, then you may also specify the timeout and expiration options using natural language:

    $pid = fork { timeout => "in 5 minutes", sub => ... };

    $pid = fork { expiration => "next Wednesday", cmd => $long_running_cmd };

dir

fork { dir => $directory }
fork { chdir => $directory }

Causes the child process to be run from a different directory than the parent.

If the specified directory does not exist or if the chdir call fails (e.g., if the caller does not have permission to change to the directory), then the child process immediately exits with a non-zero status.

chdir and dir are synonyms.

env

fork { env => \%values }

Passes additional environment variable settings to a child process.

umask

fork { umask => $mask }

Sets the "umask" of the background process to specify the default permissions of files and directories created by the background process. See "umask" in perlfunc and umask(1).

As it is with the Perl builtin function, the $mask argument is a number, usually given in octal form, but it is not a string of octal digits. So

    fork { umask => "0775" , ... }

will probably not do what you want. Instead, use one of

    fork { umask => 0775, ... }
    fork { umask => 509, ... }     # 509 == 0775
    fork { umask => oct "0775", ... }
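The difference is easy to see in plain Perl, without forking anything. A short sketch:

```perl
use strict;
use warnings;

my $literal = 0775;          # octal literal
my $string  = "0775";        # string of digits; numifies as decimal 775
my $parsed  = oct "0775";    # string explicitly interpreted as octal

printf "literal=%d string=%d parsed=%d\n", $literal, $string, $parsed;

# What the string form would mean if it were used as a mask:
printf "decimal 775 as an octal mask: %04o\n", $string;
```

The string form is read as the decimal number 775, which is octal 1407, a very different mask from 0775.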

delay

start_after

fork { delay => $delay_in_seconds }
fork { start_after => $timestamp_in_epoch_time }

Causes the child process to be spawned at some time in the future. The return value from a fork call that uses these features will not be a process id, but it will be a very negative number called a job ID. See the section on "Deferred processes" for information on what to do with a job ID.

A deferred job will start no earlier than its appointed time in the future. Depending on the circumstances under which the queued jobs are examined, the actual start time of the job could be significantly later than the appointed time.

A job may have both a minimum start time (through delay or start_after options) and a maximum end time (through "timeout" and "expiration"). Jobs with inconsistent times (end time is not later than start time) will be killed off as soon as they are created.

As with the "timeout" and "expiration" options, the delay and start_after options can be expressed in natural language if you have installed the DateTime::Format::Natural module.

    $pid = fork { start_after => "12:25pm tomorrow",  sub => ... };

    $pid = fork { delay => "in 7 minutes", cmd => ... };

child_fh

$pid = fork { child_fh => $fh_spec }
$pid = fork { child_fh => [ @fh_spec ] }

Launches a child process and makes the child process's STDIN, STDOUT, and/or STDERR filehandles available to the parent process in the instance members $pid->{child_stdin}, $pid->{child_stdout}, and $pid->{child_stderr}, or in the package variables $Forks::Super::CHILD_STDIN{$pid}, $Forks::Super::CHILD_STDOUT{$pid}, and/or $Forks::Super::CHILD_STDERR{$pid}. $pid is the return value from the fork call. This feature makes it possible, even convenient, for a parent process to communicate with a child, as this contrived example shows.

    $pid = fork { sub => \&pig_latinize, timeout => 10,
                  child_fh => "all" };

    # in the parent, $Forks::Super::CHILD_STDIN{$pid} ($pid->{child_stdin})
    # is an **output** filehandle

    print {$pid->{child_stdin}} "The blue jay flew away in May\n";

    sleep 2; # give child time to start up and get ready for input

    # and $Forks::Super::CHILD_STDOUT{$pid} ($pid->{child_stdout}) and
    # $Forks::Super::CHILD_STDERR{$pid} ($pid->{child_stderr})
    # are **input** handles.

    $result = readline( $pid->{child_stdout} );
    print "Pig Latin translator says: $result\n"; # ==> eThay ueblay ayjay ewflay awayay inay ayMay\n
    @errors = readline( $pid->{child_stderr} );
    print "Pig Latin translator complains: @errors\n" if @errors > 0;

    sub pig_latinize {
      for (;;) {
        while (<STDIN>) {
          foreach my $word (split /\s+/) {
            if ($word =~ /^qu/i) {
              print substr($word,2) . substr($word,0,2) . "ay";  # STDOUT
            } elsif ($word =~ /^([b-df-hj-np-tv-z][b-df-hj-np-tv-xz]*)/i) {
              my $prefix = $1;
              $word =~ s/[b-df-hj-np-tv-z][b-df-hj-np-tv-xz]*//i;
              print $word . $prefix . "ay";
            } elsif ($word =~ /^[aeiou]/i) {
              print $word . "ay";
            } else {
              print STDERR "Didn't recognize this word: $word\n";
            }
            print " ";
          }
          print "\n";
        }
      }
    }

The set of filehandles to make available are specified either as a non-alphanumeric delimited string, or list reference. This spec may contain one or more of the words:

    in
    out
    err
    join
    all
    socket
    pipe
    block
    :<layer>

in, out, and err mean that the child's STDIN, STDOUT, and STDERR, respectively, will be available in the parent process through the filehandles in $Forks::Super::CHILD_STDIN{$pid}, $Forks::Super::CHILD_STDOUT{$pid}, and $Forks::Super::CHILD_STDERR{$pid}, where $pid is the child's process ID. all is a convenient way to specify in, out, and err. join specifies that the child's STDOUT and STDERR will be returned through the same filehandle, specified as both $Forks::Super::CHILD_STDOUT{$pid} and $Forks::Super::CHILD_STDERR{$pid}.

If socket is specified, then local sockets will be used to pass data between parent and child instead of temporary files.

If pipe is specified, then local pipes will be used to pass data between parent and child instead of temporary files.

If block is specified, then the read end of each filehandle will block until input is available. Note that this can lead to deadlock unless the I/O of the write end of the filehandle is carefully managed.

:<layer> may be any valid PerlIO I/O layer, such as :crlf, :utf8, :gzip, etc. Some I/O layers may not work well with socket and pipe IPC. And of course they will not work well with Perl v<=5.6 and its poorer support for I/O layers.

See also: "write_stdin", "read_stdout", "read_stderr".

Socket handles vs. file handles vs. pipes

Here are some things to keep in mind when deciding whether to use sockets, pipes, or regular files for parent-child IPC:

Socket and file handle gotchas

Some things to keep in mind when using socket or file handles to communicate with a child process.

stdin

fork { stdin => $input }

Provides the data in $input as the child process's standard input. Equivalent to, but a little more efficient than:

    $pid = fork { child_fh => "in", sub => sub { ... } };
    Forks::Super::write_stdin($pid, $input);

$input may either be a scalar, a reference to a scalar, or a reference to an array.

stdout

stderr

fork { stdout => \$output }
fork { stderr => \$errput }

On completion of the background process, loads the standard output and standard error of the child process into the given scalar references. If you do not need to use the child's output while the child is running, it could be more convenient to use this construction than calling Forks::Super::read_stdout($pid) (or readline($pid->{child_stdout})) to obtain the child's output.

retries

fork { retries => $max_retries }

If the underlying system fork call fails (returns undef), pause for a short time and retry up to $max_retries times.

This feature is probably not that useful. A failed fork call usually indicates some bad system condition (too many processes, system out of memory or swap space, impending kernel panic, etc.) where your expectations of recovery should not be too high.

Options for complicated job management

The fork() call from this module supports options that help to manage child processes or groups of child processes in ways to better manage your system's resources. For example, you may have a lot of tasks to perform in the background, but you don't want to overwhelm your (possibly shared) system by running them all at once. There are features to control how many, how, and when your jobs will run.

name

fork { name => $name }

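Attaches a string identifier to the job. The identifier can be used for several purposes, for example to wait on the job (waitpid "my task", 0), to signal it (Forks::Super::kill 'TERM', "my task"), or to name it as a dependency of a later job (see "depend_on").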

daemon

fork { daemon => 1 }

Launches the background process as a daemon, partially severing the relationship between the parent and child process.

Features of daemon process:

  • closes all open file descriptors from the parent
  • begins in root directory "/" unless the dir => ... option is specified
  • has umask of zero unless umask => ... option specified
  • daemon will not be affected by signals to the parent

The following restrictions apply to daemon processes:

Also note that daemon processes will not count against the $Forks::Super::MAX_PROC limits.

max_proc

$Forks::Super::MAX_PROC = $max_simultaneous_jobs
fork { max_proc => $max_simultaneous_jobs }

Specifies the maximum number of background processes that you want to allow to run simultaneously. If a fork call is attempted while there are already the maximum number of child processes running, then the fork() call will either block (until some child processes complete), fail (return a negative value without spawning the child process), or queue the job (returning a very negative value called a job ID), according to the specified "on_busy" behavior (see "on_busy", below). See the "Deferred processes" section for information about how queued jobs are handled.

On any individual fork call, the maximum number of processes may be overridden by also specifying max_proc or "force" options.

    $Forks::Super::MAX_PROC = 8;
    # launch 2nd job only when system is very not busy
    # always launch 3rd job no matter how busy we are
    $pid1 = fork { sub => 'method1' };
    $pid2 = fork { sub => 'method2', max_proc => 1 };
    $pid3 = fork { sub => 'method3', force => 1 };

Setting $Forks::Super::MAX_PROC to zero or a negative number will disable the check for too many simultaneous processes.

max_fork is a synonym for max_proc.

max_load

$Forks::Super::MAX_LOAD = $max_cpu_load
fork { max_load => $max_cpu_load }

Specifies a maximum CPU load threshold. The fork command will not spawn any new jobs while the current system CPU load is larger than this threshold. CPU load checks are disabled if this value is set to zero or to a negative number.

Note that the metric of "CPU load" is different on different operating systems. On Windows (including Cygwin), the metric is CPU utilization, which is always a value between 0 and 1. On Unix-ish systems, the metric is the 1-minute system load average, which could be a value larger than 1. Also note that the 1-minute average load measurement has a lot of inertia -- after a CPU intensive task starts or stops, it will take at least several seconds for that change to impact the 1-minute utilization.

If your system does not have a well-behaved uptime(1) command, then you may need to install the Sys::CpuLoadX module to use this feature. For now, the Sys::CpuLoadX module is only available bundled with Forks::Super and otherwise cannot be downloaded from CPAN.

on_busy

$Forks::Super::ON_BUSY = "block" | "fail" | "queue"
fork { on_busy => "block" | "fail" | "queue" }

Dictates the behavior of fork in the event that the module is not allowed to launch the specified job for whatever reason. If you are using Forks::Super to throttle (see max_proc, $Forks::Super::MAX_PROC) or impose dependencies on (see depend_start, depend_on) background processes, then failure to launch a job should be expected.

block

If the module cannot create a new child process for the specified job, it will wait and periodically retry to create the child process until it is successful. Unless a system fork call is attempted and fails, fork calls that use this behavior will return a positive PID.

fail

If the module cannot immediately create a new child process for the specified job, the fork call will return with a small negative value.

queue

If the module cannot create a new child process for the specified job, the job will be deferred, and an attempt will be made to launch the job at a later time. See "Deferred processes" below. The return value will be a very negative number (job ID).

On any individual fork call, the default launch failure behavior specified by $Forks::Super::ON_BUSY can be overridden by specifying an on_busy option:

    $Forks::Super::ON_BUSY = "fail";
    $pid1 = fork { sub => 'myMethod' };
    $pid2 = fork { sub => 'yourMethod', on_busy => "queue" }; 

Note that jobs that use any of the "delay", "start_after", "depend_on", or "depend_start" options ignore this setting and always put the job on the deferred job queue (unless a different on_busy attribute is explicitly provided).
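What the "block" setting amounts to can be approximated with the builtin fork and wait. The following is a rough sketch under that assumption, not the module's implementation; run_throttled and its arguments are names invented for the example, and it needs a system with a working fork.

```perl
use strict;
use warnings;

# Run each task (a code ref) in its own child, but never keep more than
# $max_proc children alive at once. Returns the peak number of children
# that were active at the same time.
sub run_throttled {
    my ($max_proc, @tasks) = @_;
    my %active;
    my $peak = 0;
    for my $task (@tasks) {
        # "block": wait for a child to finish before launching another
        while (keys(%active) >= $max_proc) {
            my $done = wait;
            last if $done == -1;    # no children left to wait for
            delete $active{$done};
        }
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {            # child: run the task and leave
            $task->();
            exit 0;
        }
        $active{$pid} = 1;
        $peak = keys %active if keys(%active) > $peak;
    }
    # reap whatever is still running
    while ((my $done = wait) != -1) {
        delete $active{$done};
    }
    return $peak;
}

my @tasks = map { sub { select undef, undef, undef, 0.05 } } 1 .. 6;
my $peak  = run_throttled(2, @tasks);
print "peak concurrency: $peak\n";
```

A "fail" policy would return immediately instead of entering the inner wait loop, and a "queue" policy would record the task for a later launch attempt.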

force

fork { force => $bool }

If the force option is set, the fork call will disregard the usual criteria for deciding whether a job can spawn a child process, and will always attempt to create the child process.

queue_priority

fork { queue_priority => $priority }

In the event that a job cannot immediately create a child process and is put on the job queue (see "Deferred processes"), the queue_priority option specifies the relative priority of the job on the job queue. In general, eligible jobs with high priority values will be started before jobs with lower priority values.
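The ordering rule can be modeled in a few lines of plain Perl. This is an illustrative model only: the hash fields, the launch_order function, and the tie-breaking by submission order are assumptions made for the sketch, not documented behavior of the module.

```perl
use strict;
use warnings;

# Each queued job is a hash with hypothetical fields: name,
# queue_priority, and an optional start_after (epoch seconds).
sub launch_order {
    my ($now, @queue) = @_;
    my %pos;
    @pos{ map { $_->{name} } @queue } = 0 .. $#queue;

    # only jobs whose start time has arrived are eligible
    my @eligible = grep {
        !defined $_->{start_after} || $_->{start_after} <= $now
    } @queue;

    # higher priority first; ties keep submission order (an assumption)
    return map { $_->{name} }
        sort {
            $b->{queue_priority} <=> $a->{queue_priority}
                || $pos{ $a->{name} } <=> $pos{ $b->{name} }
        } @eligible;
}

my @order = launch_order(
    1000,
    { name => 'ordinary',  queue_priority => 0 },
    { name => 'useless',   queue_priority => -5 },
    { name => 'important', queue_priority => 5 },
    { name => 'future',    queue_priority => 9, start_after => 1020 },
);
print "@order\n";    # important ordinary useless
```

The 'future' job has the highest priority but is not yet eligible, so it does not appear in the launch order at time 1000.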

depend_on

depend_start

fork { depend_on => $id }
fork { depend_on => [ $id_1, $id_2, ... ] }
fork { depend_start => $id }
fork { depend_start => [ $id_1, $id_2, ... ] }

Indicates a dependency relationship between the job in this fork call and one or more other jobs. The identifiers may be process/job IDs or "name" attributes (see above) from earlier fork calls.

If a fork call specifies a depend_on option, then that job will be deferred until all of the child processes specified by the process or job IDs have completed. If a fork call specifies a depend_start option, then that job will be deferred until all of the child processes specified by the process or job IDs have started.

Invalid process and job IDs in a depend_on or depend_start setting will produce a warning message but will not prevent a job from starting.

Dependencies are established at the time of the fork call and can only apply to jobs that are known at run time. So for example, in this code,

    $job1 = fork { cmd => $cmd, name => "job1", depend_on => "job2" };
    $job2 = fork { cmd => $cmd, name => "job2", depend_on => "job1" };

at the time the first job is created, the job named "job2" has not been created yet, so the first job will not have a dependency (and a warning will be issued when the job is created). This may be a limitation but it also guarantees that there will be no circular dependencies.

When a dependency identifier is a name attribute that applies to multiple jobs, the job will be dependent on all existing jobs with that name:

    # Job 3 will not start until BOTH job 1 and job 2 are done
    $job1 = fork { name => "Sally", ... };
    $job2 = fork { name => "Sally", ... };
    $job3 = fork { depend_on => "Sally", ... };

    # all of these jobs have the same name and depend on ALL previous jobs
    $job4 = fork { name => "Ralph", depend_start => "Ralph", ... }; # no dependencies
    $job5 = fork { name => "Ralph", depend_start => "Ralph", ... }; # depends on Job 4
    $job6 = fork { name => "Ralph", depend_start => "Ralph", ... }; # depends on #4 and #5

By default, jobs with dependencies go onto the job queue when they cannot start, ignoring the value of $Forks::Super::ON_BUSY (but not ignoring the on_busy attribute passed to the job, if any).
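The rule that dependencies are resolved against the jobs that exist at the time of the fork call can be modeled with a toy registry. Everything here (register_job, resolve, the id/name/deps fields) is invented for illustration and says nothing about the module's internals.

```perl
use strict;
use warnings;

my @jobs;    # toy registry, in creation order

# Look up an identifier (id or name) among jobs created so far.
sub resolve {
    my ($ident) = @_;
    my @ids = map { $_->{id} }
        grep {
            $_->{id} eq $ident
                || (defined $_->{name} && $_->{name} eq $ident)
        } @jobs;
    warn "unknown dependency '$ident' ignored\n" unless @ids;
    return @ids;
}

# Record a job; its dependencies are fixed at creation time.
sub register_job {
    my (%opt) = @_;
    my @deps = map { resolve($_) } @{ $opt{depend_on} || [] };
    my $id   = @jobs + 1;
    push @jobs, { id => $id, name => $opt{name}, deps => \@deps };
    return $id;
}

register_job(name => 'job1',  depend_on => ['job2']);     # warns: job2 unknown
register_job(name => 'job2',  depend_on => ['job1']);     # depends on job 1
register_job(name => 'Ralph', depend_on => ['Ralph']);    # warns: no Ralph yet
register_job(name => 'Ralph', depend_on => ['Ralph']);    # depends on job 3
register_job(name => 'Ralph', depend_on => ['Ralph']);    # depends on jobs 3, 4

for my $job (@jobs) {
    my $deps = @{ $job->{deps} } ? "@{ $job->{deps} }" : 'none';
    print "job $job->{id} ($job->{name}) depends on: $deps\n";
}
```

Because a job can only depend on jobs registered before it, the model cannot produce a cycle.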

can_launch

fork { can_launch => \&methodName }
fork { can_launch => sub { ... anonymous sub ... } }

Supply a user-specified function to determine when a job is eligible to be started. The function supplied should return 0 if a job is not eligible to start and non-zero if it is eligible to start.

During a fork call or when the job queue is being examined, the user's can_launch method will be invoked with a single Forks::Super::Job argument containing information about the job to be launched. User code may make use of the default launch determination method by invoking the _can_launch method of the job object:

    # Running on a BSD system with the uptime(1) call.
    # Want to block jobs when the current CPU load
    # (1 minute) is greater than 4 and respect all other criteria:
    fork { cmd => $my_command,
           can_launch => sub {
             $job = shift;                    # a Forks::Super::Job object
             return 0 if !$job->_can_launch;  # default
             $cpu_load = (split /\s+/,`uptime`)[-3]; # get 1 minute avg CPU load
             return 0 if $cpu_load > 4.0;     # system too busy. let's wait
             return 1;
           } }
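The positional split above depends on the exact layout of the uptime output, which differs between systems (some print the averages separated by commas). A slightly more defensive parse might look like the sketch below. The one_minute_load function is a hypothetical helper, and the two sample strings are made up to resemble typical output; they are not captured from a real system.

```perl
use strict;
use warnings;

# Pull the 1-minute figure out of uptime-style text. Returns undef if
# no "load average:" or "load averages:" field is found.
sub one_minute_load {
    my ($text) = @_;
    return $text =~ /load averages?:\s*([0-9]+(?:\.[0-9]+)?)/
        ? $1 + 0
        : undef;
}

# Made-up samples in two common layouts
my $comma_style = " 10:02:11 up 3 days,  2:41,  2 users,  load average: 0.42, 0.36, 0.30";
my $space_style = "10:02  up 3 days,  2:41, 2 users, load averages: 4.25 3.90 3.70";

for my $sample ($comma_style, $space_style) {
    my $load = one_minute_load($sample);
    print defined $load ? "load=$load\n" : "load unknown\n";
}
```

Inside a can_launch callback, an undefined result could be treated as "not too busy" or "too busy", depending on which failure mode is safer for the job.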

callback

fork { callback => $subroutineName }
fork { callback => sub { BLOCK } }
fork { callback => { start => ..., finish => ..., queue => ..., fail => ... } }

Install callbacks to be run when and if certain events in the life cycle of a background process occur. The first two forms of this option are equivalent to

    fork { callback => { finish => ... } }

and specify code that will be executed when a background process is complete and the module has received its SIGCHLD event. A start callback is executed just after a new process is spawned. A queue callback is run if and only if the job is deferred for any reason (see "Deferred processes") and the job is placed onto the job queue for the first time. And the fail callback is run if the job is not going to be launched (that is, a case where the fork call would return -1).

Callbacks are invoked with two arguments: the Forks::Super::Job object that was created with the original fork call, and the job's ID (the return value from fork).

You should keep your callback functions short and sweet, like you do for your signal handlers. Sometimes callbacks are invoked from a signal handler, and the processing of other signals could be delayed if the callback functions take too long to run.

suspend

fork { suspend => 'subroutineName' }
fork { suspend => \&subroutineName }
fork { suspend => sub { ... anonymous sub ... } }

Registers a callback function that can indicate when a background process should be suspended and when it should be resumed. The callback function will receive one argument -- the Forks::Super::Job object that owns the callback -- and is expected to return a numerical value. The callback function will be evaluated periodically (for example, during the productive downtime of a "wait"/"waitpid" call or Forks::Super::Util::pause() call).

When the callback function returns a negative value and the process is active, the process will be suspended.

When the callback function returns a positive value while the process is suspended, the process will be resumed.

When the callback function returns 0, the job will remain in its current state.

    my $pid = fork { exec => "run-the-heater",
                     suspend => sub {
                       my $t = get_temperature(); # in degrees Fahrenheit
                       if ($t < 68) {
                           return +1;  # too cold, make sure heater is on
                       } elsif ($t > 72) {
                           return -1;  # too warm, suspend the heater process
                       } else {
                           return 0;   # leave it on or off
                       }
                    } };
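The effect of the three return values can be traced with a small state model. This is a sketch of the rule as described above, not of the module's code; next_state, heater_callback, the 'SUSPENDED' label, and the temperature readings are all invented for the example.

```perl
use strict;
use warnings;

# Negative suspends an active job, positive resumes a suspended one,
# and zero (or a value that does not apply) changes nothing.
sub next_state {
    my ($state, $rv) = @_;
    return 'SUSPENDED' if $state eq 'ACTIVE'    && $rv < 0;
    return 'ACTIVE'    if $state eq 'SUSPENDED' && $rv > 0;
    return $state;
}

# Same thresholds as the heater example, with the temperature passed in
sub heater_callback {
    my ($t) = @_;
    return $t < 68 ? +1 : $t > 72 ? -1 : 0;
}

my $state = 'ACTIVE';
my @trace;
for my $t (65, 70, 74, 71, 66) {
    $state = next_state($state, heater_callback($t));
    push @trace, "$t:$state";
}
print "@trace\n";
# 65:ACTIVE 70:ACTIVE 74:SUSPENDED 71:SUSPENDED 66:ACTIVE
```

The band between 68 and 72 degrees gives the callback hysteresis, so the process is not suspended and resumed on every small temperature change.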

share

fork { share => [ list-of-references ] }

Allows variables in the parent process to be updated when the child exits.

Input is a listref of references -- scalar, list, or hash references -- that may be updated in a child process. When the child process finishes, the values of these variables in the parent are updated with the values that were in the child on its exit. The value of a scalar variable will be overwritten with the child value; arrays and hashes will have the child values appended to them.

    use Forks::Super;
    my $a = 'old value';
    my @a = 1..5;
    my %a = (abc => 'def');
    $job = fork {
        share => [ \$a, \@a, \%a ],
        sub => sub {
           $a = 'new value';
           @a = qw(foo bar);
           %a = (bar => 'foo', 19 => 42);
        }
    };
    waitpid $job, 0;
    print "\$a now contains $a\n";     # scalar overwritten => 'new value'
    print "\@a now contains @a\n";     # list appended => 1 2 3 4 5 foo bar
    print "\%a now contains ", join(',', sort keys %a), "\n"; # hash appended => 19,abc,bar

This option is not meaningful when used with the cmd or exec options.

If you use the share option in perl's "taint" mode, you will also need to pass an untaint => 1 option to the fork call.

sync

fork { sync => $n }
fork { sync => 'string' }
fork { sync => \@list }

Creates one or more synchronization objects that will be accessible to both the parent and child processes.

The argument to the sync option is either a number, a string consisting of 'C', 'P', and 'N' characters, or a list reference consisting of 'C', 'P', and 'N' elements. For a string or list reference input, the number of synchronization objects created will be the length of the string or length of the list. The values 'C', 'P', and 'N' determine which process initially has exclusive access to each synchronization object after the fork. 'C' means that the child process should begin with exclusive access to the resource, 'P' means that the parent process should begin with exclusive access to the resource, and 'N' means that neither process should have access to the resource after the fork.

Both of these calls create 3 synchronization objects to be shared between a parent and child process. The first resource is initially held by the parent, the second resource is initially held by the child, and the third resource is not held by either process:

    $pid = fork { sync => 'PCN' };
    $pid = fork { sync => ['P','C','N'] };

Using the sync option with a numeric value will create that number of synchronization objects, with none of the objects initially held by either the parent or child process. That is, these three uses of the sync option are equivalent:

    $pid = fork { sync => 2 };
    $pid = fork { sync => 'NN' };
    $pid = fork { sync => ['N','N'] };
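
To make the equivalence concrete, here is a minimal sketch that reduces each accepted form of the sync argument to a plain list of flags. The helper name normalize_sync_spec is hypothetical and is not part of Forks::Super; it only models the rules described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Forks::Super): reduce any accepted
# form of the sync argument to a list of 'P', 'C', and 'N' flags.
sub normalize_sync_spec {
    my ($spec) = @_;
    my @flags;
    if (ref $spec eq 'ARRAY') {
        @flags = @$spec;                 # list reference: one flag per element
    } elsif ($spec =~ /^\d+$/) {
        @flags = ('N') x $spec;          # number: that many unheld objects
    } else {
        @flags = split //, $spec;        # string: one flag per character
    }
    for my $flag (@flags) {
        die "invalid sync flag '$flag'\n" unless $flag =~ /^[CPN]$/;
    }
    return @flags;
}

# The numeric, string, and list forms all describe the same objects.
print join(',', normalize_sync_spec($_)), "\n" for 2, 'NN', 'PCN';
```

The position of each flag in the list is the index of the corresponding synchronization object.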

After the fork, the parent and child processes can acquire and release exclusive access to these objects with the acquire and release methods of the Forks::Super::Job object.

Synchronization objects are useful for coordinating activity between parent and child processes. You could use a synchronization object to coordinate appending to a common file, for example.

    # in parent:
    $job->acquire(0);
    open my $fh, '>>', $common_file;
    print $fh $some_message_from_parent;
    close $fh;
    $job->release(0);

    # in child:
    Forks::Super->acquire(0);
    open my $fh, '>>', $common_file;
    print $fh $some_message_from_child;
    close $fh;
    Forks::Super->release(0);

os_priority

fork { os_priority => $priority }

On supported operating systems, and after the successful creation of the child process, attempt to set the operating system priority of the child process, using your operating system's notion of what priority is.

On unsupported systems, this option is ignored.

cpu_affinity

fork { cpu_affinity => $bitmask }
fork { cpu_affinity => [ @list_of_processors ] }

On supported operating systems with multiple cores, and after the successful creation of the child process, attempt to set the child process's CPU affinity.

In the scalar style of this option, each bit of the bitmask represents one processor. Set a bit to 1 to allow the process to use the corresponding processor, and set it to 0 to disallow the corresponding processor.

For example, to bind a new child process to use CPU #s 2 and 3 on a system with (at least) 4 processors, you would call one of

    fork { cpu_affinity => 12 , ... } ;    # 12 = (1<<2) | (1<<3)
    fork { cpu_affinity => [2,3] , ... };

There may be additional restrictions on the range of valid values for the cpu_affinity option imposed by the operating system. See the Sys::CpuAffinity docs for discussion of some of these restrictions.

This feature requires the Sys::CpuAffinity module. The Sys::CpuAffinity module is bundled with Forks::Super, or it may be obtained from CPAN.
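
The relationship between the two styles of the cpu_affinity option can be sketched with a pair of small conversion routines. These helpers are illustrative only; they are not part of Forks::Super or Sys::CpuAffinity, and they do not check any limits imposed by the operating system.

```perl
use strict;
use warnings;

# Illustrative helpers, not part of Forks::Super or Sys::CpuAffinity.

# Build a bitmask with one bit set per processor number.
sub cpus_to_mask {
    my @cpus = @_;
    my $mask = 0;
    $mask |= 1 << $_ for @cpus;
    return $mask;
}

# List the processor numbers whose bits are set in a bitmask.
sub mask_to_cpus {
    my ($mask) = @_;
    my @cpus;
    for (my $cpu = 0; $mask >> $cpu; $cpu++) {
        push @cpus, $cpu if ($mask >> $cpu) & 1;
    }
    return @cpus;
}

printf "mask for CPUs 2 and 3: %d\n", cpus_to_mask(2, 3);
printf "CPUs in mask 12: %s\n", join(',', mask_to_cpus(12));
```

Note the parentheses around each shift when writing such expressions by hand: in Perl, + binds more tightly than <<.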

debug

undebug

fork { debug => $bool }
fork { undebug => $bool }

Overrides the debugging setting in $Forks::Super::DEBUG (see "MODULE VARIABLES") for this specific job. If specified, the debug parameter controls only whether the module will output debugging information related to the job created by this fork call.

Normally, the debugging settings of the parent, including the job-specific settings, are inherited by child processes. If the undebug option is specified with a non-zero parameter value, then debugging will be disabled in the child process.

Deferred processes

Whenever some condition exists that prevents a fork() call from immediately starting a new child process, an option is to defer the job. Deferred jobs are placed on a queue. At periodic intervals, in response to periodic events, or whenever you invoke the Forks::Super::Queue::check_queue method in your code, the queue will be examined to see if any deferred jobs are eligible to be launched.

Job ID

When a fork() call fails to spawn a child process but instead defers the job by adding it to the queue, the fork() call will return a unique, large negative number called the job ID. The number will be negative and large enough (<= -100000) so that it can be distinguished from any possible PID, Windows pseudo-process ID, process group ID, or fork() failure code.

Although the job ID is not the actual ID of a system process, it may be used like a PID as an argument to "waitpid", as a dependency specification in another fork call's "depend_on" or "depend_start" option, or the other module methods used to retrieve job information (see "Obtaining job information" below). Once a deferred job has been started, it will be possible to obtain the actual PID (or on Windows, the actual pseudo-process ID) of the process running that job.
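
As a rough illustration of how the numeric ranges are kept apart, the following sketch classifies a value returned by fork. The helper classify_fork_result is hypothetical; the -100000 threshold comes from the text above, and everything else is a simplification. In a real program, Forks::Super::isValidPid (described below) is the portable test.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the numeric value returned by fork.
# The -100000 threshold is taken from the text above; the helper itself
# is an illustration and is not part of Forks::Super.
sub classify_fork_result {
    my ($id) = @_;
    return 'failed'   if !defined $id || $id == -1;
    return 'child'    if $id == 0;
    return 'deferred' if $id <= -100_000;
    return 'started';   # a PID, or a negative pseudo-process ID on Windows
}

print "$_ => ", classify_fork_result($_), "\n" for 4321, 0, -1, -100_003, -1234;
```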

Job priority

Every job on the queue will have a priority value. A job's priority may be set explicitly by including the "queue_priority" option in the fork() call, or it will be assigned a default priority near zero. Every time the queue is examined, the queue will be sorted by this priority value and an attempt will be made to launch each job in this order. Note that different jobs may have different criteria for being launched, and it is possible that an eligible low priority job may be started before an ineligible higher priority job.
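
One pass over the queue can be modelled as follows. This is a simplified sketch, not the module's implementation: the job records and the eligible flag are invented for illustration, and it assumes that larger priority values are tried first.

```perl
use strict;
use warnings;

# A simplified model of one pass over the job queue. The job records and
# the 'eligible' flag are invented for illustration. Assumption: larger
# priority values are tried first.
sub examine_queue {
    my @queue = @_;
    my (@launched, @still_queued);
    for my $job (sort { $b->{priority} <=> $a->{priority} } @queue) {
        if ($job->{eligible}) {
            push @launched, $job->{name};
        } else {
            push @still_queued, $job->{name};
        }
    }
    return (\@launched, \@still_queued);
}

my ($launched, $waiting) = examine_queue(
    { name => 'report',  priority => 10, eligible => 0 },  # e.g. waiting on a dependency
    { name => 'cleanup', priority => 1,  eligible => 1 },
    { name => 'backup',  priority => 5,  eligible => 1 },
);
print "launched: @$launched\n";
print "still queued: @$waiting\n";
```

Here the two lower priority jobs start while the highest priority job stays on the queue, because it is not yet eligible to run.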

Queue examination

Certain events in the SIGCHLD handler or in the "wait", "waitpid", and/or "waitall" methods will cause the list of deferred jobs to be evaluated and to start eligible jobs. But this configuration does not guarantee that the queue will be examined on a timely or frequent enough basis. The user may invoke the

    Forks::Super::Queue::check_queue()

method at any time to force the queue to be examined.

Special tips for Windows systems

On POSIX systems (including Cygwin), programs using the Forks::Super module are interrupted when a child process completes. A callback function performs some housekeeping and may perform other duties like trying to dispatch things from the list of deferred jobs.

Windows systems do not have the signal handling capabilities of other systems, so, other things being equal, a script running on Windows will not perform the housekeeping tasks as frequently as a script on other systems.

The method Forks::Super::pause can be used as a drop-in replacement for the Perl sleep call. In a pause function call, the program will check on active child processes, reap the ones that have completed, and attempt to dispatch jobs on the queue.

Calling pause with an argument of 0 is also a valid way of invoking the child handler function on Windows. When used this way, pause returns immediately after running the child handler.

Child processes are implemented differently in Windows than in POSIX systems. The CORE::fork and Forks::Super::fork calls will usually return a pseudo-process ID to the parent process, and this will be a negative value. The Unix idiom of testing whether a fork call returns a positive number needs to be modified on Windows systems by testing whether Forks::Super::isValidPid($pid) returns true, where $pid is the return value from a Forks::Super::fork call.

Alternate fork syntax ^

Since v0.72, the fork function recognizes these additional forms of syntax:

fork \&code, %options

fork \&code, \%options

If the first argument to fork is a code reference, then it is treated like a "sub" argument, and is equivalent to the call

    fork { sub => \&code, %options }

fork \@cmd, %options

fork \@cmd, \%options

If the first argument to fork is an array reference, then it is treated like a "cmd" argument, and is equivalent to the call

    fork { cmd => \@cmd, %options }

OTHER FUNCTIONS ^

Process monitoring and signalling

wait

$reaped_pid = wait [$timeout]

Like the Perl wait system call, blocks until a child process terminates and returns the PID of the deceased process, or -1 if there are no child processes remaining to reap. The exit status of the child is returned in $?.

This version of the wait call can take an optional $timeout argument, which specifies the maximum length of time in seconds to wait for a process to complete. If a timeout is supplied and no process completes before the timeout expires, then the wait function returns the value -1.5 (you can also test if the return value of the function is the same as Forks::Super::TIMEOUT, which is a constant to indicate that a wait call timed out).
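
The three kinds of return value can be told apart as in the sketch below. So that it runs without the module, the sketch defines its own TIMEOUT constant as a stand-in for Forks::Super::TIMEOUT, using the documented value -1.5; the helper describe_wait_result is hypothetical.

```perl
use strict;
use warnings;

# Stand-in for Forks::Super::TIMEOUT so the sketch runs without the module;
# the value -1.5 is the one documented above.
use constant TIMEOUT => -1.5;

# Hypothetical helper: describe the return value of a timed wait call.
sub describe_wait_result {
    my ($result) = @_;
    return 'timed out'           if $result == TIMEOUT;
    return 'no children to reap' if $result == -1;
    return "reaped process $result";
}

print describe_wait_result($_), "\n" for -1.5, -1, 4321;
```

Testing for the timeout value first matters, because a test such as $result < 0 would lump a timeout together with the "no children" case.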

If wait (or "waitpid" or "waitall") is called when all jobs are either complete or suspended, and there is at least one suspended job, then the behavior is governed by the setting of the $Forks::Super::WAIT_ACTION_ON_SUSPENDED_JOBS variable.

waitpid

$reaped_pid = waitpid $pid, $flags [, $timeout]

Waits for a child with a particular PID or a child from a particular process group to terminate and returns the PID of the deceased process, or -1 if there is no suitable child process to reap. If the return value contains a PID, then $? is set to the exit status of that process.

A valid job ID (see "Deferred processes") may be used as the $pid argument to this method. If the waitpid call reaps the process associated with the job ID, the return value will be the actual PID of the deceased child.

Note that the waitpid function can wait on a job ID even when the job associated with that ID is still in the job queue, waiting to be started.

A $pid value of -1 waits for the first available child process to terminate and returns its PID.

A $pid value of 0 waits for the first available child from the same process group of the calling process.

A negative $pid that is not recognized as a valid job ID will be interpreted as a process group ID, and the waitpid function will return the PID of the first available child from the same process group.

On every modern system that I know about, a $flags value of POSIX::WNOHANG is supported to perform a non-blocking wait. See the Perl waitpid documentation.

If the optional $timeout argument is provided, the waitpid function will block for at most $timeout seconds, and return -1.5 (or Forks::Super::TIMEOUT) if a suitable process is not reaped in that time.

waitall

$count = waitall [$timeout]

Blocking wait for all child processes, including deferred jobs that have not started at the time of the waitall call. Return value is the number of processes that were waited on.

If the optional $timeout argument is supplied, the function will block for at most $timeout seconds before returning.

kill

$num_signalled = Forks::Super::kill $signal, @jobsOrPids

A cross-platform process signalling function. Sends "signals" to the background processes specified by process IDs, job names, or Forks::Super::Job objects. Returns the number of jobs that were successfully signalled.

This method "does what you mean" with respect to terminating, suspending, or resuming processes. This method may "send signals" to jobs in the job queue (that don't even have a proper process id yet), or signal processes on Windows systems (which do not have a Unix-like signals framework). The appropriate Windows API calls are used to communicate with Windows processes and threads. It is highly recommended that you install the Win32::API module for this purpose.

See also the Forks::Super::Job::suspend and resume methods. It is preferable (out of portability concerns) to use these methods

    $job->suspend;
    $job->resume;

rather than the equivalent Forks::Super::kill calls:

    Forks::Super::kill 'STOP', $job;
    Forks::Super::kill 'CONT', $job;

kill_all

$num_signalled = Forks::Super::kill_all $signal

Sends a "signal" (see expanded meaning of "signal" in "kill", above) to all relevant processes spawned from the Forks::Super module.

isValidPid

Forks::Super::isValidPid( $pid )

Tests whether the return value of a fork call indicates that a background process has been successfully created or not. On POSIX-y systems it is sufficient to check whether $pid is a positive integer, but isValidPid is a more portable way to test the return value as it also identifies pseudo-process IDs on Windows systems, which are typically negative numbers.

isValidPid will return false for a large negative process id, which the fork call returns to indicate that a job has been deferred (see "Deferred processes"). Of course it is possible that the job will run later and have a valid process id associated with it.

Interprocess communication functions

read_stdout

read_stderr

$line = Forks::Super::read_stdout($pid [,%options] )
@lines = Forks::Super::read_stdout($pid [,%options] )
$line = Forks::Super::read_stderr($pid [, %options])
@lines = Forks::Super::read_stderr($pid [, %options] )
$line = $job->read_stdout( [%options] )
@lines = $job->read_stdout( [%options] )
$line = $job->read_stderr( [%options])
@lines = $job->read_stderr( [%options] )

For jobs that were started with the child_fh => "out" and child_fh => "err" options enabled, read data from the STDOUT and STDERR filehandles of child processes.

Aside from the more readable syntax, these functions may be preferable to some alternate ways of reading from an interprocess I/O handle

    $line  = readline($Forks::Super::CHILD_STDOUT{$pid});
    @lines = readline($job->{child_stdout});
    @lines = readline($Forks::Super::CHILD_STDERR{$pid});
    $line  = readline($job->{child_stderr});

because the read_stdout and read_stderr functions will

  • clear the EOF condition when the parent is reading from the handle faster than the child is writing to it
  • not block.

These functions work in both scalar and list context. If there is no data to read on the filehandle, but the child process is still active and could put more data on the filehandle, these functions return "" (empty string) in scalar context and () (empty list) in list context. If there is no more data on the filehandle and the child process is finished, the return values of the functions will be undef.
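
Because "" and undef mean different things, a read loop should test for definedness rather than truth. The sketch below shows one way to write such a loop. To keep it runnable on its own, it reads from a simulated reader; collect_output and the scripted values are invented for illustration, and in a real program the reader would be something like sub { $job->read_stdout() }.

```perl
use strict;
use warnings;

# Collect lines from a reader until it signals end of output.
# $read is any code reference that follows the convention described above:
# a line when data is ready, "" when the child is alive but silent, and
# undef when the child is finished and the handle is drained.
sub collect_output {
    my ($read, $idle) = @_;
    my @lines;
    while (defined(my $line = $read->())) {
        if ($line eq '') {
            $idle->() if $idle;     # nothing yet; do other work or pause briefly
            next;
        }
        push @lines, $line;
    }
    return @lines;
}

# Simulated reader standing in for sub { $job->read_stdout() }.
my @script = ("first\n", '', '', "second\n", undef);
my @got = collect_output(sub { shift @script });
print "got: $_" for @got;
```

A loop written as while (my $line = ...) would stop at the first empty string and miss any output produced later.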

These methods all take any number of arbitrary key-value pairs as additional arguments. There are currently three recognized options to these methods:

  • block => 0 | 1

    Determines whether blocking I/O is used on the file, socket, or pipe handle. If enabled, the read_stdXXX function will hang until input is available or until the module can determine that the process creating input for that handle has completed. Blocking I/O can lead to deadlocks unless you are careful about managing the process creating input for the handle. The default mode is non-blocking.

  • warn => 0 | 1

    If warnings on the read_stdXXX function are disabled, then some warning messages (reading from a closed handle, reading from a non-existent/unconfigured handle) will be suppressed. Enabled by default.

    Note that the output of the child process may be buffered, and data on the channel that read_stdout and read_stderr read from may not be available until the child process has produced a lot of output, or until the child process has finished. Forks::Super will make an effort to autoflush the filehandles that write from one process and are read in another process, but assuring that arbitrary external commands will flush their output regularly is beyond the scope of this module.

  • timeout => $num_seconds

    On an otherwise non-blocking filehandle, waits up to the specified number of seconds for input to become available.

getc_stdout

getc_stderr

$char = $job->getc_stdout( [%options] )
$char = $job->getc_stderr( [%options] )

Retrieves a single character from a child process output stream, if available. Supports the same block, timeout, and warn options as the "read_stdout" and "read_stderr" functions.

<$job>

The <> operator has been overloaded for the Forks::Super::Job package such that calling

    <$job>

is equivalent to calling

    scalar $job->read_stdout()

(Due to a limitation of overloading in Perl, this construction cannot be used in a list context.)

close_fh

Forks::Super::close_fh($pid)
Forks::Super::close_fh($pid, 'stdin', 'stdout', 'stderr')

Closes the specified open file handles and socket handles for interprocess communication with the specified child process. With no additional arguments, closes all open handles for the process.

Most operating systems impose a hard limit on the number of filehandles that can be opened in a process simultaneously, so you should use this function when you are finished communicating with a child process so that you don't run into that limit.

See also "close_fh" in Forks::Super::Job.

open2

open3

($in,$out,$pid,$job) = Forks::Super::open2( @command [, \%options ] )
($in,$out,$err,$pid,$job) = Forks::Super::open3( @command [, \%options] )

Starts a background process and returns filehandles to the process's standard input and standard output (and standard error in the case of the open3 call). Also returns the process id and the Forks::Super::Job object associated with the background process.

Compare these methods to the main functions of the IPC::Open2 and IPC::Open3 modules.

Many of the options that can be passed to Forks::Super::fork can also be passed to Forks::Super::open2 and Forks::Super::open3:

    # run a command but kill it after 30 seconds
    ($in,$out,$pid) = Forks::Super::open2("ssh me\@mycomputer ./runCommand.sh", { timeout => 30 });

    # invoke a callback when command ends
    ($in,$out,$err,$pid,$job) = Forks::Super::open3(@cmd, {callback => sub { print "\@cmd finished!\n" }});

bg_eval

$result = bg_eval { BLOCK }
$result = bg_eval { BLOCK } { option => value, ... }

API change since v0.43: In scalar context, the result is now an overloaded object that retrieves its value when it is used in an expression. It is no longer a scalar reference that retrieves its value when it is dereferenced.

API change since v0.52: list context is no longer directly supported. To retrieve results of a code block evaluated in the background in list context, see Forks::Super::bg_eval tied class.

Launches a block of code in a background process, returning immediately. The next time the result of the function call is referenced, interprocess communication is used to retrieve the result of the child process, waiting until the child finishes, if necessary.

    $result = bg_eval { sleep 3; return 42 };  # this line returns immediately
    print "Result was $result\n";              # this line takes 3 seconds to execute

With the bg_eval function, you can perform other tasks while waiting for the results of another task to be available.

    $result = bg_eval { sleep 5; return [1,2,3] };
    do_something_that_takes_about_5_seconds();
    print "Result was @$result\n";              # now this line probably runs immediately

The background process is spawned with the Forks::Super::fork call, and will block, fail, or defer a job in accordance with all the other rules of this module. Additional options may be passed to bg_eval that will be provided to the fork call. Most valid options to the fork call are also valid for the bg_eval call, including timeouts, delays, job dependencies, names, and callbacks. This example will populate $result with the value undef if the bg_eval operation takes longer than 20 seconds.

    # run task in background, but timeout after 20 seconds
    $result = bg_eval {
        download_from_teh_Internet($url, @options)
    } { timeout => 20, os_priority => 3 };
    do_something_else();
    if (!defined($result)) {
        # operation probably timed out ...
    } else {
        # operation probably succeeded, use $result
    }

An additional option that is recognized by bg_eval (and "bg_qx", see below) is untaint. If you are running perl in "taint" mode, the value(s) returned by bg_eval and bg_qx are likely to be "tainted". By passing the untaint option (assigned to a true value), the values returned by bg_eval and bg_qx will be taint clean.

Calls to bg_eval (and "bg_qx") will populate the variables $Forks::Super::LAST_JOB and $Forks::Super::LAST_JOB_ID with the Forks::Super::Job object and the job id, respectively, for the job created by the bg_eval/bg_qx call. See "LAST_JOB" in MODULE VARIABLES below.

See also: "bg_qx".

bg_qx

$result = bg_qx $command
$result = bg_qx $command, { option => value, ... }

API change since v0.43: In scalar context, the result is now an overloaded object that retrieves its value when it is used in an expression. It is no longer a scalar reference that retrieves its value when it is dereferenced.

API change since v0.52: list context is no longer directly supported. To receive output of a background process in list context, see Forks::Super::bg_qx tied class.

Launches an external program and returns immediately. Execution of the command continues in a background process. When the command completes, interprocess communication copies the output of the command into the result (left hand side) variable. If the result variable is referenced again before the background process is complete, the program will wait until the background process completes. A job that fails or otherwise produces no output will return the empty string ("").

Think of this command as a background version of Perl's backticks or qx() function (albeit one that can only work in scalar context).

The background job will be spawned with the Forks::Super::fork call, and the command can block, fail, or defer a background job in accordance with all of the other rules of this module. Additional options may be passed to bg_qx that will be provided to the fork call. For example,

    $result = bg_qx "nslookup joe.schmoe.com", { timeout => 15 }

will run nslookup in a background process for up to 15 seconds. The next time $result is referenced in the program, it will contain all of the output produced by the process up until the time it was terminated. Most valid options for the fork call are also valid options for bg_qx, including timeouts, delays, job dependencies, names, and callbacks. The only invalid options for bg_qx are "cmd", "sub", "exec", and "child_fh".

Like "bg_eval", a call to bg_qx will populate the variables $Forks::Super::LAST_JOB and $Forks::Super::LAST_JOB_ID with the Forks::Super::Job object and the job id, respectively, for the job created by the bg_qx call. See "LAST_JOB" under MODULE VARIABLES below.

See also: "bg_eval".

Forks::Super::bg_eval tied class

Forks::Super::bg_qx tied class

tie $result, 'Forks::Super::bg_eval', sub { CODE }, \%options
tie @result, 'Forks::Super::bg_eval', sub { CODE }, \%options
tie %result, 'Forks::Super::bg_eval', sub { CODE }, \%options
tie $output, 'Forks::Super::bg_qx', $command, \%options
tie @output, 'Forks::Super::bg_qx', $command, \%options
tie %output, 'Forks::Super::bg_qx', $command, \%options

Alternative calls to "bg_eval" and "bg_qx" functions that also work in list context.

Instead of calling

    my $result = long_running_function($arg1, $arg2);
    my @output = qx(some long running command);
    my %hash = long_running_function_that_returns_hash();

you could say

    tie my $result,'Forks::Super::bg_eval',sub{long_running_function($arg1,$arg2)};
    tie my @output,'Forks::Super::bg_qx',qq[some long running command];
    tie my %hash,'Forks::Super::bg_eval',sub{long_running_function_that_returns_hash()};

The result of each of these expressions is to tie a variable to the result of a background process. Like bg_qx and bg_eval, these expressions spawn a background process and return immediately. Also like bg_qx and bg_eval, the module retrieves the results of the background operation the next time the tied variables are evaluated, waiting for the background process to finish if necessary.

Like other bg_qx and bg_eval calls, these expressions respect most of the additional options that you can pass to Forks::Super::fork.

    tie my @output, 'Forks::Super::bg_qx', "ssh me\@remotehost who", { timeout => 10 };
    tie my %result, 'Forks::Super::bg_eval', \&my_function, { cpu_affinity => 0x2 };

Note: the constants BG_QX and BG_EVAL are exported by default, and provide a convenient shorthand for "Forks::Super::bg_qx" and "Forks::Super::bg_eval", respectively. So you could rewrite the previous two expressions as

    tie my @output, BG_QX, "ssh me\@remotehost who", { timeout => 10 };
    tie my %result, BG_EVAL, \&my_function, { cpu_affinity => 0x2 };

Miscellaneous functions

pause

Forks::Super::pause($delay)

A productive drop-in replacement for the Perl sleep system call (or Time::HiRes::sleep, if available). On systems like Windows that lack a proper method for handling SIGCHLD events, the Forks::Super::pause method will occasionally reap child processes that have completed and attempt to dispatch jobs on the queue.

On other systems, using Forks::Super::pause is less vulnerable than sleep to interruptions from this module (see "BUGS AND LIMITATIONS" below).

Obtaining job information

Forks::Super::Job::get

$job = Forks::Super::Job::get($pid)

Returns a Forks::Super::Job object associated with process ID or job ID $pid. See Forks::Super::Job for information about the methods and attributes of these objects.

This subroutine is mainly redundant since v0.41, where the default return value of fork is an overloaded Forks::Super::Job object instead of a simple scalar process id.

Forks::Super::Job::getByName

@jobs = Forks::Super::Job::getByName($name)

Returns zero or more Forks::Super::Job objects with the specified job name. A job has a name if a "name" parameter was provided in the Forks::Super::fork call.

state

$state = Forks::Super::state($pid)

Returns the state of the job specified by the given process ID, job ID, or job name. See "state" in Forks::Super::Job.

status

$status = Forks::Super::status($pid)

Returns the exit status of a completed child process represented by process ID, job ID, or name attribute. Aside from being a permanent store of the exit status of a job, using this method might be a more reliable indicator of a job's status than checking $? after a "wait" or "waitpid" call, because it is possible for this module's SIGCHLD handler to temporarily corrupt the $? value while it is checking for deceased processes.

MODULE VARIABLES ^

Module variables may be initialized on the use Forks::Super line

    # set max simultaneous procs to 5, allow children to call CORE::fork()
    use Forks::Super MAX_PROC => 5, CHILD_FORK_OK => -1;

or they may be set explicitly in the code:

    $Forks::Super::ON_BUSY = 'queue';
    $Forks::Super::IPC_DIR = "/home/joe/temp-ipc-files";

Module variables that may be of interest include:

MAX_PROC

$Forks::Super::MAX_PROC

The maximum number of simultaneous background processes that can be spawned by Forks::Super. If a fork call is attempted while there are already at least this many active background processes, the behavior of the fork call will be determined by the value in $Forks::Super::ON_BUSY or by the "on_busy" option passed to the fork call.

This value will be ignored during a fork call if the "force" option is passed to fork with a non-zero value. The value might also not be respected if the user supplies a code reference in the "can_launch" option and the user-supplied code does not test whether there are already too many active processes.

ON_BUSY

$Forks::Super::ON_BUSY = 'block' | 'fail' | 'queue'

Determines behavior of a fork call when the system is too busy to create another background process.

If this value is set to block, then fork will wait until the system is no longer too busy and then launch the background process. The return value will be a normal process ID value (assuming there was no system error in creating a new process).

If the value is set to fail, the fork call will return immediately without launching the background process. The return value will be -1. A Forks::Super::Job object will not be created.

If the value is set to queue, then the fork call will create a "deferred" job that will be queued and run at a later time. Also see the "queue_priority" option to fork to set the urgency level of a job in case it is deferred. The return value will be a large and negative job ID.

This value will be ignored in favor of an "on_busy" option supplied to the fork call.
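
The interaction between MAX_PROC, ON_BUSY, and the per-call options can be summarized in a small model. This is a sketch of the rules as described above, not the module's implementation; busy_action is a hypothetical name, and the model ignores the "can_launch" option.

```perl
use strict;
use warnings;

# A simplified model of the decision described above; it is not the
# module's implementation. %opt mirrors the per-call "force" and
# "on_busy" options, and $global_on_busy plays the role of
# $Forks::Super::ON_BUSY.
sub busy_action {
    my ($active, $max_proc, $global_on_busy, %opt) = @_;
    return 'launch' if $opt{force};               # "force" bypasses MAX_PROC
    return 'launch' if $active < $max_proc;       # not busy
    return $opt{on_busy} || $global_on_busy;      # per-call option wins
}

print busy_action(3, 5, 'queue'), "\n";                      # launch
print busy_action(5, 5, 'queue'), "\n";                      # queue
print busy_action(5, 5, 'queue', on_busy => 'fail'), "\n";   # fail
print busy_action(5, 5, 'fail',  force => 1), "\n";          # launch
```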

CHILD_FORK_OK

$Forks::Super::CHILD_FORK_OK = -1 | 0 | +1

Spawning a child process from another child process with this module has its pitfalls, and this capability is disabled by default: you will get a warning message and the fork() call will fail if you try it.

To override this behavior, set $Forks::Super::CHILD_FORK_OK to a non-zero value. Setting it to a positive value will allow you to use all the functionality of this module from a child process (with the obvious caveat that you cannot wait on the child process of a child process from the main process).

Setting $Forks::Super::CHILD_FORK_OK to a negative value will disable the functionality of this module in child processes but will reenable the Perl builtin fork() system call.

Note that this module will not have any preconceptions about which is the "parent process" until the first call to Forks::Super::fork. This means it is possible to use Forks::Super functionality in processes that were not spawned by Forks::Super, say, by an explicit CORE::fork() call:

     1: use Forks::Super;
     2: $Forks::Super::CHILD_FORK_OK = 0;
     3:
     4: $child1 = CORE::fork();
     5: if ($child1 == 0) {
     6:    # OK -- child1 is still a valid "parent process"
     7:    $grandchild1 = Forks::Super::fork { ... };
     8:    ...;
     9:    exit;
    10: }
    11: $child2 = Forks::Super::fork();
    12: if ($child2 == 0) {
    13:    # NOT OK - parent of child2 is now "the parent"
    14:    $grandchild2 = Forks::Super::fork { ... };
    15:    ...;
    16:    exit;
    17: }
    18: $child3 = CORE::fork();
    19: if ($child3 == 0) {
    20:    # NOT OK - call in line 11 made parent of child3 "the parent"
    21:    $grandchild3 = Forks::Super::fork { ... };
    22:    ...;
    23:    exit;
    24: }

More specifically, this means it is OK to use the Forks::Super module in a daemon process:

    use Forks::Super;
    $Forks::Super::CHILD_FORK_OK = 0;
    CORE::fork() && exit;
    $daemon_child = Forks::Super::fork();   # ok
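
As a rough illustration of the positive setting, the sketch below lets a child process launch and wait on its own child. It assumes only the fork { sub => ... } and waitpid forms shown elsewhere in this document; the exit code 3 is an arbitrary value chosen for the demonstration.

```perl
use strict;
use warnings;
use Forks::Super;

# Allow child processes to use Forks::Super themselves.
$Forks::Super::CHILD_FORK_OK = 1;

my $child = fork {
    sub => sub {
        # Inside the child: with CHILD_FORK_OK > 0 this nested fork is allowed.
        my $grandchild = fork { sub => sub { exit 3 } };
        waitpid $grandchild, 0;     # the child waits on its own child
        exit( $? >> 8 );            # pass the grandchild's exit code upward
    },
};

# The main process can wait on the child, but not on the grandchild.
waitpid $child, 0;
my $status = $? >> 8;
print "child exited with status $status\n";
```

With the default setting of 0, the nested fork call would instead fail with a warning, as described above.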

DEBUG

$Forks::Super::DEBUG, Forks::Super::DEBUG

To see the internal workings of the Forks::Super module, set $Forks::Super::DEBUG to a non-zero value. Information messages will be written to the Forks::Super::Debug::DEBUG_FH filehandle. By default Forks::Super::Debug::DEBUG_FH is aliased to STDERR, but it may be reset by the module user at any time.

Debugging behavior may be overridden for specific jobs if the "debug" or "undebug" option is provided to fork.

%CHILD_STDxxx

%Forks::Super::CHILD_STDIN
%Forks::Super::CHILD_STDOUT
%Forks::Super::CHILD_STDERR

Deprecated. See Note, below.

In jobs that request access to the child process filehandles, these hash arrays contain filehandles to the standard input and output streams of the child. The filehandles for particular jobs may be looked up in these tables by process ID or job ID for jobs that were deferred.

Remember that from the perspective of the parent process, $Forks::Super::CHILD_STDIN{$pid} is an output filehandle (what you print to this filehandle can be read in the child's STDIN), and $Forks::Super::CHILD_STDOUT{$pid} and $Forks::Super::CHILD_STDERR{$pid} are input filehandles (for reading what the child wrote to STDOUT and STDERR).

As with any asynchronous communication scheme, you should be aware of how to clear the EOF condition on filehandles that are being simultaneously written to and read from by different processes. A construction like this works on most systems:

    # in parent, reading STDOUT of a child
    my $fh = $Forks::Super::CHILD_STDOUT{$pid};
    for (;;) {
        while (my $line = <$fh>) {
          print "Child $pid said: $line";
        }

        # EOF reached, but child may write more to filehandle later.
        sleep 1;
        seek $fh, 0, 1;    # seek to the current position to clear EOF
    }
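
The effect of the seek call can be tried in isolation with core Perl only. The following is a minimal sketch, not code from this distribution: a temporary file stands in for a file-based IPC channel, and the $writer and $reader handles play the roles of the child and the parent.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# One handle writes to a temporary file, a second handle reads from it.
my ($writer, $path) = tempfile(UNLINK => 1);
$writer->autoflush(1);
open my $reader, '<', $path or die "cannot open $path: $!";

my @seen;

print {$writer} "first\n";
while (my $line = <$reader>) { push @seen, $line }   # reads "first", hits EOF

print {$writer} "second\n";                          # more data arrives later

# Seek to the current position (offset 0, whence 1) to clear the EOF flag.
seek $reader, 0, 1;
while (my $line = <$reader>) { push @seen, $line }   # now reads "second"

print "read: $_" for @seen;
```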

The Forks::Super::Job object provides the methods write_stdin(@msg), read_stdout(\%options), and read_stderr(\%options) for object-oriented read and write operations to and from a child's IPC filehandles. These methods can adjust their behavior based on the type of IPC channel (file, socket, or pipe) or other idiosyncrasies of your operating system (#@$%^&*! Windows), so using those methods is preferred to using the filehandles directly.

Note that handles for background process IPC are also available through the Forks::Super::Job object (the return value from Forks::Super::fork), in

    $pid->{child_stdin}
    $pid->{child_stdout}
    $pid->{child_stderr}

This usage should be preferred to $CHILD_STDxxx{...}.
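
A minimal sketch of the object-oriented interface follows. It assumes that the job was launched with a child_fh option requesting the child's input and output handles and that the job object offers a close_fh method; neither is described in this section, so check both against the fork options and the Forks::Super::Job documentation before relying on them.

```perl
use strict;
use warnings;
use Forks::Super;

# child_fh => 'in,out' is an assumed option value that requests handles
# to the child's STDIN and STDOUT.
my $job = fork {
    child_fh => 'in,out',
    sub      => sub {
        while (my $line = <STDIN>) { print uc $line }
    },
};

$job->write_stdin("hello\n", "world\n");
$job->close_fh('stdin');            # assumed method: tells the child input is done
waitpid $job, 0;

my @lines = $job->read_stdout();    # list context: every line available
print "child said: $_" for @lines;
```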

ALL_JOBS

@Forks::Super::ALL_JOBS
%Forks::Super::ALL_JOBS

List of all Forks::Super::Job objects that were created from fork() calls, including deferred and failed jobs. Both process IDs and job IDs (for jobs that were deferred at one time) can be used to look up Job objects in the %Forks::Super::ALL_JOBS table.
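
For instance, a program might walk the list after its jobs have finished. This sketch assumes that Job objects expose pid and state fields as hash entries (see Forks::Super::Job); the three short-lived jobs exist only for the demonstration.

```perl
use strict;
use warnings;
use Forks::Super;

# Launch a few short jobs and wait for all of them.
my @pids;
for my $n (1 .. 3) {
    push @pids, fork { sub => sub { exit 0 } };
}
waitall();

# Iterate over every job the module knows about.
for my $job (@Forks::Super::ALL_JOBS) {
    print "job $job->{pid} is in state $job->{state}\n";   # assumed field names
}

# Look up one particular job by its process ID.
my $first = $Forks::Super::ALL_JOBS{ $pids[0] };
```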

IPC_DIR

$Forks::Super::IPC_DIR

A directory where temporary files to be shared among processes for interprocess communication (IPC) can be created. If not specified, Forks::Super will try to guess a suitable directory for these files, such as an OS-appropriate temporary directory or your home directory.

$Forks::Super::IPC_DIR is a tied variable and an assignment to it will fail if the RHS is not suitable for use as a temporary IPC file store.

Forks::Super will look for the environment variable IPC_DIR and for an IPC_DIR parameter on module import (that is,

    use Forks::Super IPC_DIR => '/some/directory'

) for suggestions about where to store the IPC files.

Setting this value to "undef" (the string literal "undef", not the Perl special value undef) will disable file-based interprocess communication for your program. The module will fall back to using sockets or pipes (probably sockets) for all IPC. Some features of this distribution may not work or may not work properly if file-based IPC is disabled.

QUEUE_INTERRUPT

$Forks::Super::QUEUE_INTERRUPT

On systems with mostly-working signal frameworks, this module installs a signal handler the first time that a task is deferred. The signal that is trapped is defined in the variable $Forks::Super::QUEUE_INTERRUPT. The default value is USR1, and it may be overridden directly or set on module import:

    use Forks::Super QUEUE_INTERRUPT => 'TERM';
    $Forks::Super::QUEUE_INTERRUPT = 'USR2';

You would only worry about resetting this variable if you (including other modules that you import) are making use of an existing SIGUSR1 handler.

Since v0.40 this variable is generally not used unless

1. your system has a POSIX-y signal framework, and

2. Time::HiRes::setitimer is not implemented for your system.

TIMEOUT

Forks::Super::TIMEOUT

A possible return value from "wait" and "waitpid" functions when a timeout argument is supplied. The value indicating a timeout should not collide with any other possible value from those functions, and should be recognizable as not an actual process ID.

    my $pid = wait 10.0;  # Forks::Super::wait with timeout
    if ($pid == Forks::Super::TIMEOUT) {
        # no tasks have finished in the last 10 seconds ...
    } else {
        # task has finished, process id in $pid.
    }

LAST_JOB

$Forks::Super::LAST_JOB_ID
$Forks::Super::LAST_JOB

Calls to the "bg_eval" and "bg_qx" functions launch a background process and set the variables $Forks::Super::LAST_JOB_ID to the job's process ID and $Forks::Super::LAST_JOB to the job's Forks::Super::Job object. These functions do not explicitly return the job id, so these variables provide a convenient way to query the state of the jobs launched by these functions.

Some bash users will immediately recognize the parallels between these variables and the special bash $! variable, which captures the process id of the last job to be run in the background.
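
A short sketch of how these variables might be used after a bg_qx call. The command "echo hello" is a placeholder, the state field of the Job object is an assumption taken from Forks::Super::Job, and the value returned by bg_qx is deliberately not examined here.

```perl
use strict;
use warnings;
use Forks::Super qw(bg_qx);

# Launch a command in the background; bg_qx does not return the job id.
my $result = bg_qx "echo hello";

# Capture the job that bg_qx just launched before starting anything else.
my $job = $Forks::Super::LAST_JOB;
my $id  = $Forks::Super::LAST_JOB_ID;

print "background job $id is in state $job->{state}\n";   # assumed field name
```

Copying both variables right away matters because the next bg_eval or bg_qx call overwrites them.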

WAIT_ACTION_ON_SUSPENDED_JOBS

$Forks::Super::Wait::WAIT_ACTION_ON_SUSPENDED_JOBS

Governs the action of a call to "wait", "waitpid", or "waitall" in the case when all remaining jobs are in the SUSPENDED or DEFERRED-SUSPENDED state (see "state" in Forks::Super::Job). Allowable values for this variable are

wait

Causes the call to "wait"/"waitpid" to block indefinitely until those jobs start and one or more of them is completed. In this case it is presumed that the queue monitor is running periodically and conditions that allow those jobs to get started will occur. This is the default setting for this variable.

fail

Causes the "wait"/"waitpid" call to return with the special (negative) value Forks::Super::Wait::ONLY_SUSPENDED_JOBS_LEFT.

resume

Causes one of the suspended jobs to be resumed. It is presumed that this job will complete and allow the "wait"/"waitpid" function to return.
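
The fail setting can be sketched as follows. The suspend and resume methods on the job object are assumptions here (see Forks::Super::Job); the rest uses only the variable and the special return value named above.

```perl
use strict;
use warnings;
use Forks::Super;

# Ask wait/waitpid to return a special value instead of blocking
# when only suspended jobs remain.
$Forks::Super::Wait::WAIT_ACTION_ON_SUSPENDED_JOBS = 'fail';

my $job = fork { sub => sub { sleep 2 } };
$job->suspend;                      # assumed Forks::Super::Job method

my $pid = wait;
my $only_suspended =
    $pid == Forks::Super::Wait::ONLY_SUSPENDED_JOBS_LEFT();
print "only suspended jobs remain\n" if $only_suspended;

# Let the job run again and reap it normally.
$job->resume;                       # assumed Forks::Super::Job method
$pid = wait;
```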

EXPORTS ^

This module always exports the fork, "wait", "waitpid", and "waitall" functions, overloading the Perl system calls with the same names. Mixing Forks::Super calls with the similarly-named Perl calls is strongly discouraged, but you can access the original builtin functions at CORE::fork, CORE::wait, etc.

Functions that can be exported to the caller's package include

    Forks::Super::bg_eval
    Forks::Super::bg_qx
    Forks::Super::isValidPid
    Forks::Super::open2
    Forks::Super::open3
    Forks::Super::pause
    Forks::Super::read_stderr
    Forks::Super::read_stdout

Module variables that can be exported are:

    %Forks::Super::CHILD_STDIN
    %Forks::Super::CHILD_STDOUT
    %Forks::Super::CHILD_STDERR

The special tag :var will export all three of these hash tables to the calling namespace.

The tag :all will export all the functions and variables listed above.
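
As an illustration, an import line that combines the :var tag with one exportable function might look like the sketch below. The quarter-second pause is an arbitrary value.

```perl
use strict;
use warnings;
use Forks::Super qw(:var pause);

my $pid = fork { sub => sub { exit 0 } };

# pause was imported, so the package prefix is no longer needed.
pause(0.25);

# :var makes the three filehandle tables visible without qualification.
my $tracked = keys %CHILD_STDOUT;
print "jobs with a STDOUT handle: $tracked\n";

waitpid $pid, 0;
```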

The Forks::Super::kill function cannot be exported for now, while I think through the implications of overloading yet another Perl system call.

IMPORT CONFIG ^

Many of these settings have been mentioned in other parts of this document, but here is a summary of the configuration that can be done on the use Forks::Super ... line.

MAX_PROC => integer

Initializes $Forks::Super::MAX_PROC, which governs the maximum number of simultaneous background processes managed by this module. When a new process is requested and this limit has been reached, the fork call will fail, block (until at least one current process finishes), or queue, depending on the setting of $Forks::Super::ON_BUSY. See "MAX_PROC".

ON_BUSY => 'block' | 'fail' | 'queue'

Sets $Forks::Super::ON_BUSY, which governs the behavior of fork when the limit of simultaneous background processes has been reached. See "ON_BUSY".

CHILD_FORK_OK => -1 | 0 | 1

Sets $Forks::Super::CHILD_FORK_OK, which governs the behavior of Forks::Super::fork when called from a child process. See "CHILD_FORK_OK" in "MODULE VARIABLES".

DEBUG => boolean

Turns module debugging on and off. On the import line, this configuration overrides the value of $ENV{FORKS_SUPER_DEBUG} (see "ENVIRONMENT").

QUEUE_MONITOR_FREQ => num_seconds

Sets $Forks::Super::Queue::QUEUE_MONITOR_FREQ, which governs how frequently the main process should be interrupted to examine the queue of jobs that have not started yet. See Forks::Super::Queue.

QUEUE_INTERRUPT => signal_name

Sets $Forks::Super::QUEUE_INTERRUPT, the name of the signal used by Forks::Super to periodically examine the queue of background jobs that have not started yet. The default setting is USR1, but you should change this if you wish to use SIGUSR1 for other purposes in your program. This setting does not have any effect on MSWin32 systems.

IPC_DIR => directory, FH_DIR => directory

Use the specified directory for temporary interprocess communication files used by Forks::Super. Overrides settings of $ENV{IPC_DIR} or $ENV{FH_DIR}.

CONFIG => file, CONFIG_FILE => file

Loads module configuration out of the specified file. The file is expected to contain key-value pairs for the same parameters documented in this section. Parameter names in the configuration file are not case sensitive.

    # sample Forks::Super config file
    max_proc=10
    IPC_DIR=/home/mob/.forks-super-ipc
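
To make the file format concrete, here is a small core-Perl sketch of reading such key-value pairs with case-insensitive names. The parse_config_text helper is hypothetical and is not the parser used by Forks::Super; in particular, treating everything after "#" as a comment is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "key=value" lines into a hash with
# upper-cased keys, so that max_proc and MAX_PROC are equivalent.
sub parse_config_text {
    my ($text) = @_;
    my %config;
    for my $line (split /\n/, $text) {
        $line =~ s/#.*//;                                  # drop comments (assumption)
        next unless $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/;  # skip lines without key=value
        $config{ uc $1 } = $2;
    }
    return %config;
}

my %config = parse_config_text(<<'END');
# sample Forks::Super config file
max_proc=10
IPC_DIR=/home/mob/.forks-super-ipc
END

print "$_ => $config{$_}\n" for sort keys %config;
```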

ENVIRONMENT ^

Forks::Super makes use of the following optional variables from your environment.

FORKS_SUPER_DEBUG

If set, sets the default value of $Forks::Super::DEBUG (see "MODULE VARIABLES") to true.

FORKS_SUPER_QUEUE_DEBUG

If set and true, sends additional information about the status of the queue (see "Deferred processes") to standard output. This setting is independent of the $ENV{FORKS_SUPER_DEBUG}/$Forks::Super::DEBUG setting.

FORKS_DONT_CLEANUP

If set and true, the program will not remove the temporary files used for interprocess communication. This setting can be helpful if you want to analyze the messages that were sent between processes after the fact.

FORKS_SUPER_CONFIG

Forks::Super will probe your system for available functions, Perl modules, and external programs and try suitable workarounds when the desired feature is not available. With $ENV{FORKS_SUPER_CONFIG}, you can command Forks::Super to assume that certain features are available (or are not available) on your system. This is a little bit helpful for testing; I don't know whether it would be helpful for anything else. See the source for Forks/Super/Config.pm for more information about how $ENV{FORKS_SUPER_CONFIG} is used.

FORKS_SUPER_JOB_OVERLOAD

Specifies whether the fork call will return an overloaded Forks::Super::Job object instead of a scalar process identifier. See "OVERLOADING" in Forks::Super::Job. Since v0.41 overloading is enabled by default. If the FORKS_SUPER_JOB_OVERLOAD variable is set, it will override this default.

FORKS_SUPER_ENABLE_DUMP

If set, will invoke "enable_dump" in Forks::Super::Debug and enable a Java Virtual Machine-like feature to report the status of all the background jobs your program has created. If this variable contains the name of a signal, then that signal will be trapped by your program to produce the process dump. If the variable value is not a signal name but is a true value, then the program will produce a process dump in response to a SIGQUIT. See Forks::Super::Debug.

This feature can also be enabled on import of Forks::Super by passing an ENABLE_DUMP parameter on import, like

    use Forks::Super ENABLE_DUMP => 1;    # same as ENABLE_DUMP => 'QUIT'
    use Forks::Super ENABLE_DUMP => 'USR1';

IPC_DIR

Specifies a directory for storing temporary files for interprocess communication. See "IPC_DIR" in "MODULE VARIABLES".

DIAGNOSTICS ^

fork() not allowed in child process ...

When the package variable $Forks::Super::CHILD_FORK_OK is zero, this package does not allow the fork() method to be called from a child process. Set $Forks::Super::CHILD_FORK_OK to change this behavior.

quick timeout

A job was configured with a timeout/expiration time such that the deadline for the job occurred before the job was even launched. The job was killed immediately after it was spawned.

Job start/Job dependency <nnn> for job <nnn> is invalid. Ignoring.

A process id or job id that was specified as a "depend_on" or "depend_start" option did not correspond to a known job.

Job <nnn> reaped before parent initialization.

A child process finished quickly and was reaped by the parent process SIGCHLD handler before the parent process could even finish initializing the job state. The state of the job in the parent process might be unavailable or corrupt for a short time, but eventually it should be all right.

could not open filehandle to provide child STDIN/STDOUT/STDERR
child was not able to detect STDIN file ... Child may not have any input to read.
could not open filehandle to write child STDIN
could not open filehandle to read child STDOUT/STDERR

Initialization of filehandles for a child process failed. The child process will continue, but it will be unable to receive input from the parent through the $Forks::Super::CHILD_STDIN{pid} ($pid->{child_stdin}) filehandle, or pass output to the parent through the filehandles $Forks::Super::CHILD_STDOUT{pid} and $Forks::Super::CHILD_STDERR{pid} ($pid->{child_stdout} and $pid->{child_stderr}).

INCOMPATIBILITIES ^

This module requires its own SIGCHLD handler. Installing other SIGCHLD handlers may cause undefined behavior, though if you are used to setting

    $SIG{CHLD} = 'IGNORE'

in your code, you should still be OK.

DEPENDENCIES ^

The Win32::API module is required for Windows users.

The "bg_eval" function requires at least one of the data serialization modules Data::Dumper, YAML, YAML::Tiny, or JSON to be installed. If none of these modules are available, then using "bg_eval" will result in a fatal error.

Otherwise, there are no hard dependencies on non-core modules. Some features, especially operating-system specific functions, depend on some modules (Win32::Process and Win32 for Wintel systems, for example), but the module will compile without those modules. Attempts to use these features without the necessary modules will be silently ignored.

BUGS AND LIMITATIONS ^

Leftover temporary files and directories

In programs that use the interprocess communication features, the module will usually but not always do a good job of cleaning up after itself. You may find directories called .fhfork<nnn>, which may or may not be empty, scattered around your filesystem.

You can invoke this module as one of:

    $ perl -MForks::Super=cleanse
    $ perl -MForks::Super=cleanse,<directory>

to run a function that will clean up these directories.

Interrupted system calls

A typical script using this module will have a lot of behind-the-scenes signal handling as child processes finish and are reaped. These frequent interruptions can affect the execution of the rest of your program. For example, in this script:

    1: use Forks::Super;
    2: fork(sub => sub { sleep 2 });
    3: sleep 5;
    4: # ... program continues ...

the sleep call in line 3 is probably going to get interrupted before 5 seconds have elapsed as the end of the child process spawned in line 2 will interrupt execution and invoke the SIGCHLD handler. In some cases there are tedious workarounds:

    3a: $stop_sleeping_at = time + 5;
    3b: sleep 1 while time < $stop_sleeping_at;

In this distribution, the Forks::Super::pause call provides an interruption-resistant alternative to sleep.

    3: Forks::Super::pause(5);

The pause call itself has the limitation that it may sleep for longer than the desired time. This is because the "productive" code executed in a pause function call can take an arbitrarily long time to run.
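
The workaround in lines 3a and 3b can be generalized into a helper that sleeps for the remaining time after each interruption. The sketch below uses core Perl and the builtin fork only (Forks::Super is not loaded) and assumes a POSIX-like system where SIGCHLD is delivered; steady_sleep is a hypothetical name, not a function of this distribution.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: keep sleeping until the full duration has elapsed,
# even if signal handlers cut individual sleep calls short.
sub steady_sleep {
    my ($seconds) = @_;
    my $deadline = time() + $seconds;
    while ((my $remaining = $deadline - time()) > 0) {
        sleep $remaining;           # may return early when a signal arrives
    }
    return;
}

# Demonstration: a child that exits after one second triggers SIGCHLD
# in the middle of a two-second sleep.
my $interrupted = 0;
$SIG{CHLD} = sub { $interrupted++ };

my $child = fork();
die "fork failed: $!" unless defined $child;
if ($child == 0) {
    sleep 1;
    exit 0;
}

my $start = time();
steady_sleep(2);
my $elapsed = time() - $start;
waitpid $child, 0;

printf "slept %.2f seconds, interrupted %d time(s)\n", $elapsed, $interrupted;
```

Unlike Forks::Super::pause, this helper does no productive work while waiting; it only guarantees the minimum duration.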

Idiosyncratic behavior on some systems

The system implementation of fork'ing and wait'ing varies from platform to platform. This module has been extensively tested on Cygwin, Windows, and Linux, but less so on other systems. It is possible that some features will not work as advertised. Please report any problems you encounter to <mob@cpan.org> and I'll see what I can do about it.

Other bugs or feature requests

Feel free to report other bugs or feature requests to bug-forks-super at rt.cpan.org or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Forks-Super. This includes any cases where you think the documentation might not be keeping up with the development. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SEE ALSO ^

There are reams of other modules on CPAN for managing background processes. See Parallel::*, Proc::Parallel, Proc::Fork, Proc::Launcher. Also Win32::Job.

Inspiration for "bg_eval" function from Acme::Fork::Lazy.

AUTHOR ^

Marty O'Brien, <mob@cpan.org>

LICENSE AND COPYRIGHT ^

Copyright (c) 2009-2013, Marty O'Brien.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.

See http://dev.perl.org/licenses/ for more information.
