NAME
Proc::FastSpawn - fork+exec, or spawn, a subprocess as quickly as
possible
SYNOPSIS
use Proc::FastSpawn;
# simple use
my $pid = spawn "/bin/echo", ["echo", "hello, world"];
...
waitpid $pid, 0;
# with environment
my $pid = spawn "/bin/echo", ["echo", "hello, world"], ["PATH=/bin", "HOME=/tmp"];
# inheriting file descriptors
pipe R, W or die;
fd_inherit fileno W;
my $pid = spawn "/bin/sh", ["sh", "-c", "echo a pipe >&" . fileno W];
close W;
print <R>;
DESCRIPTION
The purpose of this small (in scope and footprint) module is simple:
spawn a subprocess asynchronously as efficiently and/or fast as
possible. Basically the same as calling fork+exec (on POSIX), but
hopefully faster than those two syscalls.
Apart from fork overhead, this module also allows you to fork+exec
programs when otherwise you couldn't - for example, when you use POSIX
threads in your perl process then it generally isn't safe to call fork
from perl, but it is safe to use this module to execute external
processes.
If neither of these are problems for you, you can safely ignore this
module.
So when is fork+exec not fast enough, how can you do it faster, and why
would it matter?
Forking a process requires making a complete copy of a process. Even
thought almost every implementation only copies page tables and not the
memory itself, this is still not free. For example, on my 3.6GHz amd64
box, I can fork a 5GB process only twenty times a second. For a
real-time process that must meet stricter deadlines, this is too slow.
For a busy and big web server, starting CGI scripts might mean
unacceptable overhead.
A workaround is to use "vfork" - this function isn't very portable, but
it avoids the memory copy that "fork" has to do. Some systems have an
optimised implementation of "spawn", and some systems have nothing.
This module tries to abstract these differences away.
As for what improvements to expect - on the 3.6GHz amd64 box that this
module was originally developed on, a 3MB perl process (basically just
perl + Proc::FastSpawn) takes 3.6s to run /bin/true 10000 times using
fork+exec, and only 2.6s when using vfork+exec. In a 22MB process, the
difference is already 5.0s vs 2.6s, and so on.
FUNCTIONS
All the following functions are currently exported by default.
$pid = spawn $path, \@argv[, \@envp]
Creates a new process and tries to make it execute $path, with the
given arguments and optionally the given environment variables,
similar to calling fork + execv, or execve.
Returns the PID of the new process if successful. On any error,
"undef" is currently returned. Failure to execution might or might
not be reported as "undef", or via a subprocess exit status of 127.
$pid = spawnp $file, \@argv[, \@envp]
Like "spawn", but searches $file in $ENV{PATH} like the shell would
do.
fd_inherit $fileno[, $on]
File descriptors can be inherited by the spawned processes or not.
This is decided on a per file descriptor basis. This module does
nothing to any preexisting handles, but with this call, you can
change the state of a single file descriptor to either be inherited
($on is true or missing) or not $on is false).
Free portability pro-tip: it seems native win32 perls ignore $^F and
set all file handles to be inherited by default - but this function
can switch it off.
PORTABILITY NOTES
On POSIX systems, this module currently calls vfork+exec, spawn, or
fork+exec, depending on the platform. If your platform has a good vfork
or spawn but is misdetected and falls back to slow fork+exec, drop me a
note.
On win32, the "_spawn" family of functions is used, and the module tries
hard to patch the new process into perl's internal pid table, so the pid
returned should work with other Perl functions such as waitpid. Also,
win32 doesn't have a meaningful way to quote arguments containing
"special" characters, so this module tries it's best to quote those
strings itself. Other typical platform limitations (such as being able
to only have 64 or so subprocesses) are not worked around.
AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/