NAME

HTTP::Server::Multiplex - single process multi serve HTTP daemon

SYNOPSIS

  # see examples directory in the distribution!
  use HTTP::Server::Multiplex;
  my $daemon = HTTP::Server::Multiplex->new
    ( daemon     => {}
    , connection => {}
    , vhosts     => {}
    );

  $daemon->addVirtualHost('My::VHost');

  $daemon->run;

DESCRIPTION

This full stand-alone HTTP daemon serves all requests from a single process. See "DETAILS" for the background, advantages and drawbacks of this approach. You should read that section at least once before you start.

This is the first release: please test and report bugs and wish-lists!

METHODS

$obj->_configDaemon(OPTIONS)

 Option   --Default
 detach     <false>
 group      $EGID
 pid_file   <undef>
 server_id  <shows hostname and some versions>
 user       $ENV{USER} or $EUID

. detach => BOOLEAN

When true, the daemon detaches from the terminal and runs in the background. STDIN, STDOUT and STDERR are closed; errors go to syslog via a Log::Report dispatcher.

. group => STRING

The group(s) under which the daemon will be run.

. pid_file => FILENAME

. server_id => STRING

. user => STRING

The username under which the daemon will be run.

If you are the super-user and want this daemon to run as such (which is dangerous!), you must explicitly provide the super-user's username.
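
As a rough illustration, the options above could be collected in a hash like the one below. Only the option names come from the table; the account names and the PID file path are made-up placeholders, not defaults of this module.

```perl
use strict;
use warnings;

# Hypothetical values; only the option names are taken from the table above.
my %daemon_options =
  ( detach   => 1                               # run in the background
  , user     => 'www-data'                      # assumed unprivileged account
  , group    => 'www-data'                      # assumed matching group
  , pid_file => '/var/run/httpd-multiplex.pid'  # assumed location
  );

# Show what would be handed to new() as the daemon configuration.
print join(', ', map { "$_=$daemon_options{$_}" } sort keys %daemon_options), "\n";
```

Running as an unprivileged account like this is the safer choice; naming the super-user here is what the warning above is about.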

$obj->_configNetwork(SOCKET|HASH-of-OPTIONS)

Set up a listener which waits for new connections to arrive. This is an internal helper routine for new().

 Option--Default
 host    <all interfaces>
 listen  SO_MAXCONN
 port    80

. host => STRING

The host from which the interface IP addresses are taken.

. listen => INTEGER

. port => INTEGER
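
Because a ready-made SOCKET is accepted as well, a listener can also be prepared by hand. The sketch below uses only core modules and is not a copy of what _configNetwork() does internally; the loopback address and the ephemeral port 0 are assumptions chosen to keep the example runnable without root privileges.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Socket qw(SOMAXCONN);

# Build a listening TCP socket by hand. Port 0 asks the OS for any free
# port; a real web server would typically listen on port 80.
my $listener = IO::Socket::INET->new
  ( LocalAddr => '127.0.0.1'   # assumed: bind to loopback only
  , LocalPort => 0
  , Proto     => 'tcp'
  , Listen    => SOMAXCONN     # same backlog as the documented default
  , ReuseAddr => 1
  ) or die "cannot create listener: $!";

printf "listening on %s:%d\n", $listener->sockhost, $listener->sockport;
```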

HTTP::Server::Multiplex->new(OPTIONS)

 Option    --Default
 connection  <required>
 deamon      <required>
 vhosts      <HTTP::Server::VirtualHost::Default>

. connection => SOCKET|HASH-of-OPTIONS|ARRAY

For OPTIONS, see _configNetwork(). You can provide an ARRAY of SOCKETs or socket configuration OPTIONS, which will be handled in parallel.

. deamon => HASH-of-OPTIONS

For OPTIONS, see _configDaemon().

. vhosts => VHOST|HASH-of-OPTIONS|PACKAGE|ARRAY

For OPTIONS, see addVirtualHost(). Provide one or an ARRAY of virtual host configurations, either by HTTP::Server::VirtualHost objects or by the OPTIONS to create such objects.

Accessors

$obj->mux

Returns the core multiplexer object, an IO::Multiplex.

Daemon control

$obj->run

Start the daemon.

Virtual host administration

$obj->addVirtualHost(VHOST|PACKAGE|HASH-of-OPTIONS|OPTIONS)

Registers a new virtual host with the daemon. Can be used at run-time. See HTTP::Server::VirtualHost::new() for OPTIONS. The added virtual host object is returned.

$obj->removeVirtualHost(VHOST|NAME|ALIAS)

Remove all name and alias registrations for the indicated virtual host. Silently ignores non-existing vhosts. The removed virtual host object is returned.

$obj->virtualHost(NAME)
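
The name and alias bookkeeping described for these methods can be pictured with a toy registry. This is a self-contained sketch, not the module's code: the functions add_vhost, find_vhost and remove_vhost and the record layout (name, aliases) are hypothetical.

```perl
use strict;
use warnings;

# Toy registry: every name and alias points to the same vhost record.
my %registry;

sub add_vhost {
    my ($vhost) = @_;
    $registry{$_} = $vhost for $vhost->{name}, @{ $vhost->{aliases} || [] };
    return $vhost;
}

sub find_vhost { $registry{ $_[0] } }

sub remove_vhost {
    my ($name) = @_;
    my $vhost = $registry{$name} or return;   # unknown vhost: ignore silently
    # Drop every registration (name and aliases) that refers to this record.
    delete $registry{$_} for grep { $registry{$_} == $vhost } keys %registry;
    return $vhost;
}

my $site = add_vhost({ name => 'www.example.com', aliases => ['example.com'] });
remove_vhost('example.com');   # looked up by alias, removes both entries
print defined find_vhost('www.example.com') ? "still registered\n" : "gone\n";
```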

DETAILS

HTTP::Daemon/mod_perl differ from this approach

The LWP network library is very solid and provides a full HTTP/1.1 daemon named HTTP::Daemon. The logic of that daemon was used to create the code for this module. Both LWP and Apache's mod_perl start many processes, one for each requesting client. What we do here is to have only one process serve many clients at the same time.

HTTP::Daemon and mod_perl are based on processes and threads to handle requests in parallel. The advantage is that disturbances and delays in handling one client's request do not bother the other processes, and thus the other clients. The disadvantage is that it is quite hard to share user session information and to do caching. Solutions are found in databases and the memcached daemon to share data, and in locking to synchronize.

This HTTP::Server::Multiplex module uses only one process to handle all requests, by serving many client connections together. The basis is laid in IO::Multiplex, which is a smart select(2) call (you may need to read up on that Operating System feature). The single process spends all its time handling one request, until IO has to be done. While waiting for that IO to happen, it handles available requests from other connections.
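
The select(2) idea can be tried with core modules alone. The following is a minimal sketch, assuming two socket pairs stand in for client connections; it uses IO::Select directly rather than IO::Multiplex, and the client names are invented.

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

# Two socket pairs stand in for two client connections.
my %label;                       # server-side handle => client name
my @clients;
my $select = IO::Select->new;
for my $name (qw(alice bob)) {
    socketpair(my $server_side, my $client_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    $label{$server_side} = $name;
    $select->add($server_side);
    push @clients, $client_side;
}

# The "clients" send one line each and close their end.
for my $client (@clients) {
    syswrite $client, "hello\n";
    close $client;
}

# One process serves all connections: block in select until some handle
# is readable, then deal with exactly that handle.
my @log;
while ($select->count) {
    for my $fh ($select->can_read) {
        my $bytes = sysread $fh, my $buffer, 4096;
        if ($bytes) {
            chomp $buffer;
            push @log, "$label{$fh}: $buffer";
        }
        else {                   # EOF or error: stop watching this handle
            $select->remove($fh);
            close $fh;
        }
    }
}
print "$_\n" for sort @log;
```

The process only sleeps inside can_read(), so a slow client never delays a client whose data has already arrived.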

Advantages of this approach: no heavy forking, no complex synchronizing between processes and very simple caching. Most importantly: the whole code is a critical section, which avoids the need for locking. Very fast and very simple, in concept.

Disadvantages are also plentiful: you have to take care that all IO goes through the IO::Multiplex select loop, and any wait (for reading files, acquiring locks, database output, hostname lookups, sleep) blocks all other clients. Any bug in your software can break the daemon, and with it the whole service.

Features

The following common needs for HTTP servers are implemented:

HTTP/1.1 (and 1.0 and older)

Multiple requests can use one connection. Requests can arrive while earlier ones are still being processed; they queue up and are processed in order.
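
To picture the queuing, here is a small sketch of how complete requests might be split off a connection's input buffer. It is an illustration only: take_requests is a hypothetical helper, and message bodies are ignored to keep it short.

```perl
use strict;
use warnings;

# Split complete header blocks off the front of the input buffer.
# Whatever has not fully arrived yet stays in the buffer.
sub take_requests {
    my ($buffer_ref) = @_;
    my @requests;
    while ($$buffer_ref =~ s/\A(.*?)\r\n\r\n//s) {
        push @requests, $1;
    }
    return @requests;
}

my $input = "GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n"
          . "GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n"
          . "GET /c HTT";        # third request has not fully arrived yet

my @queue = take_requests(\$input);
print scalar(@queue), " complete requests, ", length($input), " bytes pending\n";
```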

Virtual Hosts

Configure multiple websites, handled by this single daemon. See HTTP::Server::VirtualHost. Each website (virtual host) has its own directories where the information is taken from, each with a set of access restrictions (allow/deny), rewrite rules, location, etc. See HTTP::Server::Directory.

Critical section

With forking servers, it is difficult to synchronize between threads and processes: you have to lock or use databases to create critical sections. In this implementation, the whole program runs in one big critical section, unless you do IO. Of course, each processing step should take little time, to avoid long response delays for other connections.

Asynchronous processing

In case your task does difficult I/O or long computation, you can start a separate process with HTTP::Server::Connection::async(). Be aware that this comes with a penalty.
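
The general pattern behind handing work to a separate process can be sketched with fork and a pipe. This is not the async() method of this distribution, whose interface is not shown here; start_job is a hypothetical helper and the summing job is a stand-in for real work.

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();

# Run a slow job in a child process and collect its result through a pipe.
sub start_job {
    my ($job) = @_;
    pipe(my $reader, my $writer) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                  # child: do the work, report, leave
        close $reader;
        print {$writer} $job->();
        close $writer;                # flushes the buffered result
        POSIX::_exit(0);              # skip the parent's cleanup handlers
    }
    close $writer;                    # parent keeps only the read end
    return ($pid, $reader);
}

my ($pid, $reader) = start_job(sub { my $sum = 0; $sum += $_ for 1 .. 1_000_000; $sum });

# A daemon would add $reader to its select loop; here we only wait for it.
my $select = IO::Select->new($reader);
my $result = '';
while ($select->count) {
    for my $fh ($select->can_read) {
        my $bytes = sysread $fh, my $chunk, 4096;
        if ($bytes) { $result .= $chunk }
        else        { $select->remove($fh); close $fh }
    }
}
waitpid $pid, 0;
print "child computed $result\n";
```

The price is the fork itself plus the copying of the result through the pipe, and the child no longer shares the parent's in-memory state.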

Sessions

The same user can have multiple connections to the daemon, which use a single session definition. This way, the cache of the user's information cannot get out-of-sync. See HTTP::Server::Session.

Missing features

CGI

Asynchronous execution of external scripts, according to the CGI protocol. This is not too hard to implement; it just takes some time.

Microsoft Windows Limitations

Microsoft's POSIX implementation does support the select() call, which is the basis of IO::Multiplex, used as the event loop in the daemon. However, that implementation does not support FILE and PIPE handles in select() the way UNIX systems do: it only supports sockets. As a result, the readFile, writeFile and async methods of HTTP::Server::Connection will (probably) not work. Windows users: please explain to me how to implement work-arounds.

General Misconceptions

People apparently think that executing requests in parallel processes is faster than executing all requests in one process. This cannot be true (on a single-processor system): processing takes computational effort and there is only one processor to run these tasks. Parallel processes do not run in parallel, but interleaved. The same amount of work has to be done, so it takes a comparable total time.

The most important difference between the one-process and the parallel implementation is that the latter can sleep much more easily: waiting for disk, interrupts and such in one parallel thread does not hold up the other threads. In the one-process implementation, we need to code just like graphical interfaces work: in small fragments, triggered by (file IO) events.

When you implement your "CGI" logic within the virtual host framework of this daemon, your code gets compiled before the daemon starts accepting requests. The same is true, of course, for mod_perl. Once running, it is quite hard to fill a whole time-slice of the processor: the OS interleaves parallel processes with rather long time-spans of (usually) around one tenth of a second. You can do an enormous amount of work in such a long time-slice. Most tasks, however, will not consume the whole span, because they have to wait for more data from disk or for some event, or because they are completed. The conclusion is that an IO-driven approach rarely gives slower "interactive" response to clients than a parallel implementation. At least in theory, when implemented correctly.

SEE ALSO

This module is part of HTTP-Server-Multiplex distribution version 0.11, built on October 01, 2008. Website: http://perl.overmeer.net/httpd-multiplex/

LICENSE

Copyrights 2008 by Mark Overmeer. For other contributors see ChangeLog.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html