Implementation of the archive feature in Net::FTPServer
-------------------------------------------------------
Richard Jones, 14th Sept 2001.

1. Background
-------------

Suppose the VFS contains a file and a directory called:

drwxr-xr-x    4 rich     users        4096 Aug 26 14:05 dir
-rw-r--r--    1 rich     users       28618 Sep  1 11:46 file

When the administrator enables archive mode, users may request
any of:

RETR file		# Get the file in the normal way
RETR file.gz		# Get the file gzip compressed
RETR file.bz2		# Get the file bzip2 compressed
RETR file.uue		# Get the file uuencoded
RETR dir.tar		# Get the contents of the directory, in a tar archive
RETR dir.zip		# Ditto, as a DOS ZIP file.
RETR dir.tar.gz		# Ditto, as a gzip-compressed tar archive
&c.

The implementation, described below, is transparent to the
underlying VFS (and hence works correctly with database-backed
filesystems and so on).

2. Filters and generators
-------------------------

A _filter_ is a compression or encoding program which can be applied
to a file or a simple archive. Thus when "file.gz" is requested, a
single filter (the external "gzip" program) is applied to the file.
Most filters are implemented using external programs, which allows
Net::FTPServer to support a wide range of compression formats and
encodings (e.g. uuencode).

A _generator_ is a piece of code which creates an archive from a
directory. Thus when "dir.tar" is requested, the tar generator is
invoked, which recurses over the directory structure and creates a tar
file. Zip is also a generator. Generators only work on directories and
are generally implemented using code internal to the FTP server.
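
As an illustration of the generator idea, here is a minimal sketch in
Python (Net::FTPServer itself is written in Perl). The function name
tar_generator is hypothetical. For simplicity the sketch reads from the
local filesystem and buffers the whole archive in memory, whereas a
real generator would have to read through the VFS.

```python
import io
import os
import tarfile


def tar_generator(directory):
    """Recurse over `directory` and return a file-like object holding a tar archive."""
    buffer = io.BytesIO()
    # Name the top-level archive entry after the directory itself.
    top = os.path.basename(os.path.normpath(directory))
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # TarFile.add() descends into subdirectories by default.
        archive.add(directory, arcname=top)
    buffer.seek(0)  # rewind so the caller can read it like an ordinary file
    return buffer
```

The caller can treat the returned object like an open file, which is
the property that lets the rest of a transfer proceed unchanged.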

Up to one generator and zero or more filters may be used during any
download.  Thus "RETR dir.tar.gz" invokes both the tar generator and
the gzip filter. "RETR dir.tar" invokes the tar generator and no
filters. "RETR dir.tar.gz.uue" invokes the tar generator and two
filters.
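
The examples above can be reproduced with a purely lexical split of the
requested name. This is a sketch in Python, not the server's own code;
the extension tables and the name split_request are assumptions made
for illustration, and the sketch does not check whether anything
exists in the VFS.

```python
# Hypothetical extension tables; the real server's configuration may differ.
FILTER_EXTENSIONS = {"gz", "bz2", "uue"}
GENERATOR_EXTENSIONS = {"tar", "zip"}


def split_request(name):
    """Split a requested name into (base, generator, filters).

    Filters are listed in the order they are applied, so the
    rightmost extension comes last.
    """
    parts = name.split(".")
    filters = []
    # Peel filter extensions off the right-hand end.
    while len(parts) > 1 and parts[-1] in FILTER_EXTENSIONS:
        filters.insert(0, parts.pop())
    # At most one generator extension may remain.
    generator = None
    if len(parts) > 1 and parts[-1] in GENERATOR_EXTENSIONS:
        generator = parts.pop()
    return ".".join(parts), generator, filters


print(split_request("dir.tar.gz.uue"))  # ('dir', 'tar', ['gz', 'uue'])
```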

3. Modifications to RETR
------------------------

The implementation of the RETR command has been modified substantially
to make archive mode work. The command now performs the following
steps:

* Attempt to find the requested filename.

* If the requested filename is not found, see if the name ends in a
  filter extension, and if so, remove filter extensions one at a time
  until either a file is found on disk matching the shortened name,
  or else the shortened name is the name of a directory found on disk
  plus a valid generator extension.

    For example: RETR dir.tar.gz is requested. "dir.tar.gz" doesn't
    exist. "gz" is a filter extension. "dir.tar" doesn't exist as a
    file, but "dir" exists as a directory and "tar" is a valid
    generator extension.

* Open the data socket back to the client.

* For each filter extension (zero or more may have been found in the
  second step), invoke the corresponding external filter program,
  dupping the socket file descriptor.

    After this step, we end up with a chain of external filter programs
    like this:

    "$sock" in          +--------+       +----------+       socket
    _RETR_command  ==>  |  gzip  |  ==>  | uuencode |  -->  to client
                        +--------+       +----------+

    The ==> arrows are Unix pipes. --> is an AF_INET socket.

* If the source is a simple file, then begin the transfer. However, if
  the source is a directory plus a generator, then the generator is
  invoked and returns a fake $io object, which is used instead of a
  file. Other than that, the transfer continues as normal.
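
The steps above can be sketched roughly as follows. This is an
illustration in Python under stated assumptions, not the Perl code in
Net::FTPServer: the names resolve and send_through_filters, the FILTERS
and GENERATORS tables, and the use of plain sets in place of VFS
lookups are all hypothetical. The pipeline part assumes a POSIX system
where a child process can inherit the output file descriptor.

```python
import shutil
import subprocess

# Hypothetical tables: each filter is an external command that reads
# stdin and writes stdout.
FILTERS = {"gz": ["gzip", "-c"], "bz2": ["bzip2", "-c"]}
GENERATORS = {"tar", "zip"}


def resolve(name, files, dirs):
    """Return (source, generator, filters) for a requested name, or None.

    `files` and `dirs` are sets of names standing in for VFS lookups.
    """
    filters = []
    while True:
        if name in files:
            return name, None, filters
        base, _, ext = name.rpartition(".")
        if base and ext in GENERATORS and base in dirs:
            return base, ext, filters
        if not base or ext not in FILTERS:
            return None
        filters.insert(0, ext)  # stripped right to left, applied left to right
        name = base


def send_through_filters(source, commands, sink):
    """Copy `source` to `sink` through a chain of external commands.

    `source` is a readable binary file object. `sink` is a writable
    binary file object backed by a real file descriptor; in a server
    this role would be played by the data socket.
    """
    if not commands:
        shutil.copyfileobj(source, sink)
        return
    sink.flush()
    procs = []
    upstream = subprocess.PIPE  # the first filter reads from this process
    for index, command in enumerate(commands):
        last = index == len(commands) - 1
        proc = subprocess.Popen(
            command,
            stdin=upstream,
            stdout=sink if last else subprocess.PIPE,
        )
        if procs:
            procs[-1].stdout.close()  # only the child needs this pipe end now
        procs.append(proc)
        upstream = proc.stdout
    shutil.copyfileobj(source, procs[0].stdin)
    procs[0].stdin.close()  # signal end of input to the first filter
    for proc in procs:
        if proc.wait() != 0:
            raise RuntimeError("filter failed: %r" % (proc.args,))
```

Only the first filter is fed by the server process; every later stage
reads from the previous one through a pipe, and the last stage writes
directly to the output descriptor, matching the diagram above.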