The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
Newsgroups: comp.lang.perl.modules
Subject: Best form for allowing module extension?
Reply-To: 
Followup-To: 
Keywords: 
Summary: 

I am writing a module (Archive::Zip) that implements the basic read and
write functionality for Zip archive files. These files have provisions
for extensions for specific platforms: each member has an 'extra field'
that can contain OS-specific (or, indeed, any member-specific) data. The
overall format of this data is specified (<Header ID>, <count>, <data>),
but the actual contents depends on the Header ID.

Because I'm only working in a couple of operating environments, and
because I'm not trying to write a full "unzip" or "PKZIP" replacement, I
don't really want to try to interpret all of these formats.

From the PKWARE Appnote.txt file:

         The current Header ID mappings defined by PKWARE are:

          0x0007        AV Info
          0x0009        OS/2
          0x000a        NTFS 
          0x000c        VAX/VMS
          0x000d        Unix
          0x000f        Patch Descriptor

          Several third party mappings commonly used are:

          0x4b46        FWKCS MD5 (see below)
          0x07c8        Macintosh
          0x4341        Acorn/SparkFS 
          0x4453        Windows NT security descriptor (binary ACL)
          0x4704        VM/CMS
          0x470f        MVS
          0x4c41        OS/2 access control list (text ACL)
          0x4d49        Info-ZIP VMS (VAX or Alpha)
          0x5455        extended timestamp
          0x5855        Info-ZIP Unix (original, also OS/2, NT, etc)
          0x6542        BeOS/BeBox
          0x756e        ASi Unix
          0x7855        Info-ZIP Unix (new)
          0xfd4a        SMS/QDOS

I want to make it easy for other people to provide this support without
changing my code.

Note that not all of these extensions have anything to do with file
permissions, although it may be helpful to provide one or more hooks for
extracting files:

	* supply OS-specific filename
	* open file for write (set permissions)
	* after closing file (to set ownership, timestamps, etc.)

I can provide generic support for these extra fields, so that each
member can have 0 or more extra fields, each with a type tag and
uninterpreted data.

I have seen File::Spec and File::Spec::Unix, etc., and don't think that
this scheme is appropriate, since you could have a zip file that was
produced on one operating system being extracted by another.

Also, it is possible to have multiple types of extra fields in a single
zip file.

What I have thought about is this: a user who wants to interpret the
extended information in the zip members can include the appropriate
extension modules:

# ==================== in user's code ==================== 
use Archive::Zip;	# basic functionality
use Archive::Zip::Unix;	# to interpret Unix file permissions, etc.
use Archive::Zip::MD5;	# to interpret MD5 extended info

my $zip = Archive::Zip->new();
$zip->read('ZIPFILE.ZIP');
foreach my $member ($zip->members())
{
	foreach my $extraField ($member->extraFields())
	{
		print $extraField->info() . "\n";
	}

	$member->extract();
}
# ==================== end user's code ====================

I can make an extensible class for writers of OS-specific modules to
inherit from:

# ==================== in my code ==================== 
package Archive::Zip::ExtraField;
my %Handlers;

# Each subclass must call this with their class name and tag ID.
sub registerType
{
	my ($class, $tag) = @_;
	$Handlers{ $tag } = $class;
}

# Overrideable methods
sub info 
{
	my $self = shift;
	ref($self) . " " . $self->{tag} . " " . $self->{dataLength};
}

# Provide OS-specific name if any or undef
sub preferredFileName { undef }

# Returns numeric arg for open() call or undef
sub openPermissions { undef }

# Hook for doing things after file is extracted
# Called as: $extraField->afterClosingExtractedFile($fileName)
sub afterClosingExtractedFile { }

package Archive::Zip::Member;

# return array of extra fields
sub extraFields() { ... }

sub extract
{
	my $self = shift;
	my ($preferredFileName) = 
		grep { $_ }
		(map { $_->preferredFileName() } $self->extraFields());
	my $fileName = $preferredFileName || $self->fileName();
	# ... similar things for open permissions ...
	my $fh = FileHandle->new($fileName, $openPermissions);
	# ... extract data to fh ...
	$fh->close();
	map { $_->afterClosingExtractedFile($fileName) }
		$self->extraFields();
}
# ==================== end my code ====================


Does this seem like a good way to go? Any other suggestions?

-- 
Ned Konz
currently: Stanwood, WA
email:     ned@bike-nomad.com
homepage:  http://www.bike-nomad.com