NAME
Fuse::PDF - Filesystem embedded in a PDF document
SYNOPSIS
use Fuse::PDF;
my $fs = Fuse::PDF->new('my_doc.pdf');
$fs->mount('/mnt/pdf');
# blocks until the filesystem is unmounted
See also the mount_pdf front-end.
LICENSE
Copyright 2007-2008 Chris Dolan, *cdolan@cpan.org*
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
DESCRIPTION
The Adobe Portable Document Format is an arbitrary collection of nodes
which support a tree structure. Most of that data is oriented toward
document rendering, but it's legal to add arbitrarily complex data
virtually anywhere in the document structure. Adobe Illustrator does
this to embed lots of metadata in it's "PDF-compatible" Illustrator
document format.
By deciding on a convention for representing a filesystem data and
leveraging the FUSE (Filesystem in Userspace) library, we map filesystem
calls to PDF edits.
More info: http://www.chrisdolan.net/madmongers/par-fuse-pdf.html
BUGS AND CAVEATS
PDF-in-PDF: If you copy another PDF into the PDF-based filesystem, it
may corrupt the outer document. This should be solved when I switch to
saving file contents in PDF streams instead of in PDF strings.
Saving: No data is saved until you unmount the filesystem! Hopefully I
can fix this in future releases. The saving is not yet atomic. That is,
if you have a failure, the old PDF may be deleted before the new one is
saved.
Resources: The entire PDF is loaded into RAM in `new()'. If your
filesystem grows too large, this will lead to obvious problems!
Hangs: While FUSE is quite mature, I found it to be fairly easy to hang
the filesystem back around 0.01. I only needed to actually reboot once,
but if that causes you concern you may wish to avoid FUSE in general.
This has not been a problem since the earliest releases.
Operating systems: I've only tested this software with the Google build
of MacFUSE 1.1.0 (PowerPC, 10.4, http://code.google.com/p/macfuse/).
Notably, I have not tried the Linux implementation of FUSE. If you have
other experiences to add, email me or post comments to
http://annocpan.org/.
Fuse.pm: As of this writing, the Fuse module (v0.09_01) fails all tests
on Mac. The module actually *works* great, but the Makefile.PL and the
tests are very Linux-centric. Hopefully that will improve as MacFUSE
matures.
PDF versions: This package relies on CAM::PDF to read and write PDFs.
While that module supports all of the core PDF syntax, it's stricter
than many other PDF implementations and may fail to open PDFs that, say,
Acrobat or Preview.app can open. In particular, "Print to PDF" on Mac OS
X 10.4 often generates bad PDFs.
Threading: I've explicitly set the FUSE default to single-threaded mode,
so performance may be terrible in some scenarios. I hope to add support
for threaded Perl in a future release. Patches welcome (remove the
`threaded => 0' line from this file and add locking to Fuse::PDF::FS).
Unsupported: special files (named pipes, etc.), following symlinks out
of the filesystem, permission enforcement, `chown', `flush', reading
from unlinked filehandles.
Hard links: I have not yet implemented hard links. I'll implement
compressed streams at the same time.
METHODS
$pkg->new($pdf_filename)
$pkg->new($pdf_filename, $hash_of_options)
Create a new filesystem instance. This method opens and parses the
PDF document. If there is an error opening or parsing the PDF
document, this will return `undef'.
The options hash supports the following extra arguments:
pdf_constructor
An arrayref of extra arguments to pass to the CAM::PDF
constructor. In particular, the first arguments are the owner
and user password which can be used to open encrypted PDFs.
save_filename
The string representing the path where filesystem changes should
be saved. By default this is the `$pdf_filename' passed to
`new()'.
compact
A boolean indicating whether to discard old filesystem data
saved via version infrastructure described in the PDF
specification. Defaults to false. If left false, then the PDF
will grow with every mount, but only by as much as you changed
it. See `rewritepdf.pl --cleanse' from the CAM::PDF distribution
to perform the compaction manually. See also `revertpdf.pl' to
roll back to those older versions.
fs_name
Fuse::PDF can embed multiple filesystems in a single PDF
distinguished by name. This string specifies which filesystem to
use. It uses the Fuse::PDF::FS default if a name is not
explicitly provided.
revision
A version number indicating which filesystem version to roll
back to before mounting. Use `fs()' and the Fuse::PDF::FS API to
learn what revisions are available in a PDF filesystem.
$self->mount($mount_path)
$self->mount($mount_path, $hash_of_fuse_options)
Calls into Fuse to mount the filesystem to the specified mount
point. On unmount, a new PDF will be saved with any filesystem
changes.
If the mount point does not exist, this package will try to create
it as a directory via a simple `mkdir()'. If the `mkdir()' fails, or
if the mount point exists but is not a directory, `mount()' will
`croak()'.
If the PDF has an existing filesystem which is incompatible with
this version of the software, `mount()' will `croak()'.
If the mount is successful, we establish callbacks and hand control
to the FUSE library. FUSE blocks until the filesystem is unmounted.
If this blocking is a problem for you, consider daemonizing the
process like so:
use Fuse::PDF;
use Net::Server::Daemonize qw();
my ($pdffilename, $mountdir) = @ARGV;
my $fs = Fuse::PDF->new($pdffilename);
if (Net::Server::Daemonize::safe_fork()) {
exit 0; # parent process or failure
}
Net::Server::Daemonize::daemonize('www', 'www');
$fs->mount($mountdir);
The `mount()' method cleans up after itself sufficiently that you
may call it again immediately after unmounting.
The options hash is passed directly to `Fuse::main()'. See the Fuse
documentation for the allowed keys. A simple example is:
$fs->mount($mountdir, {debug => 1});
$self->fs()
Return a fresh copy of the Fuse::PDF::FS data structure representing
this PDF. You should not try to manipulate this object while the
filesystem is mounted. This module is not yet thread-safe!
SEE ALSO
CAM::PDF
Fuse
mount_pdf
AUTHOR
Chris Dolan, *cdolan@cpan.org*
CREDITS
Thanks to the Madison Perl Mongers for thinking the idea was stupid
enough that I was inspired to implement it!