The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

File::VirtualPath - Portable abstraction of a file/dir/url path

DEPENDENCIES

Perl Version

        5.004

Standard Modules

        I<none>

Nonstandard Modules

        I<none>

SYNOPSIS

Content of thin shell "startup.pl":

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $root = "/home/johndoe/projects/aardvark";
        my $separator = "/";
        if( $^O =~ /Win/i ) {
                $root = "c:\\projects\\aardvark";
                $separator = "\\";
        }
        if( $^O =~ /Mac/i ) {
                $root = "Documents:projects:aardvark";
                $separator = ":";
        }

        use Aardvark;
        Aardvark->main( File::VirtualPath->new( $root, $separator ) );

        1;

Content of fat main program "Aardvark.pm"

        package Aardvark;
        use strict;
        use warnings;
        use File::VirtualPath;

        sub main {
                my (undef, $project_dir) = @_;
                my $prefs = &get_prefs( $project_dir->child_path_obj( 'config.pl' ) );
                &do_work( $prefs, $project_dir );
        }

        sub get_prefs {
                my ($project_dir) = @_;
                my $real_filename = $project_dir->physical_path_string();
                my $prefs = do $real_filename;
                defined( $prefs ) or do {
                        my $virtual_fn = $project_dir->path_string();
                        die "Can't get Aardvark prefs from file '$virtual_fn': $!";
                };
                return( $prefs );
        }

        sub do_work {
                my ($prefs, $project_dir) = @_;
                my ($lbl_a, $lbl_b, $lbl_c) = ($prefs->{a}, $prefs->{b}, $prefs->{c});
                my $data_source = $prefs->{'sourcefile'};
                open( SOURCE, $project_dir->physical_child_path_string( $data_source ) );
                while( my $line = <SOURCE> ) {
                        my ($a, $b, $c) = split( "\t", $line );
                        print "File contains: $lbl_a='$a', $lbl_b='$b', $lbl_c='$c'\n";
                }
                close( SOURCE );
        }

        1;

Content of settings file "config.pl"

        $rh_prefs = {
                sourcefile => 'mydata.txt',
                a => 'name',
                b => 'phone',
                c => 'date',
        };

DESCRIPTION

This Perl 5 object class implements a portable abstraction of a resource path, examples of which include file-system paths like "/usr/bin/perl" and URLs like "http://www.cpan.org/modules/". It is designed to support applications that are easily portable across systems because common platform-specific details are abstracted away. Abstracted details include the location of your project within the file-system and the path separator for your OS; you can write your application as if it is in the root directory of a UNIX system, and it will function correctly when moved to any subdirectory or to a Mac or Windows system.

OVERVIEW

This class is implemented as a simple data structure which stores an array of path segments such as ['', 'usr', 'bin', 'perl'] in a virtual file-system. The majority part of your application works with these objects and passes them around during its routines of locating config or data or other files.

As your application navigates the virtual file-system, it uses object methods like chdir() to tell the object where the app thinks it is now. When your program actually needs to use files, it asks a method like physical_path_string() to give it a string representing the current path in the real world, which it then passes to your standard I/O functions like open().

For example, the program may think it is sitting in "/config/access", but it actually makes an open call to "/home/johndoe/projects/aardvark/config/access". If you move the "aardvark" project to a Windows system, the real path may have changed to "c:\projects\aardvark\config\access", but your program would never need to know the difference (aside from any internal file format issues).

In order for this to work, a small part of your program needs to know the truth of where the project it is working on is located. But that part can be a very lightweight shim which initializes a single File::VirtualPath object and then passes it to the fat portable part of the program. There are two bits of data that your shim needs to provide: 1. A string having the full real-world path of your project root directory; 2. A string having the real-world path separator. See the SYNOPSIS for an example.

Then, your main program just needs to assume that the argument it was passed is currently in the virtual root directory and go from there.

        THIN CONFIG SHELL <----> File::VirtualPath <----> FAT PROGRAM CORE
        (may be portable)        (portable)               (portable)

Taking this idea further, it is easy for program code to be reused for multiple projects, simultaneously, because each would only need a different thin shim program which points to a different physical directory as the virtual root.

Taking this idea further, File::VirtualPath makes it easier for you to separate your application into components that have their own files to keep track of. When your main program calls a component, it can pass a modified FVP object which that component uses as its own virtual root. And so you can have multiple instances of program components each working in different directories, and no logic for working this out needs to be in the components themselves.

On a final note, the paths returned by this class are all absolute. Therefore you never need to do a real "chdir" or "cd" operation in your program, and your executable doesn't have to be located in the same place as its data. This is particularly useful if you are calling your program using a link/alias/shortcut.

SYNTAX

This class does not export any functions or methods, so you need to call them using object notation. This means using Class->function() for functions and $object->method() for methods. If you are inheriting this class for your own modules, then that often means something like $self->method().

Paths can be represented as either strings or array refs, and any methods which take absolute or relative paths as arguments can take either format. A literal list will not work. Methods which return paths usually come in pairs, and their names differ only in that one has a "_string" suffix; each will return either an array ref or a string. Literal lists are never returned, even in list context.

A path is "absolute" when its array representation has an empty string as its first element, or its string representation begins with a "/". Note that a simple split or join operation on "/" will cleanly convert one into the other. Conversely, a path is "relative" when its array representation has anything but an empty string (or undef) in its first element, or its string representation does not start with a "/".

In the virtual file-system that objects of this class represent, the root directory is called "/" and path separators are also "/"; this is just like UNIX. String representations of the virtual path are split or joined on the same "/". For your convenience, the path_delimiter() method lets you change the string that has these dual purposes.

Whenever you see any CHANGE_VECTOR arguments mentioned below, realize that they can be either absolute or relative paths. The effects of using either is the same as with your normal "chdir" or "cd" functions. If CHANGE_VECTOR is an absolute path then the entire path becomes it; whereas, if that argument is a relative path then it is applied to the current absolute path and a new absolute path results. Usual conventions have alphanumeric path segments going down one directory level, ".." segments going up one level, and "." not going anywhere.

If an absolute path is taken as an argument or derived from a relative path, it is always reduced to its simplest form before being stored or returned. Mainly this ensures that there are no ".." or "." remaining in the path. Any ".." path segments are paired up with previous alphanumeric list elements; these negate each other and both are removed. If any ".." can not be paired up then they are simply removed since you can not navigate higher than the root; note that this would only happen if we are passed a malformed argument. This precaution can also act as a pseudo-security measure by never returning a physical path that is outside the virtual root.

FUNCTIONS AND METHODS

new([ PHY_ROOT[, PHY_DELIM[, VIR_DELIM[, VIR_PATH]]] ])

This function creates a new File::VirtualPath (or subclass) object and returns it. All of the method arguments are passed to initialize() as is; please see the POD for that method for an explanation of them.

initialize([ PHY_ROOT[, PHY_DELIM[, VIR_DELIM[, VIR_PATH]]] ])

This method is used by new() to set the initial properties of objects that it creates. The 4 optional arguments allow you to set the default values for the four object properties that the following methods also handle: physical_root(), physical_delimiter(), path_delimiter, path(). Semantecs are the same as calling those 4 methods yourself in the same order.

clone([ CLONE ])

This method initializes a new object to have all of the same properties of the current object and returns it. This new object can be provided in the optional argument CLONE (if CLONE is an object of the same class as the current object); otherwise, a brand new object of the current class is used. Only object properties recognized by File::VirtualPath are set in the clone; other properties are not changed.

physical_root([ NEW_VALUE ])

This method is an accessor for the scalar "physical root" property of this object, which it returns. If NEW_VALUE is defined, this property is set to it. This property defines what path on the real file-system the virtual root corresponds to. This property defaults to an empty string. This property must not have any trailing delimiter like "/".

physical_delimiter([ NEW_VALUE ])

This method is an accessor for the scalar "physical delimiter" property of this object, which it returns. If NEW_VALUE is defined, this property is set to it. This property defines what the path delimiter in the real file-system is. This property defaults to "/", which is the UNIX standard.

path_delimiter([ NEW_VALUE ])

This method is an accessor for the scalar "path delimiter" property of this object, which it returns. If NEW_VALUE is defined, this property is set to it. This property defines what the path delimiter in the virtual file-system is. This property defaults to "/", which is the UNIX standard.

path([ NEW_VALUE ])

This method is an accessor for the array-ref "path" property of this object, which it returns. If NEW_VALUE is defined, this property is set to it. This property defines what absolute path in the virtual file-system this object represents. This property defaults to the virtual root.

child_path( CHANGE_VECTOR )

This method uses CHANGE_VECTOR to derive a new path relative to what this object represents and returns it as an array-ref.

child_path_obj( CHANGE_VECTOR )

This method uses CHANGE_VECTOR to derive a new path relative to what this object represents and uses it as the "path" attribute of a new object of this class, which it returns. All other attributes of the new object are cloned.

chdir( CHANGE_VECTOR )

This method uses CHANGE_VECTOR to derive a new path relative to what this object represents and then changes this object to represent the new path. The effect is conceptually the same as using "chdir" to change your current working directory where this object represents such.

path_string([ WANT_TRAILER ])

This method returns the absolute path on the virtual file-system that this object represents as a string. If WANT_TRAILER is true then the string has a path delimiter appended; otherwise, there is none.

physical_path_string([ WANT_TRAILER ])

This method returns the absolute path on the real file-system that this object represents as a string. If WANT_TRAILER is true then the string has a path delimiter appended; otherwise, there is none.

child_path_string( CHANGE_VECTOR[, WANT_TRAILER] )

This method uses CHANGE_VECTOR to derive a new path in the virtual file-system relative to what this object represents and returns it as a string. If WANT_TRAILER is true then the string has a path delimiter appended; otherwise, there is none.

physical_child_path_string( CHANGE_VECTOR[, WANT_TRAILER] )

This method uses CHANGE_VECTOR to derive a new path in the real file-system relative to what this object represents and returns it as a string. If WANT_TRAILER is true then the string has a path delimiter appended; otherwise, there is none.

path_element( INDEX[, NEW_VALUE] )

This method is an accessor for individual segments of the "path" property of this object, and it returns the one at INDEX. If NEW_VALUE is defined then the segment at INDEX is set to it. This method is useful if you want to examine virtual path segments one at a time. INDEX defaults to 0, meaning you are looking at the first segment, which happens to always be empty. That said, this method will let you change this condition if you want to.

current_path_level([ NEW_VALUE ])

This method is an accessor for the number "current path level" property of this object, which it returns. If NEW_VALUE is defined, this property is set to it. If you want to examine the virtual path segments sequentially then this property tracks the index of the segment you are currently viewing. This property defaults to 0, the first segment, which always happens to be an empty string.

inc_path_level([ NEW_VALUE ])

This method will increment this object's "current path level" property by 1 so you can view the next path segment. The new current value is returned.

dec_path_level([ NEW_VALUE ])

This method will decrement this object's "current path level" property by 1 so you can view the previous path segment. The new current value is returned.

current_path_element([ NEW_VALUE ])

This method is an accessor for individual segments of the "path" property of this object, the current one of which it returns. If NEW_VALUE is defined then the current segment is set to it. This method is useful if you want to examine virtual path segments one at a time in sequence. The segment you are looking at now is determined by the current_path_level() method; by default you are looking at the first segment, which is always an empty string. That said, this method will let you change this condition if you want to.

AUTHOR

Copyright (c) 1999-2003, Darren R. Duncan. All rights reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. However, I do request that this copyright information and credits remain attached to the file. If you modify this module and redistribute a changed version then please attach a note listing the modifications. This module is available "as-is" and the author can not be held accountable for any problems resulting from its use.

I am always interested in knowing how my work helps others, so if you put this module to use in any of your own products or services then I would appreciate (but not require) it if you send me the website url for said product or service, so I know who you are. Also, if you make non-proprietary changes to the module because it doesn't work the way you need, and you are willing to make these freely available, then please send me a copy so that I can roll desirable changes into the main release.

Address comments, suggestions, and bug reports to perl@DarrenDuncan.net.

CREDITS

Thanks to Baldvin Kovacs <baldvin@fazekas.hu> for alerting me to the "uninitialized value" warnings (and offering a patch to fix it) that appear when running the test suite with the -w option (fixed in 1.01), and also thanks for a patch to the README file documentation, which was applied.

SEE ALSO

perl(1), CGI::Portable.