NAME

File::FindByRegex - Wrapper for File::Find that traverses a directory tree and runs some action for each file whose name matches a regex.

SYNOPSIS

   use File::FindByRegex;

   $find = File::FindByRegex->new( {

            -srcdir => ['C:\tmp\teradata-sql'], 
            -tardir => 'C:\tmp\teradata-sql\doc', 
            -find => {no_chdir => 1}, 

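            -callbacks =>
            {
                # The subs below are defined in File::FindByRegex, so code
                # references must name that package.
                qr/\.p(l|m|od|t)$/oi          => \&File::FindByRegex::treat_pod,
                qr/\\sql\\.+?\.sql$/oi        => 'treat_pod',
                qr/\.html?$/oi                => \&File::FindByRegex::treat_html,
                qr/\.txt$/oi                  => \&File::FindByRegex::treat_txt,
                qr/\.(jpg|gif|png|bmp|tiff)$/ => sub { File::FindByRegex::treat_graphic(@_) }
            },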

            -ignore => 
            [
               qr/eg\\.+\.sql$/oi,  # *.sql in directory eg
               qr/java\\/oi,        # All files in java directory.
            ],
  
            -excepts   => 
            [ 
               qr/java\\.*?\.html?$/oi   # don't ignore *.html in java/
            ]          
   });

   sub File::FindByRegex::treat_pod
   {
       my $this = shift;  
       ...
   }

   sub File::FindByRegex::treat_html
   {
       my $this = shift;  
       ...
   }

   sub File::FindByRegex::treat_txt
   {
       my $this = shift;  
       ...
   }

   sub File::FindByRegex::treat_graphic
   {
       my $this = shift;  
       ...
   }

   $find->travel_tree;

DESCRIPTION

This is an OO wrapper module for File::Find that adds the ability to execute an action when the absolute pathname of a visited file or directory matches a regex.

Functions:

$find_obj = File::FindByRegex->new( ... )

Returns a File::FindByRegex object (a blessed hash reference). Accepts a hash or a hash reference as argument. If the argument is a hash reference, it must be the only argument.

In both cases, the keys of the hash argument must be:

-srcdir => [...]

Mandatory. List of absolute paths to directories. Each directory in the array is traversed.

-tardir => 'target_directory'

Mandatory. Target directory for actions, specified as an absolute path.

-find => {...}

Optional. Arguments for File::Find. See documentation of File::Find.

-callbacks => {...}

Optional. Regular expressions (keys) and the actions to be executed (values). Each key is a regular expression whose value is a function reference or a function name (string).

All functions specified as values must accept a File::FindByRegex object as their first (and only) argument, as if they were methods of that class.

-ignore => [...]

Optional. List of regular expressions matching files to be ignored.

-excepts => [...]

Optional. List of regular expressions that are exceptions to -ignore list.

The absolute pathname of each file or directory is tested against each regular expression in the -ignore list. If one matches, the pathname is then tested against each regex in -excepts. The -callbacks regexes are tested only if the pathname matches nothing in -ignore, or matches a regex in -ignore and also one in -excepts. If a -callbacks regex matches, the associated action is executed.

File and directory paths must be written in the native path syntax of the operating system. On Win32 this means that the directory separator \ must be written as \\ inside a regular expression.

In the -ignore and -excepts lists, regexes are tested in the order given in the array.
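
The matching order described above can be sketched as a small stand-alone function. This is only an illustration under stated assumptions, not the module's actual code: the name classify_path and its return convention are hypothetical, and the callbacks are passed as an ordered list of pairs, whereas the module takes a hash.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::FindByRegex): decides what would
# happen to one absolute pathname, following the rules described above.
# Returns the matching action, or undef when no action applies.
sub classify_path {
    my ($path, $ignore, $excepts, $callbacks) = @_;

    # -ignore regexes are tried first.
    if (grep { $path =~ $_ } @$ignore) {
        # -excepts is consulted only after an -ignore match.
        return undef unless grep { $path =~ $_ } @$excepts;
    }

    # Reached only if nothing in -ignore matched, or an exception applied.
    for my $pair (@$callbacks) {
        my ($re, $action) = @$pair;
        return $action if $path =~ $re;
    }
    return undef;
}

# Win32-style paths: the separator is written as \\ inside each regex.
my @ignore    = (qr/java\\/i);
my @excepts   = (qr/java\\.*?\.html?$/i);
my @callbacks = ([qr/\.html?$/i => 'treat_html'], [qr/\.txt$/i => 'treat_txt']);

for my $path ('C:\tmp\java\index.html', 'C:\tmp\java\notes.txt', 'C:\tmp\readme.txt') {
    my $action = classify_path($path, \@ignore, \@excepts, \@callbacks);
    printf "%-24s %s\n", $path, defined $action ? $action : '(no action)';
}
```

The second path is ignored because it sits in the java directory and no exception rescues it; the first one is rescued by the -excepts regex and then reaches the callbacks.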

$find_obj->travel_tree

Traverses the tree starting at each directory specified in -srcdir. The full pathname of each file or directory is matched against the regular expressions.

Functions specified in -callbacks are executed when:

  • No regex in -ignore is matched, and the full pathname of the file or directory matches a key in -callbacks.

  • The full pathname of the file or directory matches a regex in -ignore, but also one in -excepts, and a key in -callbacks.

Otherwise, no action is called for the file or dir.
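
For orientation, the kind of traversal that travel_tree wraps can be sketched with plain File::Find. This is an assumption-laden illustration and not the module's implementation: the temporary tree, the @callbacks table and the @visited list are all made up for the example, and the -ignore and -excepts steps are left out.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Build a tiny throw-away tree so the sketch is self-contained.
my $root   = tempdir(CLEANUP => 1);
my $docdir = File::Spec->catdir($root, 'doc');
mkdir $docdir or die "mkdir $docdir: $!";
for my $rel (['a.txt'], ['doc', 'b.pod']) {
    my $file = File::Spec->catfile($root, @$rel);
    open my $fh, '>', $file or die "open $file: $!";
    close $fh;
}

# Hypothetical dispatch table: regex => action, tried in this order.
my @visited;
my @callbacks = (
    [ qr/\.pod$/i => sub { push @visited, "pod: $_[0]" } ],
    [ qr/\.txt$/i => sub { push @visited, "txt: $_[0]" } ],
);

find({
    no_chdir => 1,    # do not chdir into each directory while traversing
    wanted   => sub {
        my $path = $File::Find::name;    # full pathname of the visited entry
        for my $pair (@callbacks) {
            my ($re, $action) = @$pair;
            if ($path =~ $re) { $action->($path); last }
        }
    },
}, $root);

print "$_\n" for sort @visited;
```

Directories are visited as well as files, which is why the documentation above speaks of "file or dir" throughout.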

ACTIONS SPECIFIED BY -callbacks KEY.

Actions specified by -callbacks key are called in the namespace of File::FindByRegex. Suppose this code:

    package AnyPackage;

    use File::FindByRegex;

    my $f = File::FindByRegex->new( 
                   ..., 
                   -callbacks => {
                        qr/\\doc\\.+?\.pod/oi => 'any_function'
                    },
                   ...
            );
    
    sub any_function { my $this = shift; ... }

When a file matches the key in -callbacks, File::FindByRegex does something like this:

    package File::FindByRegex;
    ...

    my $action = $this->{-callbacks}->{$re};
    
    if( ref($action) eq 'CODE' )
    {     
        &$action( $this );           
    }
    else
    {     
        eval "&$action( \$this )";
        die $@ if $@; 
    }

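With a function name (a string) this produces an error, because the name is resolved in the File::FindByRegex package, where any_function isn't defined. A code reference such as \&any_function is already bound to AnyPackage::any_function and is called directly.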

To avoid errors of this kind you have two possibilities:

  • Define any_function in the File::FindByRegex package and pass it by name:

        package AnyPackage;
    
        use File::FindByRegex;
    
        my $f = File::FindByRegex->new( 
                       ..., 
                       -callbacks => {
                            qr/\\doc\\.+?\.pod/oi => 'any_function'
                        },
                       ...
                );
        
        sub File::FindByRegex::any_function { my $this = shift; ... }

  • Specify the package in -callbacks:

        package AnyPackage;
    
        use File::FindByRegex;
    
        my $f = File::FindByRegex->new( 
                       ..., 
                       -callbacks => {
                            qr/\\doc\\.+?\.pod/oi => \&AnyPackage::any_function
                        },
                       ...
                );
        
        sub any_function { my $this = shift; ... }

    In this case, remember that $this is still a File::FindByRegex blessed reference.
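
The difference between a code reference and a function name can be checked with a small stand-alone sketch. The Dispatcher package below is hypothetical; it only mimics the "code reference or function name" dispatch shown earlier and makes no claim about the module's real internals.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dispatching package.
package Dispatcher;

sub run {
    my ($this, $action) = @_;
    return $action->($this) if ref($action) eq 'CODE';

    # A plain name is resolved in *this* package, not in the caller's.
    my $result = eval "&$action( \$this )";
    die $@ if $@;
    return $result;
}

sub in_dispatcher { return 'Dispatcher::in_dispatcher' }

package AnyPackage;

sub any_function { return 'AnyPackage::any_function' }

my $obj = bless {}, 'Dispatcher';

# A code reference is bound to AnyPackage::any_function at compile time.
print Dispatcher::run($obj, \&any_function), "\n";

# A bare name works only if the function lives in Dispatcher ...
print Dispatcher::run($obj, 'in_dispatcher'), "\n";

# ... unless the name is package-qualified.
print Dispatcher::run($obj, 'AnyPackage::any_function'), "\n";

# An unqualified name from another package fails inside Dispatcher.
my $ok = eval { Dispatcher::run($obj, 'any_function'); 1 };
print "bare 'any_function' failed: ", ($ok ? 'no' : 'yes'), "\n";
```

In every successful case the callback still receives the dispatcher's object as its first argument, which matches the remark about $this above.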

OVERRIDABLE FUNCTION.

A function named post_match exists in this module for the sole purpose of being overridden. It is called unconditionally for each visited file or directory.

Its default implementation is empty, so nothing is done unless it is overridden. Use it as a hook in addition to the -callbacks functions.

Inside post_match, one can investigate what occurred by the value of $this->{-explain}:

  • It's initialized to 0 for each visited file/dir.

  • If a regex in -ignore is matched, 1 is added.

  • If a regex in -excepts is matched, then 2 is added.

    Remember that -excepts is checked only if -ignore is matched.

  • If a regex in -callbacks is matched, 4 is added.

    Remember that -callbacks is checked only if no regex in -ignore is matched, or if regexes in both -ignore and -excepts are matched.

So, the possible values of $this->{-explain} are 0, 1, 3, 4, or 7:

   sub File::FindByRegex::post_match
   {
       my $this = shift;

     SWITCH:
     {
         $this->{-explain}==0 && do
         {
             # ... nothing matched ...
             last;
         };

         $this->{-explain}==1 && do
         {
             # ... matched -ignore only ...
             last;
         };

         $this->{-explain}==3 && do
         {
             # ... matched -ignore and -excepts only ...
             last;
         };

         $this->{-explain}==4 && do
         {
             # ... matched only -callbacks and function called ...
             last;
         };

         $this->{-explain}==7 && do
         {
             # ... matched -ignore, -excepts and -callbacks, so
             #     function was called ...
             last;
         };
     }
   }

Inside post_match, $this->{-explain} also tells whether an action from -callbacks was executed: this is the case when its value is 4 or 7.

Sample:

    package Pkg;
    ...
    @ISA = qw( File::FindByRegex );
    ...

    sub post_match
    {
        my $this = shift;

        my $action_done = $this->{-explain} == 4 || $this->{-explain} == 7 ? 
                          1 : 0;

        if( $action_done )
        {
            # An action in -callbacks was called.
            ...
        }
        else
        {
            # No action was called for this file or dir.
            ...
        }
        ...
    }    

post_match must accept a File::FindByRegex object as its first and only argument. It must be defined in File::FindByRegex or in a derived package, because it is called in the context of the File::FindByRegex namespace.
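
Because -explain is the sum of the flags 1, 2 and 4, it can also be decoded with bitwise tests instead of comparing against each total. The following is a minimal sketch; the constant names and the describe_explain function are hypothetical and not provided by the module.

```perl
use strict;
use warnings;

# Hypothetical names for the three flags documented above.
use constant {
    MATCHED_IGNORE    => 1,
    MATCHED_EXCEPTS   => 2,
    MATCHED_CALLBACKS => 4,
};

# Turns an -explain value into a short human-readable description.
sub describe_explain {
    my ($explain) = @_;
    my @parts;
    push @parts, '-ignore'    if $explain & MATCHED_IGNORE;
    push @parts, '-excepts'   if $explain & MATCHED_EXCEPTS;
    push @parts, '-callbacks' if $explain & MATCHED_CALLBACKS;
    return @parts ? 'matched ' . join(', ', @parts) : 'nothing matched';
}

# Only these five values can occur, as explained above.
print "$_: ", describe_explain($_), "\n" for 0, 1, 3, 4, 7;
```

Inside post_match the same test, $this->{-explain} & 4, is true exactly when an action from -callbacks was called.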

OBJECT INTERNALS.

Keys and values of $this blessed hash reference are:

  • Each key/value pair of the hash passed as argument to new is a member of $this.

  • -explain, with the meaning already explained.

  • The attributes of the file or directory being processed:

        -absdir   => string,  # Absolute directory being processed.
        -reldir   => string,  # Relative directory being processed.
        -abspathn => string,  # Absolute pathname (file) being processed.
        -name     => string,  # File name w/o extension being processed.
        -ext      => string,  # File extension being processed.
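
To make these attributes concrete, here is a sketch that derives similar values from one pathname with the core modules File::Basename and File::Spec. The helper path_attributes is hypothetical; whether the module keeps the leading dot in -ext, and what exactly -reldir is relative to, are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

# Hypothetical helper: splits one absolute pathname into pieces that
# resemble the attributes listed above. The dot-less -ext and the choice
# of $srcdir as the base of -reldir are assumptions.
sub path_attributes {
    my ($abspathn, $srcdir) = @_;
    my ($name, $absdir, $suffix) = fileparse($abspathn, qr/\.[^.]*/);
    (my $ext = $suffix) =~ s/^\.//;
    return {
        -abspathn => $abspathn,
        -absdir   => $absdir,
        -reldir   => File::Spec->abs2rel($absdir, $srcdir),
        -name     => $name,
        -ext      => $ext,
    };
}

my $attr = path_attributes('/tmp/teradata-sql/doc/index.html', '/tmp/teradata-sql');
print "$_ => $attr->{$_}\n" for sort keys %$attr;
```

A callback could then read, for instance, $this->{-name} and $this->{-ext} to build the name of an output file under -tardir.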

SEE ALSO

File::Find.

AUTHOR AND COPYRIGHT

Enrique Castilla Contreras (ecastilla@wanadoo.es).

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.