File::CleanupTask - Delete/Backup files on a task-based configuration
Version 0.02
    use File::CleanupTask;

    my $cleanup = File::CleanupTask->new({
        conf     => "/path/to/tasks_file.tasks",
        taskname => "TASK_LABEL_IN_TASKFILE",
    });

    $cleanup->run();
Once run() is called, the cleanup operation 'TASK_LABEL_IN_TASKFILE' specified in tasks_file.tasks is performed.
A .tasks file is a text file in which one or more cleanup tasks are specified. Each task has a label and a list of options specified as shown in the following example:
    [TASK_LABEL_IN_TASKFILE]
    path = '/home/savio/results/'
    backup_path = '/home/savio/old_results/'
    backup_gzip = 1
    max_days = 3
    recursive = 1
    prune_empty_directories = 1
    keep_if_linked_in = '/home/savio/results/'

    [ANOTHER_LABEL]
    path = 'C:\\this\\is\\a\\windows\\path'
    ...
In this case, [TASK_LABEL_IN_TASKFILE] is the name of the cleanup task to be executed.
The following options can be specified under a task label:
The path to the directory containing the files to be deleted or backed up. Note that in MS Windows the backslashes of a path name must be escaped, and apostrophes are strictly required when specifying a path name (see the example above).
If specified, files are moved to the specified directory instead of being deleted. If backup_path doesn't exist, it will be created. Symlinks are not backed up. The files are backed up at the top level of backup_path in a .gz (or .tgz, depending on backup_gzip) archive, which preserves the pathnames of the archived files.
If set to "1", will gzip the files saved in backup_path. The resulting archive will preserve the pathname of the original file, and will be relative to 'path'.
For example, given the following configuration:
    [LABEL]
    path = /path/to/cleanup/
    backup_path = /path/to/backup/
    backup_gzip = 1
If /path/to/cleanup/my/target/file.txt is encountered, and it's old, it will be backed up in /path/to/backup/file.txt.gz. Uncompressing file.txt.gz using /path/to/backup as current working directory will result in:
/path/to/backup/path/to/cleanup/my/target/file.txt
The maximum number of days for which files in the cleanup directories are kept. If a file is older than the specified number of days, it is queued for deletion.
For example, max_days = 3 will delete files older than 3 days from the cleanup directory.
max_days defaults to 0 if it isn't specified, meaning that all the files are to be deleted.
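The age test described above can be sketched as follows. The helper name is_older_than is hypothetical and not part of File::CleanupTask, and measuring age from the last modification time is an assumption; the module may measure it differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::CleanupTask.
# Assumption: "age" means the time elapsed since the last modification.
sub is_older_than {
    my ($path, $max_days, $now) = @_;
    $now = time unless defined $now;

    my $mtime = (stat($path))[9];
    return 0 unless defined $mtime;    # missing file: nothing to clean up

    my $age_in_days = ($now - $mtime) / 86_400;
    return $age_in_days > $max_days ? 1 : 0;
}
```

Under this sketch, max_days = 3 selects a file last modified five days ago and leaves one modified yesterday.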
If set to 0, only files directly within "path" can be deleted or backed up. If set to 1, files located at any level within "path" can be deleted or backed up.
If set to 1, empty directories will be deleted regardless of their age.
A pathname to a directory that may contain symlinks. If specified, it will prevent deletion of files and directories within path that are symlinked in this directory, regardless of their age.
This option will be ignored in MS Windows or in other operating systems that don't support symlinks.
A regular expression that defines a pattern to look for. Any pathname matching this pattern will not be erased, regardless of its age. The regular expression applies to the full pathname of the file or directory.
If set to 1, immediate subfolders in path will be deleted only if all the files in them are deleted.
If specified, delete or backup actions are applied only to the files that match the pattern. Any other file will be left untouched.
If set to 1, the symlinks inside 'path' will be deleted only if their target will be deleted. This option is disabled by default, which means that the targets of symlinks within the path are not examined during deletion or backup; the symlinks are simply treated as regular files.
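As an illustration of matching against the full pathname, here is a minimal sketch. The helper name is_protected is hypothetical, and how the module compiles or anchors the pattern is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::CleanupTask.
# Returns 1 when the full pathname matches the protective pattern.
sub is_protected {
    my ($full_path, $pattern) = @_;
    return 0 unless defined $pattern && length $pattern;

    # The match runs on the whole pathname, not just the basename,
    # so a pattern naming a directory also protects what is below it.
    return $full_path =~ qr/$pattern/ ? 1 : 0;
}
```

For example, a pattern such as \.keep$ would protect any file whose pathname ends in .keep, at any depth.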
Create and configure a new File::CleanupTask object.
The object must be initialised as follows:
    my $cleanup = File::CleanupTask->new({
        conf     => "/path/to/tasks_file.tasks",
        taskname => 'TASK_LABEL_IN_TASKFILE',
    });

Given the arguments specified on the command line, processes them, creates a new File::CleanupTask object, and then calls run.
run
Options include dryrun, verbose, task and conf.
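A rough sketch of how these four options could be parsed with the core Getopt::Long module. The function name parse_cleanup_options is hypothetical, and the option types (dryrun and verbose as flags, task and conf as strings) are assumptions, not taken from the module.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical parser, not part of File::CleanupTask.
# The option names come from the text above; their types are assumptions.
sub parse_cleanup_options {
    my @args = @_;
    my %opt = (dryrun => 0, verbose => 0);

    GetOptionsFromArray(
        \@args,
        'dryrun'  => \$opt{dryrun},
        'verbose' => \$opt{verbose},
        'task=s'  => \$opt{task},
        'conf=s'  => \$opt{conf},
    ) or die "Invalid command line options\n";

    return \%opt;
}
```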
Perform the cleanup
Run a single cleanup task given its configuration and name. The name is used as a label for possible output and is an optional parameter of this method.
This will scan all files and directories in path in a depth-first fashion. When a file is encountered, a target action is performed based on the state of that file (file or directory, symlinked, old, empty directory...).
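The depth-first scan can be illustrated with the core File::Find module. This is only a sketch: the function name scan_tree is hypothetical, and whether the module itself uses File::Find is an assumption.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Hypothetical sketch, not part of File::CleanupTask.
# Walks $root depth-first and classifies each entry it meets.
# File::Find::find reports a directory before its contents.
sub scan_tree {
    my ($root) = @_;
    my @seen;

    find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;
                # Test -l first: -d follows symlinks to directories.
                my $kind = -l $path ? 'symlink'
                         : -d $path ? 'directory'
                         :            'file';
                push @seen, [ $path, $kind ];
            },
        },
        $root,
    );

    return \@seen;
}
```

Each returned element is a [path, kind] pair, which a caller could then combine with age and emptiness checks to pick an action.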
Accessors that will tell you if running in dryrun or verbose mode.
Builds a delete_once_empty list of pathnames, each of which should be deleted only if all its files are also deleted.
Builds a never_delete list of pathnames that shouldn't be deleted under any condition.
Adds a path to the given never_delete list.
Checks if the given path is contained in the delete_once_empty list.
Adds a path to the given delete_once_empty list.
Checks if the given path is contained in the never_delete list.
Checks the given path and returns its absolute representation.
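A minimal sketch of such a check, using the core Cwd module. The helper name absolute_path_of is hypothetical, and both the existence check and the use of Cwd::abs_path (which also resolves symlinks) are assumptions about what "checks" means here.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical helper, not part of File::CleanupTask.
# Returns the canonical absolute form of an existing path,
# or undef when the path is undefined or does not exist.
sub absolute_path_of {
    my ($path) = @_;
    return undef unless defined $path && -e $path;
    return abs_path($path);
}
```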
Plans the actions to be executed on the files in the target path according to:
- options in the configuration
- the target files
- the never_delete
All files in the never_delete list can't be deleted.
Given a path to a file and the task configuration options, augment the plan with actions to take on that file.
Returns the array containing one or more actions performed.
These actions are meant to be performed in reverse sequence on the given file. An empty array_ref is returned if no action is to be performed on the given file.
A returned action can be one of: delete, backup.
Resulting actions are decided according to one or more of the following:
This method works under the assumption that the specified file or directory exists and the user has full permissions on it.
Adds the given action to the plan.
Returns 1 if the given folder is empty.
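One way to implement such a check with Perl built-ins is sketched below. The helper name is_folder_empty is hypothetical, and returning 0 for a directory that cannot be opened is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::CleanupTask.
# Returns 1 if $dir has no entries besides "." and "..",
# and 0 otherwise (including when $dir cannot be opened).
sub is_folder_empty {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return 0;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @entries ? 0 : 1;
}
```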
Execute a plan based on the given task options. Blacklist is passed to make sure once again that no unwanted files or directories are deleted.
Takes into account symlinks in the current plan.
The refinement is done in the following way:
1) Go through the plan, and look for symlink targets.
2) Mark any symlink with the action of its target if it is in the cleanup directory: keep the symlink if its target is kept, delete it otherwise (broken symlinks, symlinks pointing outside the cleanup directory, targets being backed up...). While deciding this, build a hashref of { symlink_parent (canonical) => symlink_path (non_canonical) }.
3) Add the symlink to the plan in the correct position. To do this, build another 'refined' plan.
   - Go through the pathnames in the plan (visiting parents first), popping each item.
   - If the parent of a marked symlink is found, do the following:
     * mark it as 'delete' if the symlink is going to be deleted, or as 'nothing' if the symlink is not going to be deleted.
     * push the parent in the refined plan.
     * push the symlink in the refined plan.
4) Fix the plan to have consistent state (bubble up states between pairs of directories)
Return the refined plan.
Get the parent path of a given path. This method does not access the disk to determine the parent of the given pathname.
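A purely textual parent lookup can be sketched with the core File::Basename module. The helper name parent_path is hypothetical, and the use of dirname is an assumption about the implementation.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Hypothetical helper, not part of File::CleanupTask.
# dirname works on the string alone, so the path need not exist on disk.
sub parent_path {
    my ($path) = @_;
    return dirname($path);
}
```

Because no disk access is involved, symlinks in the path are not resolved: the result is the textual parent, not necessarily the canonical one.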
Given a path to a symlink and a hash reference, keep the symlink target as a key of the hash reference (canonical path), and the path to the symlink (non canonical) as the corresponding value. Because multiple symlinks can point to the same target, the value of this hashref is an arrayref of symlinks paths.
Returns true on success, or false if a path to something other than a symlink is passed to this method.
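The data structure described above, a hash from canonical target to an arrayref of symlink paths, can be sketched as follows. The helper name remember_symlink is hypothetical, and resolving the target with Cwd::abs_path is an assumption.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical helper, not part of File::CleanupTask.
# Records $symlink under the canonical path of its target.
# Returns 1 on success, 0 if $symlink is not a symlink.
sub remember_symlink {
    my ($symlink, $targets) = @_;
    return 0 unless -l $symlink;

    # abs_path resolves the link; if it yields undef (for instance on a
    # broken link), this sketch falls back to the raw readlink value.
    my $target = abs_path($symlink);
    $target = readlink $symlink unless defined $target;

    # Several symlinks may share one target, hence the arrayref.
    push @{ $targets->{$target} }, $symlink;
    return 1;
}
```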
Refine a pattern passed from the configuration.
Currently applies the following transformation:
- Remove any "/" in case the user has specified a pattern in the form of /pattern/.
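This transformation can be sketched with a single substitution. The helper name refine_pattern is hypothetical, and stripping only one leading and one trailing slash (and only when both are present) is an assumption about the intended behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::CleanupTask.
# Turns "/pattern/" into "pattern"; anything else is returned unchanged.
# Assumption: only the enclosing pair of slashes is removed.
sub refine_pattern {
    my ($pattern) = @_;
    return $pattern unless defined $pattern;
    $pattern =~ s{\A/(.*)/\z}{$1}s;
    return $pattern;
}
```

Slashes inside the pattern are left alone, so a pattern that matches on directory separators keeps working.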
Savio Dimatteo, <savio at lokku.com>
Please report any bugs or feature requests to bug-file-cleanuptask at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=File-CleanupTask. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc File::CleanupTask
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
http://rt.cpan.org/NoAuth/Bugs.html?Dist=File-CleanupTask
AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/File-CleanupTask
CPAN Ratings
http://cpanratings.perl.org/d/File-CleanupTask
Search CPAN
http://search.cpan.org/dist/File-CleanupTask/
Thanks Alex for devising the original format of a .tasks file and offering me the opportunity to publish this work on CPAN.
Thanks Mike for your feedback about canonical paths detection.
Thanks David for reviewing the code.
Thanks #london.pm for helping me choose the name of this module.
Copyright 2012 Savio Dimatteo.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.