hashl - Create database with partial file hashes, check if other files are in it
hashl [-fn] [-d dbfile] [-s read-size] action [args]
This manual documents hashl version 1.00
Actions:
Copy all files in the current directory which are not in the database to newdir.
List all files which are already in the database. Scans directory if given, otherwise the current directory.
List all files which are not in the database. Scans directory if given, otherwise the current directory.
Add all files in directory (or the current directory) as "ignored" to the database. This means that hashl will save the file's hash and skip matching files for copy or find-new.
Show information on file (or the database, if file is not specified).
List all files and their hashes, one entry per line, in the format hash size file.
If regex (a Perl regular expression) is specified, only matching files will be listed.
List all filenames, one file per line.
List ignored hashes.
Update or create hash database. Iterates over all files below the current directory.
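The hash size file listing format described above is easy to post-process. The following is a minimal sketch, not part of hashl: it assumes such a listing arrives on standard input with whitespace-separated fields and a hash that contains no spaces. The function names duplicate_hashes and files_for_hash are hypothetical.

```shell
# Sketch only: helpers for lines in "hash size file" format read from stdin.
# Neither function is provided by hashl; both names are made up for illustration.

# duplicate_hashes: print every hash that occurs on more than one line.
duplicate_hashes() {
    awk '{ print $1 }' | sort | uniq -d
}

# files_for_hash HASH: print the file name of every line whose hash equals HASH.
# The file name is everything after the second field, so embedded spaces survive.
files_for_hash() {
    while read -r hash size file; do
        if [ "$hash" = "$1" ]; then
            printf '%s\n' "$file"
        fi
    done
    return 0
}
```

Two entries with the same hash only share their hashed leading part, so they are probably, but not certainly, the same file.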
Use dbfile instead of .hashl.db
For use with hashl add: If there are ignored files in the directory, unignore and add them.
Do not show progress information. Most useful with hashl find-new.
Change size of the part of each file which is hashed. By default, hashl hashes the first 4 MiB. Note that this option only makes sense when using hashl update to create a new database.
Print version information.
Unless an error occurred, hashl will always return zero.
None, so far
Digest::SHA
Time::Progress
Unknown. This is beta software.
First, create a database of your local files:
cd /media/videos; hashl update
Now, assume you have a (possibly slow) external share mounted at /tmp/mnt/ext. You do not want to copy every file to your disk and then use fdupes or similar to weed out the duplicates. Since you just used hashl to create a database with the hashes of the first 4 MiB of each of your files, you can now use it to check whether you (very probably) already have a given remote file. For that, only the first 4 MiB of every file on the share need to be read, not the whole file. For example:
cd /tmp/mnt/ext; hashl copy /media/videos/incoming
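To illustrate why reading only the beginning of each file is enough, here is a rough sketch of a partial hash in plain shell. It is not hashl's implementation: hashl is a Perl program built on Digest::SHA, and the choice of SHA-1, the function name partial_hash and the use of head -c and sha1sum (as found in GNU coreutils) are all assumptions made for this example.

```shell
# Sketch only, not hashl's code.
# partial_hash FILE [BYTES]: print the SHA-1 digest of the first BYTES bytes of FILE.
# BYTES defaults to 4194304 (4 MiB), the default read size described above.
# Assumes a head that supports -c and a sha1sum binary.
partial_hash() {
    file=$1
    bytes=${2:-4194304}
    head -c "$bytes" -- "$file" | sha1sum | cut -d ' ' -f 1
}
```

Two files that agree in their first 4 MiB yield the same digest here even if they differ later on, which is why a match means "very probably" rather than "certainly" the same file.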
Personally, I have all my videos on an external hard disk, which I usually do not carry with me. So, when I get new videos, I put them into ~/lib/videos on my netbook, and then later copy them to the external disk. Of course, it can always happen that I get a movie I already have, or forget to move something from ~/lib/videos to the external disk, especially since I also always have some stuff from the disk in ~/lib/videos.
However, I can use hashl to conveniently solve this issue. Run periodically:
cd /media/argon; hashl -d ~/lib/video/.argon update
Now, I always have a list of files on the external disk with me. When I get a new file:
hashl -d ~/lib/video/.argon new-file $file
And to find out which files are not on the external disk:
cd ~/lib/video; print -l **/*(.) | hashl -d .argon new-file
Copyright (C) 2010 by Daniel Friesel <derf@finalrewind.org>
0. You just DO WHAT THE FUCK YOU WANT TO.