NAME

Metadata::ByInode - Extend metadata in relation to a file's inode using a database.

SYNOPSIS

        use Metadata::ByInode;
        
        my $mbi = Metadata::ByInode->new({ abs_dbfile => '/home/myself/mbi.db' });
        
        # index files for quick lookup
        $mbi->index('/home/myself/photos/family');

        # lookup a file by filename and location
        my $results = 
                $mbi->search({ 
                        abs_loc => '/home/myself/photos/family', 
                        filename => 'ralph' 
                });

DESCRIPTION

This is primarily meant as support for an indexer. Ideally, the indexer looks at a slice of the filesystem, makes some deductions about the files, and saves that info. You can also use this module bare bones to set and get data on any files in the system.

The indexer is a module that inherits from this one.

SEE ALSO

Metadata::ByInode::Indexer

METHODS

new()

Arguments are:

dbh

(optional) existing database handle, otherwise DBD::SQLite is used

abs_dbfile

(optional, required if you don't pass an open dbh) absolute path to the sqlite file, which will be created if not found.

Example usage:

        # without arguments
        my $mbi = Metadata::ByInode->new;

        # or with the path to an sqlite file
        $mbi = Metadata::ByInode->new({
                abs_dbfile => '/home/myself/mystuff.db'
        });

NOTE ON dbh

If you do not pass a dbh, the dbh is opened using DBD::SQLite at the path given in the abs_dbfile argument. The object will take care of commit and disconnect for you.

If you *do* pass a dbh, we do not automatically commit and disconnect on DESTROY. What to do with it is up to you, whether you set autocommit or need to commit later.

_finish_open_handles()

Finishes all the prepared statement handles we opened, then commits. Returns the number of prepared handles closed.

_setup_db()

Called automatically when using sqlite and the database file did not exist and has just been created. The table is:

        CREATE TABLE IF NOT EXISTS metadata (
                inode INTEGER(10) NOT NULL,
                mkey VARCHAR(50) NOT NULL,
                mvalue TEXT,
                PRIMARY KEY (inode,mkey)
        );

In a previous version, the mkey column was named 'key', but this caused problems in MySQL.
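
As a rough illustration of how this table behaves, the sketch below creates it in an in-memory SQLite database through DBI and stores two keys for one inode. This is not the module's own code: the helper names store_meta and fetch_meta, and the use of INSERT OR REPLACE, are assumptions made for the example. It needs DBI and DBD::SQLite installed.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database, so nothing is written to disk.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
        { RaiseError => 1, AutoCommit => 1 });

$dbh->do(q{
        CREATE TABLE IF NOT EXISTS metadata (
                inode INTEGER(10) NOT NULL,
                mkey VARCHAR(50) NOT NULL,
                mvalue TEXT,
                PRIMARY KEY (inode,mkey)
        )
});

# Hypothetical helper: one row per (inode, mkey) pair. Because of the
# primary key, storing the same key again replaces the old value.
sub store_meta {
        my ($dbh, $inode, $key, $value) = @_;
        $dbh->do(
                'INSERT OR REPLACE INTO metadata (inode, mkey, mvalue) VALUES (?,?,?)',
                undef, $inode, $key, $value);
        return;
}

# Hypothetical helper: returns the stored value, or undef if there is no row.
sub fetch_meta {
        my ($dbh, $inode, $key) = @_;
        my ($value) = $dbh->selectrow_array(
                'SELECT mvalue FROM metadata WHERE inode = ? AND mkey = ?',
                undef, $inode, $key);
        return $value;
}

store_meta($dbh, 1235, 'client', 'joe');
store_meta($dbh, 1235, 'client', 'hey');     # replaces 'joe'
store_meta($dbh, 1235, 'size',   'medium');

print "client: ", fetch_meta($dbh, 1235, 'client'), "\n";
```

Since each row is keyed on inode and mkey together, a file can carry any number of keys, but only one value per key.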

_reset_db()

Resets the database: drops and recreates the metadata table.

dbh()

Returns the open database handle. If you did not pass an open database handle to the constructor, this expects that you passed the absolute path of the sqlite database file to use. If the file does not exist, it will be created and set up.

GET AND SET METHODS

There is a distinguishing difference between the get() and the set() methods. The get() methods simply query the database. You can get metadata for a file that is no longer on disk.

The set() methods, however, do NOT let you set metadata for a file that is not on disk. This asymmetry is on purpose: if you use this for some kind of logging, you can still get the history of files that are gone.

Again:

You can get() metadata for files no longer on disk. You can NOT set() metadata for files not on disk.

If you are using the default indexer in this distribution, files no longer on disk are automatically taken out of the metadata database.

set()

Sets metadata for a file. First argument is abs_path or inode. Second argument is a hash ref of keys and values.

        $mbi->set('/path/to/what',{ client => 'joe' });
        $mbi->set(1235,{ client => 'hey', size => 'medium' });
        

get()

First argument is the inode number or the absolute path to the file. Second argument is the metadata key to look up.

Returns undef if no metadata is found.

        $mbi->get('/path/to/file','description');
        $mbi->get(1235,'description');

If the stored value is 0, returns 0 (not undef).

get_all()

Returns hash with all metadata for one file. First argument is abs_path or inode.

        my $meta = $mbi->get_all('/path/to/this');

        my $meta = $mbi->get_all(1245);

Please note: get() methods do NOT check for file existence, they just query the database for information.

NOTE ABOUT get() AND set()

get() methods do NOT test for file existence on disk! They just try to fetch the data from the database.

However, if you use a set() method and your file argument is not an inode, that is, if you try to set() metadata and you specify an absolute path, then we DO test for file existence.

You cannot set() metadata for files that are not on disk.

You *can* query for metadata for files that are NOT on disk.

INTERNAL METHODS

_search_inode()

Gets the inode from the database.

Argument is an absolute path. Looks it up in the database to see if we can resolve it to an inode.

If the path provided does not match up with our entries, returns undef. This would mean no metadata matches this path.

If the argument provided is all digits, assumes this *is* an inode and returns it.

Croaks if the argument is not an inode and we can't split it into an absolute path and filename.

_get_inode()

To get the inode from disk.

Takes argument and tries to return inode. Argument can be absolute file path. If argument is an inode, returns same value. If argument is word chars, tries to stat for inode. Returns undef if absolute path not on disk.
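
The disk lookup described above can be approximated with Perl's built-in stat. This is a minimal sketch, not the module's actual implementation; inode_of is a hypothetical name, and treating an all-digit argument as an inode follows the description above.

```perl
use strict;
use warnings;

# Hypothetical helper, loosely following the description of _get_inode().
sub inode_of {
        my ($arg) = @_;
        return undef unless defined $arg;

        # An all-digit argument is taken to be an inode already.
        return $arg if $arg =~ /^\d+$/;

        # stat() returns an empty list if the path is not on disk;
        # the inode number is the second field of the result.
        my @st = stat $arg;
        return @st ? $st[1] : undef;
}

# Look up the inode of this script itself, if it exists as a file.
my $inode = inode_of(__FILE__);
printf "%s has inode %s\n", __FILE__, defined $inode ? $inode : 'unknown';
```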

DESTROY() METHODS

The destructor will close open db handles, and commit changes. If the dbh was passed to the constructor, this will not happen and it is up to you to deal with your database settings (autocommit etc).

CAVEATS

Note that all paths are resolved for symlinks.
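
In practice this means a symlink and its target end up referring to the same inode. The sketch below shows the effect using Cwd::abs_path and stat from core Perl. Whether the module resolves paths with Cwd is an assumption; the file names are made up for the example.

```perl
use strict;
use warnings;
use Cwd ();
use File::Temp ();

# Work in a temporary directory; resolve it first in case the
# temporary directory itself sits behind a symlink.
my $dir  = Cwd::abs_path(File::Temp::tempdir(CLEANUP => 1));
my $real = "$dir/real.txt";
my $link = "$dir/link.txt";

open my $fh, '>', $real or die "cannot create $real: $!";
close $fh or die "cannot close $real: $!";
symlink $real, $link or die "cannot create symlink: $!";

# abs_path() resolves the symlink to the real file's path.
my $resolved = Cwd::abs_path($link);

# stat() follows symlinks, so both names report the same inode.
my $inode_real = (stat $real)[1];
my $inode_link = (stat $link)[1];

print "link resolves to the real path\n" if $resolved eq $real;
print "link and target share an inode\n" if $inode_link == $inode_real;
```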

PROS AND CONS

PROS

The inode is very stable in a unix filesystem.

If the file is moved within the filesystem (within the same partition), the inode does not change. If you overwrite the file with a copy command, the target file's inode does not change. If you rename the file, the inode does not change.

If you are indexing large amounts of data, you can back up, and if you restore by copying over the existing files, the inodes of the target files do not change.
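
These claims can be checked with a few lines of core Perl. The following is a sketch under the assumption of a unix filesystem; the file names and the write_file and inode helpers are hypothetical.

```perl
use strict;
use warnings;
use File::Copy ();
use File::Temp ();

my $dir = File::Temp::tempdir(CLEANUP => 1);

# Hypothetical helper: create or overwrite a file with some text.
sub write_file {
        my ($path, $text) = @_;
        open my $fh, '>', $path or die "cannot write $path: $!";
        print {$fh} $text;
        close $fh or die "cannot close $path: $!";
        return;
}

# Hypothetical helper: the inode number is the second field of stat().
sub inode {
        my ($path) = @_;
        return (stat $path)[1];
}

write_file("$dir/a.txt", "original\n");
my $before = inode("$dir/a.txt");

# Rename within the same filesystem: the inode stays the same.
rename "$dir/a.txt", "$dir/b.txt" or die "rename failed: $!";
my $after_rename = inode("$dir/b.txt");

# Overwrite b.txt with a copy of another file: the target keeps its inode.
write_file("$dir/new.txt", "replacement\n");
File::Copy::copy("$dir/new.txt", "$dir/b.txt") or die "copy failed: $!";
my $after_copy = inode("$dir/b.txt");

printf "before: %d, after rename: %d, after copy: %d\n",
        $before, $after_rename, $after_copy;
```

All three numbers printed should be equal, while new.txt has an inode of its own.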

CONS

If you move the file to another filesystem (to another disk, to another partition) the inode of the file changes.

BUGS

Please contact AUTHOR.

AUTHOR

Leo Charre <leo@leocharre.com>