NAME

DataStore::CAS::Simple - Simple file/directory based CAS implementation

VERSION

version 0.0200

DESCRIPTION

This implementation of DataStore::CAS uses a directory tree in which each filename is the hexadecimal value of a digest hash. The files are placed into directories named with a prefix of the digest hash, which keeps any single directory from accumulating too many entries (a concern only on certain filesystems).

Opening a File returns a real perl filehandle, and copying a File object from one instance to another is optimized by hard-linking the underlying file.

  # This is particularly fast:
  my $cas1= DataStore::CAS::Simple->new( path => 'foo' );
  my $cas2= DataStore::CAS::Simple->new( path => 'bar' );
  $cas1->put( $cas2->get( $hash ) );

This class does not optimize the storage of content in any way: it neither combines common sections of files nor runs common compression algorithms on the data.

TODO: write DataStore::CAS::Compressor or DataStore::CAS::Splitter for those features.

ATTRIBUTES

path

Read-only. The filesystem path where the store is rooted.

digest

Read-only. Algorithm used to calculate the hash values. This can only be set in the constructor when a new store is being created. Default is SHA-1.

fanout

Read-only. An arrayref describing how digest hashes are split into directories. Each number is a count of characters taken from the front of the hash, which then become a directory name.

For example, [ 2, 2 ] would turn a hash of "1234567890" into a path of "12/34/567890".
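The mapping from a fanout list to a path can be sketched in a few lines of core Perl. This is only an illustration of the rule described above; hash_to_path is a hypothetical helper and is not part of DataStore::CAS::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not the module's code):
# split a hex digest into path components according to a fanout list.
sub hash_to_path {
    my ($hash, @fanout) = @_;
    my @parts;
    for my $len (@fanout) {
        # Remove $len characters from the front of what remains.
        push @parts, substr($hash, 0, $len, '');
    }
    # Whatever is left over becomes the file name.
    return join '/', @parts, $hash;
}

print hash_to_path('1234567890', 2, 2), "\n";   # 12/34/567890
print hash_to_path('abcdefg', 1, 2), "\n";      # a/bc/defg
```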

fanout_list

Convenience accessor for @{ $cas->fanout }

copy_buffer_size

Number of bytes to copy at a time when saving data from a filehandle to the CAS. This is a performance hint, and the default is usually fine.
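As a rough picture of what a copy buffer size controls, the sketch below copies a filehandle in fixed-size chunks while feeding each chunk to a digest. It uses only core modules; copy_and_digest is a hypothetical helper, and the module's actual copy loop may differ.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper (not the module's code): copy $in to $out in chunks
# of at most $buffer_size bytes, hashing the data as it passes through.
sub copy_and_digest {
    my ($in, $out, $buffer_size) = @_;
    my $sha = Digest::SHA->new(1);    # 1 selects SHA-1
    while (1) {
        my $got = read($in, my $buf, $buffer_size);
        die "read failed: $!" unless defined $got;
        last if $got == 0;            # end of input
        $sha->add($buf);
        print {$out} $buf or die "write failed: $!";
    }
    return $sha->hexdigest;
}

# Usage with in-memory filehandles and a deliberately tiny buffer.
my $data = 'hello';
my $copy = '';
open my $in,  '<', \$data or die "open: $!";
open my $out, '>', \$copy or die "open: $!";
my $hex = copy_and_digest($in, $out, 2);
close $out or die "close: $!";
print "copied '$copy', digest $hex\n";
```

A larger buffer means fewer read and write calls per file; the digest is the same regardless of chunk size.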

storage_format_version

Hashref of version information about the modules that created the store. Newer library versions can use this information to determine whether the store uses an old format.

_fanout_regex

Read-only. A regex-ref which splits a digest hash into the parts needed for the path name. A fanout of [ 2, 2 ] creates a regex of /(.{2})(.{2})(.*)/.
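One way such a regex could be assembled from a fanout list is sketched below. build_fanout_regex is a hypothetical name used for illustration; the module's internal construction is not shown in this document and may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: build a splitting regex from a fanout list.
# Each fanout entry becomes one fixed-width capture group, and a final
# group captures the remainder of the hash.
sub build_fanout_regex {
    my @fanout = @_;
    my $pattern = join '', map { sprintf '(.{%d})', $_ } @fanout;
    return qr/${pattern}(.*)/;
}

my $re    = build_fanout_regex(2, 2);
my @parts = '1234567890' =~ $re;       # captures in list context
print join('/', @parts), "\n";         # 12/34/567890
```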

METHODS

new

  $class->new( \%params | %params )

Constructor. It will load (and possibly create) a CAS Store.

If create is specified, and path refers to an empty directory, a fresh store will be initialized. If create is specified and the directory is already a valid CAS, create is ignored, as well as digest and fanout.

path points to the CAS directory. Trailing slashes don't matter. You might want to use an absolute path in case you chdir later.

copy_buffer_size initializes the respective attribute.

The digest and fanout attributes can only be initialized if the store is being created. Otherwise, they are loaded from the store's configuration.

ignore_version allows you to load a Store even if it was created with a newer version of the DataStore::CAS::Simple package than the one you are now using (or with a different package entirely).

create_store

  $class->create_store( %configuration | \%configuration )

Create a new store at a specified path. Configuration must include path, and may include digest and fanout. path must be an existing, empty, writeable directory. digest currently defaults to SHA-1. fanout currently defaults to [1, 2], resulting in paths like "a/bc/defg".

This method can be called on classes or instances.

You may also specify create => 1 in the constructor to implicitly call this method using the relevant parameters you supplied to the constructor.
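A minimal sketch of creating a store through the constructor and round-tripping one value, assuming DataStore::CAS::Simple is installed. It uses only parameters and methods named in this document; the temporary directory is an assumption made so the example can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use DataStore::CAS::Simple;

# The store directory must already exist and be empty.
my $dir = tempdir( CLEANUP => 1 );

my $cas = DataStore::CAS::Simple->new(
    path   => $dir,
    create => 1,          # initialize a fresh store in the empty directory
    digest => 'SHA-1',    # only used because the store is being created
    fanout => [ 1, 2 ],   # likewise
);

my $hash = $cas->put_scalar('hello world');
my $file = $cas->get($hash) or die "content not found";

# open_file returns a real Perl filehandle.
my $fh      = $cas->open_file($file);
my $content = do { local $/; <$fh> };
print "$hash => $content\n";
```

Re-running the constructor against the same directory would load the existing store, in which case create, digest and fanout are ignored as described above.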

get

See "get" in DataStore::CAS for details.

new_write_handle

See "new_write_handle" in DataStore::CAS for details.

commit_write_handle

See "commit_write_handle" in DataStore::CAS for details.

put

See "put" in DataStore::CAS for details.

put_scalar

See "put_scalar" in DataStore::CAS for details.

put_file

See "put_file" in DataStore::CAS for details. In particular, heed the warnings about using the 'hardlink' and 'reuse_hash' flags.

DataStore::CAS::Simple has special support for the 'hardlink' flag. If your source is a real file, or an instance of DataStore::CAS::File from another DataStore::CAS::Simple, { hardlink => 1 } will link to the file instead of copying it.
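The operating-system mechanism behind this flag can be seen with Perl's built-in link(). The sketch below does not use the module; it only demonstrates, on a filesystem that supports hard links, that a linked file shares its inode with the source rather than holding a copy. File names here are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

my $dir = tempdir( CLEANUP => 1 );
my ($src, $dst) = ("$dir/source", "$dir/linked");

open my $out, '>', $src or die "open $src: $!";
print {$out} "some content\n";
close $out or die "close $src: $!";

# link() adds a second directory entry for the same inode; no data is copied.
link($src, $dst) or die "link failed (filesystem may not support it): $!";

my ($src_inode, $src_nlink) = (stat $src)[1, 3];
my ($dst_inode)             = (stat $dst)[1];
print "same inode: ", ($src_inode == $dst_inode ? "yes" : "no"), "\n";
print "link count: $src_nlink\n";
```

Because both names refer to the same data, writing to the source file afterwards also changes what the other name sees, which is why the warnings about 'hardlink' matter.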

validate

See "validate" in DataStore::CAS for details.

open_file

See "open_file" in DataStore::CAS for details.

iterator

See "iterator" in DataStore::CAS for details.

delete

See "delete" in DataStore::CAS for details.

FILE OBJECTS

File objects returned by DataStore::CAS::Simple have two additional attributes:

local_file

The filename of the disk file within DataStore::CAS::Simple's path which holds the requested data.

block_size

The preferred I/O block size reported by stat() (the blksize field), which might be useful for accessing the file efficiently.
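For reference, the corresponding value can be read directly from Perl's stat() as element 11 of the returned list. The sketch below reads a temporary file in chunks of that size; the 4096 fallback is an assumption for platforms that do not report a block size.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

my ($fh, $filename) = tempfile( UNLINK => 1 );
print {$fh} 'x' x 10_000;
close $fh or die "close: $!";

# Element 11 of stat() is the preferred I/O block size; it may be empty
# on some platforms, hence the fallback value.
my $block_size = (stat $filename)[11] || 4096;

open my $in, '<:raw', $filename or die "open: $!";
my $total = 0;
while (my $got = read($in, my $buf, $block_size)) {
    $total += $got;
}
close $in or die "close: $!";
print "read $total bytes in chunks of up to $block_size\n";
```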

AUTHOR

Michael Conrad <mconrad@intellitree.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Michael Conrad, and IntelliTree Solutions llc.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.