NAME

BerkeleyDB::Lite - Simplified Interface to BerkeleyDB

SYNOPSIS

  use BerkeleyDB::Lite;

## Example 1

  ## Create a Hashed database
  my $db = new BerkeleyDB::Lite::Hash
                home => 'zoo',
                filename => 'residents' ;

  $db->{Samson} = new Primate ;
  $db->{Cornelius} = new Primate ;
  $db->{Kaa} = new Reptile ;

## Example 2

  ## Create a Btree database allowing duplicates and scalar values
  my $types = scalars BerkeleyDB::Lite::Btree
                home => 'zoo',
                filename => 'types',
                &duplicatekeys ;

  $types->{primate} = 'Samson' ;
  $types->{primate} = 'Cornelius' ;
  $types->{reptile} = 'Kaa' ;

  printf "%s\n", join ' ', $types->recordset('primate') ;
  ## prints: Samson Cornelius

  $types->delete( primate => 'Samson' ) ;
  printf "%s\n", join ' ', $types->recordset('primate') ;
  ## prints: Cornelius

## Example 3

  ## Create a database of visitors
  ## Use a table with arbitrary keys
  ## Track visitors by date/timestamp

  $tickets = new BerkeleyDB::Lite::Btree
                home => 'zoo',
                filename => 'tickets',
                &incrementkeys ;

  ## Lexical Alternative
  # $tickets = lexical BerkeleyDB::Lite::Btree
  #             home => 'zoo',
  #             filename => 'tickets' ;

  $bytime = scalars BerkeleyDB::Lite::Btree
                home => 'zoo',
                filename => 'ticketsbytime',
                &duplicatekeys ;

  ## Process a new visitor in real time
  sub newvisitor {
        my $serial = $tickets->nextrecord() ;
        my $date = getdate() ;  ## not part of BerkeleyDB::Lite
        my $time = gettime() ;  ## not part of BerkeleyDB::Lite

        $tickets->{$serial} = { @_ } ;
        $bytime->{ "$date $time" } = $serial ;
        return $serial ;
        }

  ## Get a list of visitors on a certain date
  sub showvisitorsbydate {
        my $date = shift ;
        return $bytime->matchingvalues( $date ) ;
        }

DESCRIPTION

BerkeleyDB::Lite is an interface to Paul Marquess's BerkeleyDB that provides simplified constructors, tied access to data, and methods for returning multiple record sets.

Example 1

BerkeleyDB::Lite maintains BerkeleyDB environment references in a package variable hash keyed on the home argument. The basic BerkeleyDB::Lite constructor arguments define the BerkeleyDB environment and database. When the constructor is called, a previously opened environment is used if available. Otherwise, a new environment is created and is available to future constructor requests.
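The caching behavior described above can be sketched roughly as follows. This is not the module's source: `%environments` and `environment_for` are hypothetical names, and a plain hash reference stands in for a real environment object.

```perl
use strict;
use warnings;

# Hypothetical cache: one environment handle per home directory.
my %environments;

sub environment_for {
    my ($home) = @_;
    # Reuse the handle if this home was opened before; otherwise create one.
    # A plain hash reference stands in for a real environment object.
    return $environments{$home} ||= { home => $home };
}

my $first  = environment_for('zoo');
my $second = environment_for('zoo');
print $first == $second ? "same environment\n" : "different environments\n";
```

Two constructors called with the same home argument would, under this sketch, share one environment.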

This version of BerkeleyDB::Lite creates all environment objects as concurrent data stores. Transactional data storage is not currently integrated.

By default, BerkeleyDB::Lite is designed to marshall objects into a database using the Storable module.

Example 1 shows a simple application that illustrates both of these features. The constructor is called with the minimum arguments to identify the environment and the database.

These few lines of code are sufficient to add persistent object support to an application.
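For readers unfamiliar with Storable, the following sketch shows the kind of round trip that marshalling implies. It calls Storable directly and does not touch a database; the Primate class here is a stand-in defined only for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Stand-in class for illustration only.
package Primate { sub new { my $class = shift; return bless { @_ }, $class } }

my $samson = Primate->new( name => 'Samson' );
my $frozen = freeze($samson);   # byte string suitable for storing as a record value
my $copy   = thaw($frozen);     # a new object rebuilt from that byte string

printf "%s %s\n", ref($copy), $copy->{name};
## prints: Primate Samson
```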

Example 2

One of Berkeley DB's most appealing features is support for duplicate keys. This feature enables a programmer to use persistent arrays, where elements can be accessed, added, and deleted without marshalling.

Example 2 uses the scalars constructor which disables the automatic serialization of record access. Otherwise, if the new constructor is used, scalars will be returned as scalar references, regardless of how they are stored.

&duplicatekeys is a subroutine that returns a pair of constants as a shortcut. The constants are defined in the BerkeleyDB module.

The recordset method returns a stored list from the database. This method is available to both BerkeleyDB::Lite::Btree and BerkeleyDB::Lite::Hash classes.

The delete method is used to delete an element from the list. Since BerkeleyDB::Lite adheres to the Tie interface, Perl's built-in delete function can normally be used to remove stored objects. The delete method should be used on databases with duplicate keys to avoid indeterminate results.

BerkeleyDB returns the status of a delete operation. This feature can be used to delete an entire list using the following idiom:

  while ( ! delete $types->{primate} ) {}

A BerkeleyDB database configured for duplicate keys also allows duplicate key/value pairs. For most one-to-many data sets, key/value pairs should be unique. This issue has not been completely resolved. Presently, the workaround is to import a retrieved list into a hash structure:

  %unique = map { $_ => 1 } $types->recordset('primate') ;
  keys %unique ;

However, care should be taken when deleting elements. The delete method for duplicate keys should almost always be invoked using an idiom similar to the one above:

  while ( ! $types->delete( primate => 'Samson' ) ) {}

Another source of problems occurs when using the delete method on databases containing objects. In this case, the second argument may refer to an object that does not exactly match the stored value. The following code illustrates this difficulty:

  my $cats = new BerkeleyDB::Lite::Btree(
                home => 'zoo',
                filename => 'cats',
                &duplicatekeys,
                ) ;

  my $Felix = new BigCat dinner => 'antelope' ;
  $cats->{lion} = $Felix ;
  $Felix->{dinner} = 'gazelle' ;
  $cats->delete( lion => $Felix ) ;             ## fails
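One plausible reading of this failure, assuming the delete method compares the serialized form of its second argument against the stored record (an assumption, not confirmed here), can be reproduced with Storable alone:

```perl
use strict;
use warnings;
use Storable qw(freeze);

my $felix  = bless { dinner => 'antelope' }, 'BigCat';
my $stored = freeze($felix);      # what would have been written to the database

$felix->{dinner} = 'gazelle';     # the in-memory object changes afterwards
my $current = freeze($felix);

print $stored eq $current ? "match\n" : "no match\n";
## prints: no match
```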

This problem also occurs because the results of the marshalling operation differ depending on whether numbers are interpreted as integers, floats, or strings. Thus an object's value may change merely as a result of its context. The following example illustrates the situation:

  $weight = '300 lbs.' ;
  $weight =~ s/\D//g ;
  my $Felix = new BigCat( weight => $weight ) ; ## member as string
  $cats->{lion} = $Felix ;
  $cats->delete( lion => $Felix )               ## operation fails
                if $Felix->{weight} > 200 ;     ## member as integer 

Example 3

Example 3 shows a few additional features helpful to developers accustomed to relational databases. These features take advantage of the Btree database capabilities, and are not available to BerkeleyDB::Lite::Hash objects.

The nextrecord method of BerkeleyDB::Lite::Btree returns a new unique key. Each nextrecord call creates a new blank record to avoid race conditions, and returns the new key. This method creates a key by adding 1 to the last record. In order to ensure that the last record contains the highest valued key, use the &incrementkeys argument to the BerkeleyDB::Lite::Btree constructor. The &incrementkeys function is a shortcut that returns a CODE constant that forces numerical Btree sorting.
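The need for numerical sorting can be seen without a database. The sketch below uses plain Perl lists to show that, under string comparison, the last key is not the highest one, so adding 1 to it would produce a key that already exists.

```perl
use strict;
use warnings;

my @keys = ( 1 .. 12 );

my @lexical = sort @keys;                  # default string comparison
my @numeric = sort { $a <=> $b } @keys;    # numeric comparison

print "lexical last key: $lexical[-1]\n";  ## prints: lexical last key: 9
print "numeric last key: $numeric[-1]\n";  ## prints: numeric last key: 12
```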

There is a significant disadvantage to databases created using the &incrementkeys argument. The resulting databases are incompatible with Sleepycat utilities such as db_dump and db_verify. As an alternative, nextrecord can be called as a method from the BerkeleyDB::Lite::Btree::Lexical subclass. This subclass functions identically, but the numerical keys are stored as zero-padded strings. Therefore, a restriction on Lexical subclass databases is that keys must be numerically less than 10,000,000,000.
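Zero padding makes the default string ordering agree with numeric ordering. The sketch below assumes a width of 10 digits, which is inferred from the stated limit rather than taken from the module; padded_key is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: pad a serial number to 10 digits (width is an assumption).
sub padded_key {
    my ($n) = @_;
    return sprintf '%010d', $n;
}

my @padded = map { padded_key($_) } ( 100, 9, 10 );
my @sorted = sort @padded;    # plain string comparison

print "$_\n" for @sorted;
## prints:
## 0000000009
## 0000000010
## 0000000100
```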

The lexical constructor to the BerkeleyDB::Lite::Btree class is synonymous with the new constructor to the BerkeleyDB::Lite::Btree::Lexical subclass.

BerkeleyDB::Lite also implements another nice Berkeley DB feature: partial string matching. The methods matchingkeys, matchingvalues, and searchset all return a set of records whose keys begin with a common substring.

For example, if keys are defined with the following format: "2002 Jul 14 15:30", the following data can be returned:

  ## All records for the year
  @annually = $bytime->matchingkeys('2002 ') ;

  ## All records for the month
  @monthly = $bytime->matchingvalues('2002 Jul ') ;

  ## All records for the day
  %daily = $bytime->searchset('2002 Jul 14 ') ; 

matchingkeys returns an array of the matching records' keys. matchingvalues returns an array of the matching records' values. The name matchingvalues can be confusing: the records are selected by matching keys, but their values are returned.
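The difference between the two methods can be illustrated with an in-memory emulation. Nothing below calls BerkeleyDB::Lite: @records and the matching helper are hypothetical stand-ins for the $bytime database and its prefix search.

```perl
use strict;
use warnings;

# In-memory stand-in for the $bytime database: timestamp => ticket serial.
my @records = (
    [ '2002 Jul 13 09:05' => 101 ],
    [ '2002 Jul 14 15:30' => 102 ],
    [ '2002 Jul 14 16:10' => 103 ],
    [ '2003 Jan 02 11:00' => 104 ],
);

# Hypothetical helper: records whose key begins with $prefix.
sub matching {
    my ($prefix) = @_;
    return grep { index( $_->[0], $prefix ) == 0 } @records;
}

my @keys   = map { $_->[0] } matching('2002 Jul 14 ');   # like matchingkeys
my @values = map { $_->[1] } matching('2002 Jul 14 ');   # like matchingvalues

print "keys:   @keys\n";     ## prints: keys:   2002 Jul 14 15:30 2002 Jul 14 16:10
print "values: @values\n";   ## prints: values: 102 103
```

In both calls the prefix is compared against keys only; the values are never searched.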

searchset returns the matching records as key/value pairs that can populate an associative array as shown. However, using an associative array is pointless if the database contains duplicate keys. The following code is an effective technique for capturing the results of this type of search:

    foreach ( $bytime->matchingkeys( '2002 Jul 14', &uniquekeys ) ) {
        $daily{ $_ } = [ $bytime->recordset( $_ ) ] ;
        }

&uniquekeys returns a constant that is used primarily as an argument to the matchingkeys method to filter duplicate results from the database. When this argument is passed to the searchset method, the values in the returned key/value pairs are record counts. &uniquekeys cannot be used with the matchingvalues method.
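The record-count result described for searchset resembles the usual Perl counting idiom. The sketch below emulates it on a plain list of keys, which stands in for what a duplicate-key database might return; it does not call the module.

```perl
use strict;
use warnings;

# Keys as they might come back from a duplicate-key database.
my @keys = ( 'primate', 'primate', 'reptile' );

my %count;
$count{$_}++ for @keys;    # unique key => number of records

print "$_: $count{$_}\n" for sort keys %count;
## prints:
## primate: 2
## reptile: 1
```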

EXPORT

&duplicatekeys &incrementkeys &uniquepairs &uniquekeys

AUTHOR

Jim Schueler, <jschueler@tqis.com>

SEE ALSO

Storable, BerkeleyDB, http://www.sleepycat.com