NAME
    DBIx::Table::TestDataGenerator - Automatic test data creation, cross
    DBMS

VERSION
    Version 0.0.1

SYNOPSIS
        use DBIx::Table::TestDataGenerator;

        my $generator = DBIx::Table::TestDataGenerator->new(
                dbh                => $dbi_database_handle,
                schema             => $schema_name,
                table              => $target_table_name,
        );

        #simple usage:
        $generator->create_testdata(
            target_size    => $target_size,
            num_random     => $num_random,
            seed           => $seed,
        );

        #extended usage handling a self-reference of the target table:
        $generator->create_testdata(
                target_size    => $target_size,
                num_random     => $num_random,
                seed           => $seed,
                max_tree_depth => $max_tree_depth,
                min_children   => $min_children,
                min_roots      => $min_roots,
        );


        #instantiation using a custom DBMS handling class
        #(separate variable name, so the earlier $generator is not redeclared)
        my $custom_generator = DBIx::Table::TestDataGenerator->new(
                dbh                => $dbi_database_handle,
                schema             => $schema_name,
                table              => $target_table_name,
                custom_probe_class => $custom_probe_class_name,
        );

DESCRIPTION
    There is often the need to create test data in database tables, e.g. to
    test database client performance. The existence of constraints on a
    table makes it non-trivial to come up with a way to add records to it.

    The current module inspects the table's constraints and adds a desired
    number of records. The values of the fields either come from the table
    itself (possibly incremented to satisfy uniqueness constraints) or from
    tables referenced by foreign key constraints. For a user-defined number
    of initial records, the copied values are selected randomly from the
    database. After that, they are selected randomly from a cache, which
    reduces database traffic and improves performance. The user can define
    a seed for the randomization in order to reproduce a test run. A nice
    side effect of constructing records this way is that, at least at first
    sight, the added data looks like real data, or at least as real as the
    data initially present in the table.
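
    The interplay of fresh random picks, the cache and the seed can be
    illustrated with a minimal sketch in plain Perl. It is not the module's
    implementation: pick_values and @referenced_ids are hypothetical names,
    and an array stands in for the referenced database table.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "select a random value from a referenced table".
# In a real run this would be a database query; here it is an array.
my @referenced_ids = (10, 20, 30, 40, 50);

sub pick_values {
    my (%args) = @_;
    my ($count, $num_random, $seed) = @args{qw(count num_random seed)};

    srand($seed) if defined $seed;    # same seed, same sequence of picks

    my (@cache, @picked);
    for my $i (1 .. $count) {
        my $value;
        if ($i <= $num_random) {
            # "Fresh" pick: stands in for a round trip to the database.
            $value = $referenced_ids[ int rand @referenced_ids ];
            push @cache, $value;
        }
        else {
            # Later picks reuse cached values, avoiding database traffic.
            $value = $cache[ int rand @cache ];
        }
        push @picked, $value;
    }
    return \@picked;
}

my $first  = pick_values(count => 8, num_random => 3, seed => 42);
my $second = pick_values(count => 8, num_random => 3, seed => 42);
print "@$first\n";
print "@$second\n";    # same line again, because the seed is the same
```

    In this sketch, every value after the first three comes from the cache,
    and rerunning with the same seed reproduces the same sequence.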

    A main goal of the module is to reduce configuration to the absolute
    minimum by automatically determining information about the target table,
    in particular its constraints. Another goal is to support as many DBMSs
    as possible. Currently Oracle, PostgreSQL and SQLite are supported, and
    further DBMSs are in the works. One can add further databases or change
    the default behaviour by writing a class satisfying the role defined in
    DBIx::Table::TestDataGenerator::TableProbe. NOTE: A major refactoring
    is on its way, see section FURTHER DEVELOPMENT.

    In the synopsis, an extended usage has been mentioned. This refers to
    the common case of having a self-reference on a table, i.e. a one-column
    wide foreign key of a table to itself where the referenced column
    constitutes the primary key. Such a parent-child relationship defines a
    rootless tree and when generating test data it may be useful to have
    some control over the growth of this tree. One such case is when the
    parent-child relation represents a navigation tree and a client
    application processes this structure. In this case, one would like to
    have a meaningful, balanced tree structure since this corresponds to
    real-world examples. To control tree creation, the parameters
    max_tree_depth, min_children and min_roots are provided. Note that
    nodes are added in a depth-first manner.
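
    To make the tree terminology concrete, here is a small sketch in plain
    Perl. It uses the definition of a root element given for min_roots below
    (parent id is null or equal to the record's own id) and counts root
    nodes as depth 1. The hash %parent_of and the functions is_root and
    depth are hypothetical and only illustrate the terms; they are not part
    of the module.

```perl
use strict;
use warnings;

# Hypothetical rows of a self-referencing table: id => parent_id.
my %parent_of = (
    1 => undef,    # root: parent id is NULL
    2 => 2,        # root: parent id equals own id
    3 => 1,
    4 => 3,
    5 => 2,
);

# A record is a root if its parent id is undefined or equal to its own id.
sub is_root {
    my ($id) = @_;
    my $parent = $parent_of{$id};
    return !defined $parent || $parent == $id;
}

# Depth of a node, counting root nodes as depth 1.
sub depth {
    my ($id) = @_;
    my $depth = 1;
    until (is_root($id)) {
        $id = $parent_of{$id};
        $depth++;
    }
    return $depth;
}

my @roots = sort { $a <=> $b } grep { is_root($_) } keys %parent_of;
print "roots: @roots\n";                      # roots: 1 2
print "depth of node 4: ", depth(4), "\n";    # depth of node 4: 3
```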

SUBROUTINES/METHODS
  new
    Arguments:

    *   dbh: required DBI database handle

    *   schema: optional database schema name

    *   table: required name of the target table

    *   custom_probe_class: optional custom probe class name

    Return value:

    a new TestDataGenerator object

    Creates a new TestDataGenerator object. If the DBMS in question does not
    support the concept of a schema, the corresponding argument may be
    omitted. To support a DBMS not currently supported by
    DBIx::Table::TestDataGenerator, or to change the behaviour of the
    TableProbe class responsible for handling a DBMS, one may provide the
    optional custom_probe_class parameter. Its value is the name of a custom
    class implementing the TableProbe role.

  dbh
    Accessor for the DBI database handle.

  schema
    Accessor for the database schema name.

  table
    Accessor for the name of the target table.

  custom_probe_class
    Accessor for the name of a custom class implementing the TableProbe
    role.

  create_testdata
    This is the main method. It creates and adds new records to the target
    table. If one of the arguments max_tree_depth, min_children or
    min_roots is provided, the other two must be provided as well.

    Arguments:

    *   target_size

        The target number of rows to be reached.

    *   num_random

        The first num_random records use fresh random choices for their
        values, taken from tables referenced by foreign key relations or
        from the target table itself. These values are stored in a cache and
        reused for all remaining records. Note that even for the remaining
        records there is some randomness, since the combination of cached
        values coming from columns involved in different constraints is
        random.

    *   seed

        This value must be an integer. If it is provided, the random
        selections done by the Perl code as well as those done by the
        database (where supported, e.g. not for SQLite) are seeded by this
        value or by a value derived from it (PostgreSQL, for example,
        accepts only floating-point numbers between 0 and 1). This allows
        for reproducible test runs.

    *   max_tree_depth

        In case of a self-reference, the maximum depth at which new records
        will be inserted. The minimum value for this parameter is 2.

    *   min_children

        In case of a self-reference, the minimum number of children each
        handled parent node will get. A possible exception is the last
        handled parent node if the execution stops before $min_children
        child nodes have been added to it.

    *   min_roots

        In case of a self-reference, the minimum number of root elements
        existing after completion of the call to create_testdata. A record
        is considered to be a root element if the corresponding parent id is
        null or equal to the child id.

    Returns:

    Nothing, only called for the side-effect of adding new records to the
    target table. (This may change, see the section FURTHER DEVELOPMENT.)
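
    The rules stated above for the arguments (the three tree parameters
    belong together, max_tree_depth is at least 2, seed is an integer) can
    be checked before calling create_testdata. The following is a sketch
    only: check_args is a hypothetical helper, not part of the module, and
    how the module itself reacts to invalid arguments is not documented
    here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: returns a list of problems
# found in a set of create_testdata arguments (empty list if none).
sub check_args {
    my (%args) = @_;
    my @problems;

    # The three tree parameters must be given together or not at all.
    my @tree_keys = qw(max_tree_depth min_children min_roots);
    my $given     = grep { defined $args{$_} } @tree_keys;
    push @problems, 'tree parameters must be given together'
        if $given != 0 && $given != @tree_keys;

    push @problems, 'max_tree_depth must be at least 2'
        if defined $args{max_tree_depth} && $args{max_tree_depth} < 2;

    push @problems, 'seed must be an integer'
        if defined $args{seed} && $args{seed} !~ /\A-?\d+\z/;

    return @problems;
}

my @ok  = check_args(target_size => 1000, num_random => 50, seed => 7);
my @bad = check_args(target_size => 1000, num_random => 50,
                     max_tree_depth => 1);
print scalar(@ok), " problem(s) in the first call\n";
print "second call: $_\n" for @bad;
```

    The second call is reported twice in this sketch: once because
    min_children and min_roots are missing, and once because the depth is
    below the documented minimum.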

INSTALLATION AND CONFIGURATION
    To install this module, run the following commands:

            perl Build.PL
            ./Build
            ./Build test
            ./Build install

    When installing from CPAN, the install tests look for the environment
    variables TDG_DSN (connection string), TDG_USER (user), TDG_PWD
    (password) and TDG_SCHEMA (schema) which may be used to test the
    installation against an existing database. If TDG_DSN is found, the
    install will try to use this connection string and the tests will fail
    if no valid database connection can be established. If TDG_DSN is not
    found, the installation creates an in-memory SQLite database provided
    for free by the DBD::SQLite module and tests against this database.
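
    The selection logic described above can be sketched as follows. The
    function connection_settings is a hypothetical illustration of the
    described behaviour, not code taken from the test suite; the fallback
    DSN is the usual DBD::SQLite connection string for an in-memory
    database.

```perl
use strict;
use warnings;

# Hypothetical helper: choose connection settings as described above.
# Takes a hash reference (normally \%ENV) and returns
# (dsn, user, password, schema).
sub connection_settings {
    my ($env) = @_;
    if (defined $env->{TDG_DSN}) {
        return @{$env}{qw(TDG_DSN TDG_USER TDG_PWD TDG_SCHEMA)};
    }
    # No TDG_DSN: fall back to an in-memory SQLite database.
    return ('dbi:SQLite:dbname=:memory:', '', '', undef);
}

my ($dsn, $user, $pwd, $schema) = connection_settings(\%ENV);
print "testing against $dsn\n";
```

    The first three values could then be passed to DBI->connect, and the
    resulting handle and the schema to the constructor shown in the
    SYNOPSIS.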

DATABASE VERSIONS TESTED AGAINST
    *   SQLite 3.7.14.1

    *   Oracle 11g XE

    *   PostgreSQL 9.2.1

LIMITATIONS
    *   Currently, the module executes the inserts in one big transaction if
        the database handle has not set AutoCommit to true, but this will
        change, see the section FURTHER DEVELOPMENT.

    *   Only uniqueness and foreign key constraints are taken into account.
        Constraints such as check constraints, which are very diverse and
        database specific, are not handled (and most probably will not be).

    *   Uniqueness constraints involving only columns which the DBMS
        specific TableProbe role handler does not know how to increment
        cannot be handled. Typically, all string and numeric data types are
        supported; the set of supported data types is defined by the list
        provided by the TableProbe role method
        get_type_preference_for_incrementing(). I am thinking about allowing
        date incrementation, too. This would require at least a
        configuration parameter defining the time increment step to use.

    *   max_tree_depth = 1 is currently not allowed when calling
        create_testdata. It should be allowed, too, meaning that all new
        records would be root records.

    *   Added records that are root nodes with respect to the self-reference
        always have the parent id equal to their primary key. This may not
        match the convention of the table in question, where root nodes
        might be identified by having the parent id set to NULL.
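
    The idea of incrementing a copied value until it no longer violates a
    uniqueness constraint, as mentioned in the list above, can be sketched
    in plain Perl. This is an assumption-laden illustration, not the way a
    TableProbe class does it: next_unused is a hypothetical helper, a hash
    stands in for the values already present in the column, and the string
    case relies on Perl's built-in "magic" string increment, which only
    applies to values consisting of letters optionally followed by digits.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a value not yet present in %$taken by
# incrementing a copy of an existing value, then mark it as taken.
sub next_unused {
    my ($value, $taken) = @_;
    my $candidate = $value;
    $candidate++ while $taken->{$candidate};
    $taken->{$candidate} = 1;
    return $candidate;
}

# Stand-in for the values already stored in a unique column.
my %taken = map { $_ => 1 } qw(7 8 Az);

print next_unused(7,    \%taken), "\n";    # 9
print next_unused('Az', \%taken), "\n";    # Ba
```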

FURTHER DEVELOPMENT
    *   A major refactoring, planned for release with version 0.003, is in
        the works. I want to remove database specific handling with the
        help of DBIx::Class. Even if some DBMS specifics remain, this will
        help to support a broad range of DBMSs, and the maturity of
        DBIx::Class will certainly help to keep the number of bugs low.

    *   The current version handles uniqueness constraints by picking out a
        column involved in the constraint and incrementing it appropriately.
        A custom TableProbe class may do something other than incrementing,
        or may calculate the increment differently, but it is still
        constrained to handling the single selected column.

    *   Support for transactions and specifying transaction sizes will be
        added.

    *   It will be possible to get the SQL source of all generated inserts
        without having them executed on the database.

ACKNOWLEDGEMENTS
    *   Version 0.001:

        A big thank you to all Perl coders on the dbi-dev, DBIx-Class and
        perl-modules mailing lists and on PerlMonks who have patiently
        answered my questions and offered solutions, advice and
        encouragement. The Perl community is really outstanding.

        Special thanks go to Tim Bunce (module name / advice on keeping the
        module extensible), Jonathan Leffler (module naming discussion /
        relation to existing modules / multiple suggestions for features),
        brian d foy (module naming discussion / mailing lists /
        encouragement) and the following Perl monks (see the threads for
        user jds17 for details): chromatic, erix, technojosh, kejohm,
        Khen1950fx, salva, tobyink (3 of 4 discussion threads!), Your
        Mother.

    *   Version 0.002:

        Martin J. Evans was the first developer giving me feedback and nice
        bug reports on Version 0.001, thanks a lot!

AUTHOR
    Jose Diaz Seng, "<josediazseng at gmx.de>"

BUGS
    Please report any bugs or feature requests to
    "bug-dbix-table-testdatagenerator at rt.cpan.org", or through the web
    interface at
    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBIx-Table-TestDataGenerator>.
    I will be notified, and then you'll automatically be notified of
    progress on your bug as I make changes.

SUPPORT
    You can find documentation for this module with the perldoc command.

        perldoc DBIx::Table::TestDataGenerator

    You can also look for information at:

    *   RT: CPAN's request tracker (report bugs here)

        <http://rt.cpan.org/NoAuth/Bugs.html?Dist=DBIx-Table-TestDataGenerator>

    *   AnnoCPAN: Annotated CPAN documentation

        <http://annocpan.org/dist/DBIx-Table-TestDataGenerator>

    *   CPAN Ratings

        <http://cpanratings.perl.org/d/DBIx-Table-TestDataGenerator>

    *   Search CPAN

        <http://search.cpan.org/dist/DBIx-Table-TestDataGenerator/>

LICENSE AND COPYRIGHT
    Copyright 2012 Jose Diaz Seng.

    This program is free software; you can redistribute it and/or modify it
    under the terms of either: the GNU General Public License as published
    by the Free Software Foundation; or the Artistic License.

    See http://dev.perl.org/licenses/ for more information.