NAME

Rosetta::Model - Specify all database tasks with SQL routines

VERSION

This document describes Rosetta::Model version 0.71.0.

SYNOPSIS

Perl Code That Builds A Rosetta::Model Model

This executable example shows how to define some simple database tasks with Rosetta::Model. It covers only a small fraction of the module's capabilities; more advanced features are omitted for brevity.

    use Rosetta::Model;

    eval {
        # Create a model/container in which all SQL details are to be stored.
        # The two boolean options set true here let all the subsequent code be as concise,
        # easy to read, and SQL-string-like as possible, at the cost of slower execution.
        my $model = Rosetta::Model->new_container();
        $model->auto_set_node_ids( 1 );
        $model->may_match_surrogate_node_ids( 1 );

        # This defines 4 scalar/column/field data types (1 number, 2 char strings, 1 enumerated value type)
        # and 2 row/table data types; the former are atomic and the latter are composite.
        # The former can describe individual columns of a base table (table) or viewed table (view),
        # while the latter can describe an entire table or view.
        # Any of these can describe a 'domain' schema object or a stored procedure's variable's data type.
        # See also the 'person' and 'person_with_parents' table+view defs further below; these data types help describe them.
        $model->build_child_node_trees( [
            [ 'scalar_data_type', { 'si_name' => 'entity_id'  , 'base_type' => 'NUM_INT' , 'num_precision' => 9, }, ],
            [ 'scalar_data_type', { 'si_name' => 'alt_id'     , 'base_type' => 'STR_CHAR', 'max_chars' => 20, 'char_enc' => 'UTF8', }, ],
            [ 'scalar_data_type', { 'si_name' => 'person_name', 'base_type' => 'STR_CHAR', 'max_chars' => 100, 'char_enc' => 'UTF8', }, ],
            [ 'scalar_data_type', { 'si_name' => 'person_sex' , 'base_type' => 'STR_CHAR', 'max_chars' => 1, 'char_enc' => 'UTF8', }, [
                [ 'scalar_data_type_opt', 'M', ],
                [ 'scalar_data_type_opt', 'F', ],
            ], ],
            [ 'row_data_type', 'person', [
                [ 'row_data_type_field', { 'si_name' => 'person_id'   , 'scalar_data_type' => 'entity_id'  , }, ],
                [ 'row_data_type_field', { 'si_name' => 'alternate_id', 'scalar_data_type' => 'alt_id'     , }, ],
                [ 'row_data_type_field', { 'si_name' => 'name'        , 'scalar_data_type' => 'person_name', }, ],
                [ 'row_data_type_field', { 'si_name' => 'sex'         , 'scalar_data_type' => 'person_sex' , }, ],
                [ 'row_data_type_field', { 'si_name' => 'father_id'   , 'scalar_data_type' => 'entity_id'  , }, ],
                [ 'row_data_type_field', { 'si_name' => 'mother_id'   , 'scalar_data_type' => 'entity_id'  , }, ],
            ], ],
            [ 'row_data_type', 'person_with_parents', [
                [ 'row_data_type_field', { 'si_name' => 'self_id'    , 'scalar_data_type' => 'entity_id'  , }, ],
                [ 'row_data_type_field', { 'si_name' => 'self_name'  , 'scalar_data_type' => 'person_name', }, ],
                [ 'row_data_type_field', { 'si_name' => 'father_id'  , 'scalar_data_type' => 'entity_id'  , }, ],
                [ 'row_data_type_field', { 'si_name' => 'father_name', 'scalar_data_type' => 'person_name', }, ],
                [ 'row_data_type_field', { 'si_name' => 'mother_id'  , 'scalar_data_type' => 'entity_id'  , }, ],
                [ 'row_data_type_field', { 'si_name' => 'mother_name', 'scalar_data_type' => 'person_name', }, ],
            ], ],
        ] );

        # This defines the blueprint of a database catalog that contains a single schema and a single virtual user which owns the schema.
        my $catalog_bp = $model->build_child_node_tree( 'catalog', 'Gene Database', [
            [ 'owner', 'Lord of the Root', ],
            [ 'schema', { 'si_name' => 'Gene Schema', 'owner' => 'Lord of the Root', }, ],
        ] );
        my $schema = $catalog_bp->find_child_node_by_surrogate_id( 'Gene Schema' );

        # This defines a base table (table) schema object that lives in the aforementioned database catalog.
        # It contains 6 columns, including a not-null primary key (having a trivial sequence generator to give it
        # default values), another not-null field, a surrogate key, and 2 self-referencing foreign keys.
        # Each row represents a single 'person', storing for each up to 2 unique identifiers, a name, a sex, and the parents' unique ids.
        my $tb_person = $schema->build_child_node_tree( 'table', { 'si_name' => 'person', 'row_data_type' => 'person', }, [
            [ 'table_field', { 'si_row_field' => 'person_id', 'mandatory' => 1, 'default_val' => 1, 'auto_inc' => 1, }, ],
            [ 'table_field', { 'si_row_field' => 'name'     , 'mandatory' => 1, }, ],
            [ 'table_index', { 'si_name' => 'primary' , 'index_type' => 'UNIQUE', }, [
                [ 'table_index_field', 'person_id', ],
            ], ],
            [ 'table_index', { 'si_name' => 'ak_alternate_id', 'index_type' => 'UNIQUE', }, [
                [ 'table_index_field', 'alternate_id', ],
            ], ],
            [ 'table_index', { 'si_name' => 'fk_father', 'index_type' => 'FOREIGN', 'f_table' => 'person', }, [
                [ 'table_index_field', { 'si_field' => 'father_id', 'f_field' => 'person_id' } ],
            ], ],
            [ 'table_index', { 'si_name' => 'fk_mother', 'index_type' => 'FOREIGN', 'f_table' => 'person', }, [
                [ 'table_index_field', { 'si_field' => 'mother_id', 'f_field' => 'person_id' } ],
            ], ],
        ] );

        # This defines a viewed table (view) schema object that lives in the aforementioned database catalog.
        # It left-outer-joins the 'person' table to itself twice and returns 2 columns from each constituent, for 6 total.
        # Each row gives the unique id and name of each of 3 people: a given person and that person's 2 parents.
        my $vw_pwp = $schema->build_child_node_tree( 'view', { 'si_name' => 'person_with_parents',
                'view_type' => 'JOINED', 'row_data_type' => 'person_with_parents', }, [
            ( map { [ 'view_src', { 'si_name' => $_, 'match' => 'person', }, [
                map { [ 'view_src_field', $_, ], } ( 'person_id', 'name', 'father_id', 'mother_id', ),
            ], ], } ('self') ),
            ( map { [ 'view_src', { 'si_name' => $_, 'match' => 'person', }, [
                map { [ 'view_src_field', $_, ], } ( 'person_id', 'name', ),
            ], ], } ( 'father', 'mother', ) ),
            [ 'view_field', { 'si_row_field' => 'self_id'    , 'src_field' => ['person_id','self'  ], }, ],
            [ 'view_field', { 'si_row_field' => 'self_name'  , 'src_field' => ['name'     ,'self'  ], }, ],
            [ 'view_field', { 'si_row_field' => 'father_id'  , 'src_field' => ['person_id','father'], }, ],
            [ 'view_field', { 'si_row_field' => 'father_name', 'src_field' => ['name'     ,'father'], }, ],
            [ 'view_field', { 'si_row_field' => 'mother_id'  , 'src_field' => ['person_id','mother'], }, ],
            [ 'view_field', { 'si_row_field' => 'mother_name', 'src_field' => ['name'     ,'mother'], }, ],
            [ 'view_join', { 'lhs_src' => 'self', 'rhs_src' => 'father', 'join_op' => 'LEFT', }, [
                [ 'view_join_field', { 'lhs_src_field' => 'father_id', 'rhs_src_field' => 'person_id' } ],
            ], ],
            [ 'view_join', { 'lhs_src' => 'self', 'rhs_src' => 'mother', 'join_op' => 'LEFT', }, [
                [ 'view_join_field', { 'lhs_src_field' => 'mother_id', 'rhs_src_field' => 'person_id' } ],
            ], ],
        ] );

        # This defines the blueprint of an application that has a single virtual connection descriptor to the above database.
        my $application_bp = $model->build_child_node_tree( 'application', 'Gene App', [
            [ 'catalog_link', { 'si_name' => 'editor_link', 'target' => $catalog_bp, }, ],
        ] );

        # This defines another scalar data type, which is used by some routines that follow below.
        my $sdt_login_auth = $model->build_child_node( 'scalar_data_type', { 'si_name' => 'login_auth',
            'base_type' => 'STR_CHAR', 'max_chars' => 20, 'char_enc' => 'UTF8', } );

        # This defines an application-side routine/function that connects to the 'Gene Database', fetches all
        # the records from the 'person_with_parents' view, disconnects the database, and returns the fetched records.
        # It takes run-time arguments for a user login name and password that are used when connecting.
        my $rt_fetch_pwp = $application_bp->build_child_node_tree( 'routine', { 'si_name' => 'fetch_pwp',
                'routine_type' => 'FUNCTION', 'return_cont_type' => 'RW_ARY', 'return_row_data_type' => 'person_with_parents', }, [
            [ 'routine_arg', { 'si_name' => 'login_name', 'cont_type' => 'SCALAR', 'scalar_data_type' => $sdt_login_auth }, ],
            [ 'routine_arg', { 'si_name' => 'login_pass', 'cont_type' => 'SCALAR', 'scalar_data_type' => $sdt_login_auth }, ],
            [ 'routine_var', { 'si_name' => 'conn_cx', 'cont_type' => 'CONN', 'conn_link' => 'editor_link', }, ],
            [ 'routine_stmt', { 'call_sroutine' => 'CATALOG_OPEN', }, [
                [ 'routine_expr', { 'call_sroutine_cxt' => 'CONN_CX', 'cont_type' => 'CONN', 'valf_p_routine_item' => 'conn_cx', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'LOGIN_NAME', 'cont_type' => 'SCALAR', 'valf_p_routine_item' => 'login_name', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'LOGIN_PASS', 'cont_type' => 'SCALAR', 'valf_p_routine_item' => 'login_pass', }, ],
            ], ],
            [ 'routine_var', { 'si_name' => 'pwp_ary', 'cont_type' => 'RW_ARY', 'row_data_type' => 'person_with_parents', }, ],
            [ 'routine_stmt', { 'call_sroutine' => 'SELECT', }, [
                [ 'view', { 'si_name' => 'query_pwp', 'view_type' => 'ALIAS', 'row_data_type' => 'person_with_parents', }, [
                    [ 'view_src', { 'si_name' => 's', 'match' => $vw_pwp, }, ],
                ], ],
                [ 'routine_expr', { 'call_sroutine_cxt' => 'CONN_CX', 'cont_type' => 'CONN', 'valf_p_routine_item' => 'conn_cx', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'SELECT_DEFN', 'cont_type' => 'ROS_M_NODE', 'act_on' => 'query_pwp', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'INTO', 'query_dest' => 'pwp_ary', 'cont_type' => 'RW_ARY', }, ],
            ], ],
            [ 'routine_stmt', { 'call_sroutine' => 'CATALOG_CLOSE', }, [
                [ 'routine_expr', { 'call_sroutine_cxt' => 'CONN_CX', 'cont_type' => 'CONN', 'valf_p_routine_item' => 'conn_cx', }, ],
            ], ],
            [ 'routine_stmt', { 'call_sroutine' => 'RETURN', }, [
                [ 'routine_expr', { 'call_sroutine_arg' => 'RETURN_VALUE', 'cont_type' => 'RW_ARY', 'valf_p_routine_item' => 'pwp_ary', }, ],
            ], ],
        ] );

        # This defines an application-side routine/procedure that inserts a set of records, given in an argument,
        # into the 'person' table.  It takes an already opened db connection handle to operate through as a
        # 'context' argument (which would represent the invocant if this routine were wrapped in an object-oriented interface).
        my $rt_add_people = $application_bp->build_child_node_tree( 'routine', { 'si_name' => 'add_people', 'routine_type' => 'PROCEDURE', }, [
            [ 'routine_context', { 'si_name' => 'conn_cx', 'cont_type' => 'CONN', 'conn_link' => 'editor_link', }, ],
            [ 'routine_arg', { 'si_name' => 'person_ary', 'cont_type' => 'RW_ARY', 'row_data_type' => 'person', }, ],
            [ 'routine_stmt', { 'call_sroutine' => 'INSERT', }, [
                [ 'view', { 'si_name' => 'insert_people', 'view_type' => 'INSERT', 'row_data_type' => 'person', 'ins_p_routine_item' => 'person_ary', }, [
                    [ 'view_src', { 'si_name' => 's', 'match' => $tb_person, }, ],
                ], ],
                [ 'routine_expr', { 'call_sroutine_cxt' => 'CONN_CX', 'cont_type' => 'CONN', 'valf_p_routine_item' => 'conn_cx', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'INSERT_DEFN', 'cont_type' => 'ROS_M_NODE', 'act_on' => 'insert_people', }, ],
            ], ],
        ] );

        # This defines an application-side routine/function that fetches one record
        # from the 'person' table which matches its argument.
        my $rt_get_person = $application_bp->build_child_node_tree( 'routine', { 'si_name' => 'get_person',
                'routine_type' => 'FUNCTION', 'return_cont_type' => 'ROW', 'return_row_data_type' => 'person', }, [
            [ 'routine_context', { 'si_name' => 'conn_cx', 'cont_type' => 'CONN', 'conn_link' => 'editor_link', }, ],
            [ 'routine_arg', { 'si_name' => 'arg_person_id', 'cont_type' => 'SCALAR', 'scalar_data_type' => 'entity_id', }, ],
            [ 'routine_var', { 'si_name' => 'person_row', 'cont_type' => 'ROW', 'row_data_type' => 'person', }, ],
            [ 'routine_stmt', { 'call_sroutine' => 'SELECT', }, [
                [ 'view', { 'si_name' => 'query_person', 'view_type' => 'JOINED', 'row_data_type' => 'person', }, [
                    [ 'view_src', { 'si_name' => 's', 'match' => $tb_person, }, [
                        [ 'view_src_field', 'person_id', ],
                    ], ],
                    [ 'view_expr', { 'view_part' => 'WHERE', 'cont_type' => 'SCALAR', 'valf_call_sroutine' => 'EQ', }, [
                        [ 'view_expr', { 'call_sroutine_arg' => 'LHS', 'cont_type' => 'SCALAR', 'valf_src_field' => 'person_id', }, ],
                        [ 'view_expr', { 'call_sroutine_arg' => 'RHS', 'cont_type' => 'SCALAR', 'valf_p_routine_item' => 'arg_person_id', }, ],
                    ], ],
                ], ],
                [ 'routine_expr', { 'call_sroutine_cxt' => 'CONN_CX', 'cont_type' => 'CONN', 'valf_p_routine_item' => 'conn_cx', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'SELECT_DEFN', 'cont_type' => 'ROS_M_NODE', 'act_on' => 'query_person', }, ],
                [ 'routine_expr', { 'call_sroutine_arg' => 'INTO', 'query_dest' => 'person_row', 'cont_type' => 'RW_ARY', }, ],
            ], ],
            [ 'routine_stmt', { 'call_sroutine' => 'RETURN', }, [
                [ 'routine_expr', { 'call_sroutine_arg' => 'RETURN_VALUE', 'cont_type' => 'ROW', 'valf_p_routine_item' => 'person_row', }, ],
            ], ],
        ] );

        # This defines 6 database engine descriptors and 3 database bridge descriptors that we may be using.
        # These details can help external code determine such things as what string-SQL flavors should be
        # generated from the model, as well as which database features can be used natively or have to be emulated.
        # The 'si_name' has no meaning to code and is for users; the other attribute values should have meaning to said external code.
        $model->build_child_node_trees( [
            [ 'data_storage_product', { 'si_name' => 'SQLite v3.2'  , 'product_code' => 'SQLite_3_2'  , 'is_file_based'  => 1, }, ],
            [ 'data_storage_product', { 'si_name' => 'MySQL v5.0'   , 'product_code' => 'MySQL_5_0'   , 'is_network_svc' => 1, }, ],
            [ 'data_storage_product', { 'si_name' => 'PostgreSQL v8', 'product_code' => 'PostgreSQL_8', 'is_network_svc' => 1, }, ],
            [ 'data_storage_product', { 'si_name' => 'Oracle v10g'  , 'product_code' => 'Oracle_10_g' , 'is_network_svc' => 1, }, ],
            [ 'data_storage_product', { 'si_name' => 'Sybase'       , 'product_code' => 'Sybase'      , 'is_network_svc' => 1, }, ],
            [ 'data_storage_product', { 'si_name' => 'CSV'          , 'product_code' => 'CSV'         , 'is_file_based'  => 1, }, ],
            [ 'data_link_product', { 'si_name' => 'Microsoft ODBC v3', 'product_code' => 'ODBC_3', }, ],
            [ 'data_link_product', { 'si_name' => 'Oracle OCI*8', 'product_code' => 'OCI_8', }, ],
            [ 'data_link_product', { 'si_name' => 'Generic Rosetta Engine', 'product_code' => 'Rosetta::Engine::Generic', }, ],
        ] );

        # This defines one concrete instance each of the database catalog and an application using it.
        # This concrete database instance includes two concrete user definitions, one that owns
        # the schema and one that can only edit data.  The concrete application instance includes
        # a concrete connection descriptor going to this concrete database instance.
        # Note that 'user' descriptions are only stored in a Rosetta::Model model when that model is being used to create
        # database catalogs and/or create or modify database users; otherwise 'user' should not be kept, for security's sake.
        $model->build_child_node_trees( [
            [ 'catalog_instance', { 'si_name' => 'test', 'blueprint' => $catalog_bp, 'product' => 'PostgreSQL v8', }, [
                [ 'user', { 'si_name' => 'ronsealy', 'user_type' => 'SCHEMA_OWNER', 'match_owner' => 'Lord of the Root', 'password' => 'K34dsD', }, ],
                [ 'user', { 'si_name' => 'joesmith', 'user_type' => 'DATA_EDITOR', 'password' => 'fdsKJ4', }, ],
            ], ],
            [ 'application_instance', { 'si_name' => 'test app', 'blueprint' => $application_bp, }, [
                [ 'catalog_link_instance', { 'blueprint' => 'editor_link', 'product' => 'Microsoft ODBC v3', 'target' => 'test', 'local_dsn' => 'keep_it', }, ],
            ], ],
        ] );

        # This defines another concrete instance each of the database catalog and an application using it.
        $model->build_child_node_trees( [
            [ 'catalog_instance', { 'si_name' => 'production', 'blueprint' => $catalog_bp, 'product' => 'Oracle v10g', }, [
                [ 'user', { 'si_name' => 'florence', 'user_type' => 'SCHEMA_OWNER', 'match_owner' => 'Lord of the Root', 'password' => '0sfs8G', }, ],
                [ 'user', { 'si_name' => 'thainuff', 'user_type' => 'DATA_EDITOR', 'password' => '9340sd', }, ],
            ], ],
            [ 'application_instance', { 'si_name' => 'production app', 'blueprint' => $application_bp, }, [
                [ 'catalog_link_instance', { 'blueprint' => 'editor_link', 'product' => 'Oracle OCI*8', 'target' => 'production', 'local_dsn' => 'ship_it', }, ],
            ], ],
        ] );

        # This defines a third concrete instance each of the database catalog and an application using it.
        $model->build_child_node_trees( [
            [ 'catalog_instance', { 'si_name' => 'laptop demo', 'blueprint' => $catalog_bp, 'product' => 'SQLite v3.2', 'file_path' => 'Move It', }, ],
            [ 'application_instance', { 'si_name' => 'laptop demo app', 'blueprint' => $application_bp, }, [
                [ 'catalog_link_instance', { 'blueprint' => 'editor_link', 'product' => 'Generic Rosetta Engine', 'target' => 'laptop demo', }, ],
            ], ],
        ] );

        # This line will run some correctness tests on the model that were deferred
        # while the model was being populated, for the sake of execution speed.
        $model->assert_deferrable_constraints();

        # This line will dump the contents of the model in pretty-printed XML format.
        # It can be helpful when debugging your programs that use Rosetta::Model.
        print $model->get_all_properties_as_xml_str( 1 );
    };
    $@ and print error_to_string($@);

    # Rosetta::Model throws object exceptions when it encounters bad input; this function
    # will convert those into human readable text for display by the try/catch block.
    sub error_to_string {
        my ($message) = @_;
        if (ref $message and UNIVERSAL::isa( $message, 'Locale::KeyedText::Message' )) {
            my $translator = Locale::KeyedText->new_translator( ['Rosetta::Model::L::'], ['en'] );
            my $user_text = $translator->translate_message( $message );
            return q{internal error: can't find user text for a message: }
                . $message->as_string() . ' ' . $translator->as_string()
                if !$user_text;
            return $user_text;
        }
        return $message; # if this isn't the right kind of object
    }
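
The error-handling pattern used above (throw an object with die, trap it with eval, then turn $@ into text) can be sketched in plain Perl without any dependencies. This is only an illustration: the class My::Demo::Message and the function demo_error_to_string are hypothetical stand-ins invented for this sketch and are not part of Rosetta::Model or Locale::KeyedText.

```perl
use strict;
use warnings;

# Hypothetical exception class for illustration only; it is not part of
# Rosetta::Model or Locale::KeyedText.
package My::Demo::Message;

sub new {
    my ($class, $key) = @_;
    return bless { key => $key }, $class;
}

sub as_string {
    my ($self) = @_;
    return 'message key: ' . $self->{key};
}

package main;

# Convert whatever was thrown into text: objects of the expected class are
# rendered through their own method; anything else is passed through as-is.
sub demo_error_to_string {
    my ($err) = @_;
    if (ref $err and UNIVERSAL::isa( $err, 'My::Demo::Message' )) {
        return $err->as_string();
    }
    return $err;
}

# Throw an object exception inside eval, then inspect $@ afterwards.
my $object_text = do {
    eval { die My::Demo::Message->new( 'BAD_INPUT' ); };
    demo_error_to_string( $@ );
};

# A plain string exception takes the fall-through branch.
my $string_text = do {
    eval { die "plain failure\n"; };
    demo_error_to_string( $@ );
};

print "$object_text\n";
print $string_text;
```

The fall-through return matters because code inside the eval may die with an ordinary string as well as with a message object.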

Note that one key feature of Rosetta::Model is that all of a model's pieces are linked by references rather than by name, as they are in SQL itself. For example, the name of the 'person' table is stored only once internally; if, after executing all of the above code, you were to run "$tb_person->set_attribute( 'si_name', 'The Huddled Masses' );", none of the other parts of the model that refer to the table would break, and an XML dump would show that all of the references now say 'The Huddled Masses'.
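
The difference between linking by name and linking by reference can be shown with plain Perl data structures. The following is a minimal sketch under stated assumptions: the hashes below are hypothetical stand-ins for model nodes, not Rosetta::Model objects, and they say nothing about how the module stores its nodes internally.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a table node; a plain hash, not a Rosetta::Model node.
my %table = ( si_name => 'person' );

# Name-linked: the dependent stores its own copy of the table's name.
my %src_by_name = ( si_name => 's', match => $table{si_name} );

# Reference-linked: the dependent stores a reference to the table node itself.
my %src_by_ref = ( si_name => 's', match => \%table );

# Rename the table in exactly one place.
$table{si_name} = 'The Huddled Masses';

# The copied name is now stale, while the reference follows the rename.
my $stale   = $src_by_name{match};
my $current = $src_by_ref{match}{si_name};

print "by name: $stale\n";
print "by ref:  $current\n";
```

With name-based links, every copy of the old name would have to be found and updated after a rename; with reference-based links, the name lives in one place and every dependent sees the change.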

For some more (older) examples of Rosetta::Model in use, see its test suite code.

An XML Representation of That Model

This is the XML that the above get_all_properties_as_xml_str() prints out:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <elements>
            <scalar_data_type id="1" si_name="entity_id" base_type="NUM_INT" num_precision="9" />
            <scalar_data_type id="2" si_name="alt_id" base_type="STR_CHAR" max_chars="20" char_enc="UTF8" />
            <scalar_data_type id="3" si_name="person_name" base_type="STR_CHAR" max_chars="100" char_enc="UTF8" />
            <scalar_data_type id="4" si_name="person_sex" base_type="STR_CHAR" max_chars="1" char_enc="UTF8">
                <scalar_data_type_opt id="5" si_value="M" />
                <scalar_data_type_opt id="6" si_value="F" />
            </scalar_data_type>
            <row_data_type id="7" si_name="person">
                <row_data_type_field id="8" si_name="person_id" scalar_data_type="entity_id" />
                <row_data_type_field id="9" si_name="alternate_id" scalar_data_type="alt_id" />
                <row_data_type_field id="10" si_name="name" scalar_data_type="person_name" />
                <row_data_type_field id="11" si_name="sex" scalar_data_type="person_sex" />
                <row_data_type_field id="12" si_name="father_id" scalar_data_type="entity_id" />
                <row_data_type_field id="13" si_name="mother_id" scalar_data_type="entity_id" />
            </row_data_type>
            <row_data_type id="14" si_name="person_with_parents">
                <row_data_type_field id="15" si_name="self_id" scalar_data_type="entity_id" />
                <row_data_type_field id="16" si_name="self_name" scalar_data_type="person_name" />
                <row_data_type_field id="17" si_name="father_id" scalar_data_type="entity_id" />
                <row_data_type_field id="18" si_name="father_name" scalar_data_type="person_name" />
                <row_data_type_field id="19" si_name="mother_id" scalar_data_type="entity_id" />
                <row_data_type_field id="20" si_name="mother_name" scalar_data_type="person_name" />
            </row_data_type>
            <scalar_data_type id="59" si_name="login_auth" base_type="STR_CHAR" max_chars="20" char_enc="UTF8" />
        </elements>
        <blueprints>
            <catalog id="21" si_name="Gene Database">
                <owner id="22" si_name="Lord of the Root" />
                <schema id="23" si_name="Gene Schema" owner="Lord of the Root">
                    <table id="24" si_name="person" row_data_type="person">
                        <table_field id="25" si_row_field="person_id" mandatory="1" default_val="1" auto_inc="1" />
                        <table_field id="26" si_row_field="name" mandatory="1" />
                        <table_index id="27" si_name="primary" index_type="UNIQUE">
                            <table_index_field id="28" si_field="person_id" />
                        </table_index>
                        <table_index id="29" si_name="ak_alternate_id" index_type="UNIQUE">
                            <table_index_field id="30" si_field="alternate_id" />
                        </table_index>
                        <table_index id="31" si_name="fk_father" index_type="FOREIGN" f_table="person">
                            <table_index_field id="32" si_field="father_id" f_field="person_id" />
                        </table_index>
                        <table_index id="33" si_name="fk_mother" index_type="FOREIGN" f_table="person">
                            <table_index_field id="34" si_field="mother_id" f_field="person_id" />
                        </table_index>
                    </table>
                    <view id="35" si_name="person_with_parents" view_type="JOINED" row_data_type="person_with_parents">
                        <view_src id="36" si_name="self" match="person">
                            <view_src_field id="37" si_match_field="person_id" />
                            <view_src_field id="38" si_match_field="name" />
                            <view_src_field id="39" si_match_field="father_id" />
                            <view_src_field id="40" si_match_field="mother_id" />
                        </view_src>
                        <view_src id="41" si_name="father" match="person">
                            <view_src_field id="42" si_match_field="person_id" />
                            <view_src_field id="43" si_match_field="name" />
                        </view_src>
                        <view_src id="44" si_name="mother" match="person">
                            <view_src_field id="45" si_match_field="person_id" />
                            <view_src_field id="46" si_match_field="name" />
                        </view_src>
                        <view_field id="47" si_row_field="self_id" src_field="[person_id,self]" />
                        <view_field id="48" si_row_field="self_name" src_field="[name,self]" />
                        <view_field id="49" si_row_field="father_id" src_field="[person_id,father]" />
                        <view_field id="50" si_row_field="father_name" src_field="[name,father]" />
                        <view_field id="51" si_row_field="mother_id" src_field="[person_id,mother]" />
                        <view_field id="52" si_row_field="mother_name" src_field="[name,mother]" />
                        <view_join id="53" lhs_src="self" rhs_src="father" join_op="LEFT">
                            <view_join_field id="54" lhs_src_field="father_id" rhs_src_field="person_id" />
                        </view_join>
                        <view_join id="55" lhs_src="self" rhs_src="mother" join_op="LEFT">
                            <view_join_field id="56" lhs_src_field="mother_id" rhs_src_field="person_id" />
                        </view_join>
                    </view>
                </schema>
            </catalog>
            <application id="57" si_name="Gene App">
                <catalog_link id="58" si_name="editor_link" target="Gene Database" />
                <routine id="60" si_name="fetch_pwp" routine_type="FUNCTION" return_cont_type="RW_ARY" return_row_data_type="person_with_parents">
                    <routine_arg id="61" si_name="login_name" cont_type="SCALAR" scalar_data_type="login_auth" />
                    <routine_arg id="62" si_name="login_pass" cont_type="SCALAR" scalar_data_type="login_auth" />
                    <routine_var id="63" si_name="conn_cx" cont_type="CONN" conn_link="editor_link" />
                    <routine_stmt id="64" call_sroutine="CATALOG_OPEN">
                        <routine_expr id="65" call_sroutine_cxt="CONN_CX" cont_type="CONN" valf_p_routine_item="conn_cx" />
                        <routine_expr id="66" call_sroutine_arg="LOGIN_NAME" cont_type="SCALAR" valf_p_routine_item="login_name" />
                        <routine_expr id="67" call_sroutine_arg="LOGIN_PASS" cont_type="SCALAR" valf_p_routine_item="login_pass" />
                    </routine_stmt>
                    <routine_var id="68" si_name="pwp_ary" cont_type="RW_ARY" row_data_type="person_with_parents" />
                    <routine_stmt id="69" call_sroutine="SELECT">
                        <view id="70" si_name="query_pwp" view_type="ALIAS" row_data_type="person_with_parents">
                            <view_src id="71" si_name="s" match="[person_with_parents,Gene Schema,Gene Database]" />
                        </view>
                        <routine_expr id="72" call_sroutine_cxt="CONN_CX" cont_type="CONN" valf_p_routine_item="conn_cx" />
                        <routine_expr id="73" call_sroutine_arg="SELECT_DEFN" cont_type="ROS_M_NODE" act_on="query_pwp" />
                        <routine_expr id="74" call_sroutine_arg="INTO" query_dest="pwp_ary" cont_type="RW_ARY" />
                    </routine_stmt>
                    <routine_stmt id="75" call_sroutine="CATALOG_CLOSE">
                        <routine_expr id="76" call_sroutine_cxt="CONN_CX" cont_type="CONN" valf_p_routine_item="conn_cx" />
                    </routine_stmt>
                    <routine_stmt id="77" call_sroutine="RETURN">
                        <routine_expr id="78" call_sroutine_arg="RETURN_VALUE" cont_type="RW_ARY" valf_p_routine_item="pwp_ary" />
                    </routine_stmt>
                </routine>
                <routine id="79" si_name="add_people" routine_type="PROCEDURE">
                    <routine_context id="80" si_name="conn_cx" cont_type="CONN" conn_link="editor_link" />
                    <routine_arg id="81" si_name="person_ary" cont_type="RW_ARY" row_data_type="person" />
                    <routine_stmt id="82" call_sroutine="INSERT">
                        <view id="83" si_name="insert_people" view_type="INSERT" row_data_type="person" ins_p_routine_item="person_ary">
                            <view_src id="84" si_name="s" match="[person,Gene Schema,Gene Database]" />
                        </view>
                        <routine_expr id="85" call_sroutine_cxt="CONN_CX" cont_type="CONN" valf_p_routine_item="conn_cx" />
                        <routine_expr id="86" call_sroutine_arg="INSERT_DEFN" cont_type="ROS_M_NODE" act_on="insert_people" />
                    </routine_stmt>
                </routine>
                <routine id="87" si_name="get_person" routine_type="FUNCTION" return_cont_type="ROW" return_row_data_type="person">
                    <routine_context id="88" si_name="conn_cx" cont_type="CONN" conn_link="editor_link" />
                    <routine_arg id="89" si_name="arg_person_id" cont_type="SCALAR" scalar_data_type="entity_id" />
                    <routine_var id="90" si_name="person_row" cont_type="ROW" row_data_type="person" />
                    <routine_stmt id="91" call_sroutine="SELECT">
                        <view id="92" si_name="query_person" view_type="JOINED" row_data_type="person">
                            <view_src id="93" si_name="s" match="[person,Gene Schema,Gene Database]">
                                <view_src_field id="94" si_match_field="person_id" />
                            </view_src>
                            <view_expr id="95" view_part="WHERE" cont_type="SCALAR" valf_call_sroutine="EQ">
                                <view_expr id="96" call_sroutine_arg="LHS" cont_type="SCALAR" valf_src_field="[person_id,s]" />
                                <view_expr id="97" call_sroutine_arg="RHS" cont_type="SCALAR" valf_p_routine_item="arg_person_id" />
                            </view_expr>
                        </view>
                        <routine_expr id="98" call_sroutine_cxt="CONN_CX" cont_type="CONN" valf_p_routine_item="conn_cx" />
                        <routine_expr id="99" call_sroutine_arg="SELECT_DEFN" cont_type="ROS_M_NODE" act_on="query_person" />
                        <routine_expr id="100" call_sroutine_arg="INTO" query_dest="person_row" cont_type="RW_ARY" />
                    </routine_stmt>
                    <routine_stmt id="101" call_sroutine="RETURN">
                        <routine_expr id="102" call_sroutine_arg="RETURN_VALUE" cont_type="ROW" valf_p_routine_item="person_row" />
                    </routine_stmt>
                </routine>
            </application>
        </blueprints>
        <tools>
            <data_storage_product id="103" si_name="SQLite v3.2" product_code="SQLite_3_2" is_file_based="1" />
            <data_storage_product id="104" si_name="MySQL v5.0" product_code="MySQL_5_0" is_network_svc="1" />
            <data_storage_product id="105" si_name="PostgreSQL v8" product_code="PostgreSQL_8" is_network_svc="1" />
            <data_storage_product id="106" si_name="Oracle v10g" product_code="Oracle_10_g" is_network_svc="1" />
            <data_storage_product id="107" si_name="Sybase" product_code="Sybase" is_network_svc="1" />
            <data_storage_product id="108" si_name="CSV" product_code="CSV" is_file_based="1" />
            <data_link_product id="109" si_name="Microsoft ODBC v3" product_code="ODBC_3" />
            <data_link_product id="110" si_name="Oracle OCI*8" product_code="OCI_8" />
            <data_link_product id="111" si_name="Generic Rosetta Engine" product_code="Rosetta::Engine::Generic" />
        </tools>
        <sites>
            <catalog_instance id="112" si_name="test" blueprint="Gene Database" product="PostgreSQL v8">
                <user id="113" si_name="ronsealy" user_type="SCHEMA_OWNER" match_owner="Lord of the Root" password="K34dsD" />
                <user id="114" si_name="joesmith" user_type="DATA_EDITOR" password="fdsKJ4" />
            </catalog_instance>
            <application_instance id="115" si_name="test app" blueprint="Gene App">
                <catalog_link_instance id="116" blueprint="editor_link" product="Microsoft ODBC v3" target="test" local_dsn="keep_it" />
            </application_instance>
            <catalog_instance id="117" si_name="production" blueprint="Gene Database" product="Oracle v10g">
                <user id="118" si_name="florence" user_type="SCHEMA_OWNER" match_owner="Lord of the Root" password="0sfs8G" />
                <user id="119" si_name="thainuff" user_type="DATA_EDITOR" password="9340sd" />
            </catalog_instance>
            <application_instance id="120" si_name="production app" blueprint="Gene App">
                <catalog_link_instance id="121" blueprint="editor_link" product="Oracle OCI*8" target="production" local_dsn="ship_it" />
            </application_instance>
            <catalog_instance id="122" si_name="laptop demo" blueprint="Gene Database" product="SQLite v3.2" file_path="Move It" />
            <application_instance id="123" si_name="laptop demo app" blueprint="Gene App">
                <catalog_link_instance id="124" blueprint="editor_link" product="Generic Rosetta Engine" target="laptop demo" />
            </application_instance>
        </sites>
        <circumventions />
    </root>

String SQL That Can Be Made From the Model

This section has examples of string-SQL that can be generated from the above model. The examples are conformant by default to the SQL:2003 standard flavor, but will vary from there to make illustration simpler; some examples may contain a hodge-podge of database vendor extensions and as a whole won't execute as is on some database products.

These two examples for creating the same TABLE schema object, separated by a blank line, demonstrate SQL for a database that supports DOMAIN schema objects and SQL for a database that does not. They both assume that uniqueness and foreign key constraints are only enforced on not-null values.

    CREATE DOMAIN entity_id AS INTEGER(9);
    CREATE DOMAIN alt_id AS VARCHAR(20);
    CREATE DOMAIN person_name AS VARCHAR(100);
    CREATE DOMAIN person_sex AS ENUM('M','F');
    CREATE TABLE person (
        person_id entity_id NOT NULL DEFAULT 1 AUTO_INCREMENT,
        alternate_id alt_id NULL,
        name person_name NOT NULL,
        sex person_sex NULL,
        father_id entity_id NULL,
        mother_id entity_id NULL,
        CONSTRAINT PRIMARY KEY (person_id),
        CONSTRAINT UNIQUE (alternate_id),
        CONSTRAINT fk_father FOREIGN KEY (father_id) REFERENCES person (person_id),
        CONSTRAINT fk_mother FOREIGN KEY (mother_id) REFERENCES person (person_id)
    );

    CREATE TABLE person (
        person_id INTEGER(9) NOT NULL DEFAULT 1 AUTO_INCREMENT,
        alternate_id VARCHAR(20) NULL,
        name VARCHAR(100) NOT NULL,
        sex ENUM('M','F') NULL,
        father_id INTEGER(9) NULL,
        mother_id INTEGER(9) NULL,
        CONSTRAINT PRIMARY KEY (person_id),
        CONSTRAINT UNIQUE (alternate_id),
        CONSTRAINT fk_father FOREIGN KEY (father_id) REFERENCES person (person_id),
        CONSTRAINT fk_mother FOREIGN KEY (mother_id) REFERENCES person (person_id)
    );

This example is for creating the VIEW schema object:

    CREATE VIEW person_with_parents AS
    SELECT self.person_id AS self_id, self.name AS self_name,
        father.person_id AS father_id, father.name AS father_name,
        mother.person_id AS mother_id, mother.name AS mother_name
    FROM person AS self
        LEFT OUTER JOIN person AS father ON father.person_id = self.father_id
        LEFT OUTER JOIN person AS mother ON mother.person_id = self.mother_id;

If the 'get_person' routine were implemented as a database schema object, this is what it might look like:

    CREATE FUNCTION get_person (arg_person_id INTEGER(9)) RETURNS ROW(...) AS
    BEGIN
        DECLARE person_row ROW(...);
        SELECT * INTO person_row FROM person AS s WHERE s.person_id = arg_person_id;
        RETURN person_row;
    END;

Then it could be invoked elsewhere like this:

    my_rec = get_person( '3' );

If the same routine were implemented as an application-side routine, then it might look like this (not actual DBI syntax):

    my $sth = $dbh->prepare( 'SELECT * FROM person AS s WHERE s.person_id = :arg_person_id' );
    $sth->bind_param( 'arg_person_id', 'INTEGER(9)' );
    $sth->execute( { 'arg_person_id' => '3' } );
    my $my_rec = $sth->fetchrow_hashref();

And finally, corresponding DROP statements can be made for any of the above database schema objects:

    DROP DOMAIN entity_id;
    DROP DOMAIN alt_id;
    DROP DOMAIN person_name;
    DROP DOMAIN person_sex;
    DROP TABLE person;
    DROP VIEW person_with_parents;
    DROP FUNCTION get_person;

See also the separately distributed Rosetta::Utility::SQLBuilder module, which is a reference implementation of a SQL:2003 (and more) generator for Rosetta::Model.

DESCRIPTION

The Rosetta::Model (ROS M) Perl 5 module provides a container object that allows you to create specifications for any type of database task or activity (eg: queries, DML, DDL, connection management) that look like ordinary routines (procedures or functions) to your programs; all routine arguments are named.

Rosetta::Model is trivially easy to install, since it is written in pure Perl and it has few external dependencies.

Typical usage of this module involves creating or loading a single Rosetta::Model::Container object when your program starts up; this Container would hold a complete representation of each database catalog that your program uses (including details of all schema objects), plus complete representations of all database invocations by your program; your program then typically just reads from the Container while active to help determine its actions.

Rosetta::Model can broadly represent, as an abstract syntax tree (a cross-referenced hierarchy of nodes), code for any programming language, but many of its concepts are only applicable to relational databases, particularly SQL-understanding databases. It is reasonable to expect that a SQL:2003 compliant database should be able to implement nearly all Rosetta::Model concepts in its SQL stored procedures and functions, though SQL:2003 specifies some of these concepts as optional features rather than core features.

This module has a multi-layered API that lets you choose between writing fairly verbose code that performs faster, or fairly terse code that performs slower.

Rosetta::Model is intended to be used by an application in place of using actual SQL strings (including support for placeholders). You define any desired actions by stuffing atomic values into Rosetta::Model objects, and then pass those objects to a compatible bridging engine that will compile and execute those objects against one or more actual databases. Said bridge would be responsible for generating any SQL or Perl code necessary to implement the given ROS M routine specification, and returning the result of its execution.

The 'Rosetta' database portability library (a Perl 5 module) is a database bridge that takes its instructions as Rosetta::Model objects. There may be other modules that use Rosetta::Model for that or other purposes.

Rosetta::Model is also intended to be used as an intermediate representation of schema definitions or other SQL that is being translated from one database product to another.

This module is loosely similar to SQL::Statement, and is intended to be used in all of the same ways. But Rosetta::Model is a lot more powerful and capable than that module, as I most recently understand it, and is suitable for many uses that the other module isn't.

Rosetta::Model does not parse or generate any code on its own, nor does it talk to any databases; it is up to external code that uses it to do this.

To cut down on the size of the Rosetta::Model module itself, some of the POD documentation is in these other files: Rosetta::Language, Rosetta::EnumTypes, Rosetta::NodeTypes.

CLASSES IN THIS MODULE

This module is implemented by several object-oriented Perl 5 packages, each of which is referred to as a class. They are: Rosetta::Model (the module's name-sake), Rosetta::Model::Container (aka Container, aka Model), Rosetta::Model::Node (aka Node), and Rosetta::Model::Group (aka Group). This module also has 2 private classes named Rosetta::Model::ContainerStorage and Rosetta::Model::NodeStorage, which help to implement Container and Node respectively; Container and Node are each a wrapper over the corresponding Storage class.

While all 6 of the above classes are implemented in one module for convenience, you should consider all 6 names as being "in use"; do not create any modules or packages yourself that have the same names.

The Container and Node and Group classes do most of the work and are what you mainly use. The name-sake class mainly exists to guide CPAN in indexing the whole module, but it also provides a set of stateless utility methods and constants that the other two classes inherit, and it provides a few wrapper functions over the other classes for your convenience; you never instantiate an object of Rosetta::Model itself.

Most of the Rosetta::Model documentation you will see simply uses the terms 'Container' and 'Node' to refer to the pair of classes or objects which implements each as a single unit, even if said documentation is specific to the 'Storage' variants thereof, because someone using this module shouldn't need to know the difference. This said, some documentation will specify a pair member by appending the terms 'interface' and 'Storage'; "Container interface" refers to ::Container, "ContainerStorage" refers to ::ContainerStorage, "Node interface" refers to ::Node, "NodeStorage" refers to ::NodeStorage.

MATTERS OF PORTABILITY AND FEATURES

Rosetta::Models are intended to represent all kinds of SQL, both DML and DDL, both ANSI standard and RDBMS vendor extensions. Unlike basically all of the other SQL generating/parsing modules I know about, which are limited to basic DML and only support table definition DDL, this class supports arbitrarily complex select statements, with composite keys and unions, and calls to stored functions; this class can also define views and stored procedures and triggers. Some of the existing modules, even though they construct complete SQL, will take/require fragments of SQL as input (such as "where" clauses). By contrast, Rosetta::Model takes no SQL fragments. All of its inputs are atomic, which means it is also easier to analyse the objects for implementing a wider range of functionality than previously expected; for example, it is much easier to analyse any select statement and generate update/insert/delete statements for the virtual rows fetched with it (a process known as updateable views).

Considering that each database product has its own dialect of SQL which it implements, you would have to code SQL differently depending on which database you are using. One common difference is the syntax for specifying an outer join in a select query. Another common difference is how to specify that a table column is an integer or a boolean or a character string. Moreover, each database has a distinct feature set, so you may be able to do tasks with one database that you can't do with another. In fact, some databases don't support SQL at all, but have similar features that are accessible through alternate interfaces. Rosetta::Model is designed to represent a normalized superset of all database features that one may reasonably want to use. "Superset" means that if even one database supports a feature, you will be able to invoke it with this class. You can also reference some features which no database currently implements, but it would be reasonable for one to do so later. "Normalized" means that if multiple databases support the same feature but have different syntax for referencing it, there will be exactly one way of referring to it with Rosetta::Model. So by using this class, you will never have to change your database-using code when moving between databases, as long as both of them support the features you are using (or they are emulated). That said, it is generally expected that if a database is missing a specific feature that is easy to emulate, then code which evaluates Rosetta::Models will emulate it (for example, emulating "left()" with "substr()"); in such cases, it is expected that when you use such features they will work with any database. For example, if you want a model-specified BOOLEAN data type, you will always get it, whether it is implemented on a per-database basis as a "boolean" or an "int(1)" or a "number(1,0)". Likewise, if you want a model-specified "STR_CHAR" data type, you will always get it, whether it is called "text" or "varchar2" or "sql_varchar".
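The normalization and emulation ideas above can be sketched in plain Perl. This is only an illustration and not part of Rosetta::Model: the %NATIVE_TYPE table, the native_type() and emulated_left() functions, and the 'dialect_a' style keys are all hypothetical, and the pairing of each spelling with a dialect key is an assumption, since the text above does not say which product uses which spelling.

```perl
use strict;
use warnings;

# Hypothetical lookup table: one normalized type name maps to whatever
# spelling a given target uses. The dialect keys are placeholders.
my %NATIVE_TYPE = (
    BOOLEAN  => { dialect_a => 'boolean', dialect_b => 'int(1)',   dialect_c => 'number(1,0)' },
    STR_CHAR => { dialect_a => 'text',    dialect_b => 'varchar2', dialect_c => 'sql_varchar' },
);

# Return the native spelling for a normalized type, or die if unknown.
sub native_type {
    my ($base_type, $dialect) = @_;
    my $by_dialect = $NATIVE_TYPE{$base_type}
        or die "unknown base type '$base_type'\n";
    my $native = $by_dialect->{$dialect}
        or die "no mapping of '$base_type' for dialect '$dialect'\n";
    return $native;
}

# Emulating a missing "left()" string function with "substr()".
sub emulated_left {
    my ($string, $count) = @_;
    return substr( $string, 0, $count );
}

print native_type( 'BOOLEAN', 'dialect_b' ), "\n";
print emulated_left( 'Rosetta', 4 ), "\n";
```

The calling code names only 'BOOLEAN' or 'STR_CHAR'; the choice of native spelling stays in one place, which is the property the paragraph above describes.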

Rosetta::Model is intended to be just a stateless container for database query or schema information. It does not talk to any databases by itself and it does not generate or parse any SQL; rather, it is intended that other third party modules or code of your choice will handle this task. In fact, Rosetta::Model is designed so that many existing database related modules could be updated to use it internally for storing state information, including SQL generating or translating modules, and schema management modules, and modules which implement object persistence in a database. Conceptually speaking, the DBI module itself could be updated to take Rosetta::Model objects as arguments to its "prepare" method, as an alternative (optional) to the SQL strings it currently takes. Code which implements the things that Rosetta::Model describes can do this in any way that they want, which can mean either generating and executing SQL, or generating Perl code that does the same task and evaling it, should they want to (the latter can be a means of emulation). This class should make all of that easy.

Rosetta::Model is especially suited for use with applications or modules that make use of data dictionaries to control what they do. It is common in applications that they interpret their data dictionaries and generate SQL to accomplish some of their work, which means making sure generated SQL is in the right dialect or syntax, and making sure literal values are escaped correctly. By using this module, applications can simply copy appropriate individual elements in their data dictionaries to Rosetta::Model properties, including column names, table names, function names, literal values, host parameter names, and they don't have to do any string parsing or assembling.

Now, I can only imagine why all of the other SQL generating/parsing modules that I know about have excluded privileged support for more advanced database features like stored procedures. Either the authors didn't have a need for it, or they figured that any other prospective users wouldn't need it, or they found it too difficult to implement so far and maybe planned to do it later. As for me, I can see tremendous value in various advanced features, and so I have included privileged support for them in Rosetta::Model. You simply have to work on projects of a significant size to get an idea that these features would provide a large speed, reliability, and security savings for you. Look at many large corporate or government systems, such as those which have hundreds of tables or millions of records, and that may have complicated business logic which governs whether data is consistent/valid or not. Within reasonable limits, the more work you can get the database to do internally, the better. I believe that if these features can also be represented in a database-neutral format, such as what Rosetta::Model attempts to do, then users can get the full power of a database without being locked into a single vendor due to all their investment in vendor-specific SQL stored procedure code. If customers can move a lot more easily, it will help encourage database vendors to keep improving their products or lower prices to keep their customers, and users in general would benefit. So I do have reasons for trying to tackle the advanced database features in Rosetta::Model.

STRUCTURE

The internal structure of a Rosetta::Model object is conceptually a cross between an XML DOM and an object-relational database, with a specific schema. This module is implemented with two main classes that work together, Containers and Nodes. The Container object is an environment or context in which Node objects usually live. A typical application will only need to create one Container object (returned by the module's 'new_container' function), and then a set of Nodes which live within that Container. Nodes relate to each other with either single or multiple cardinality.

Rosetta::Model is expressly designed so that its data is easy to convert between different representations, mainly in-memory data structures linked by references, and multi-table record sets stored in relational databases, and node sets in XML documents. A Container corresponds to an XML document or a complete database, and each Node corresponds to an XML node or a database record. Each Node has a specific node_type (a case-sensitive string), which corresponds to a database table or an XML tag name. See the Rosetta::Language documentation file to see which ones exist. The node_type is set when the Node is created and it can not be changed later.

A Node has a specific set of allowed attributes that are determined by the node_type, each of which corresponds to a database table column or an XML node attribute. Every Node has a unique 'id' attribute (a positive integer) by which it is primarily referenced; that attribute corresponds to the database table's single-column primary key, with the added distinction that every primary key value in each table is distinct from every primary key value in every other table. Each other Node attribute is either a scalar value of some data type, or an enumerated value, or a reference to another Node of a specific node_type, which has a foreign-key constraint on it; those 3 attribute types are referred to respectively as "literal", "enumerated", and "Node-ref". Foreign-key constraints are enforced by this module, so you will have to add Nodes in the appropriate order, just as when adding records to a database. Any Node which is referenced in an attribute (cited in a foreign-key constraint) of another is a parent of the other; as a corollary, the second Node is a child of the first. The order of child Nodes under a parent is the same as that in which the parent-child relationship was assigned, unless you have afterwards used the move_before_sibling() method to change this.
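As a rough illustration of the rules above (ids unique across all Node types, Node-ref attributes that behave like foreign keys, and child order following assignment order), here is a toy store in plain Perl. It does not use Rosetta::Model's API; add_node(), %node_by_id and %children_of are hypothetical names, and real Nodes carry more attributes and constraints than shown.

```perl
use strict;
use warnings;

# Toy stand-in for a Container: nodes keyed by id, plus child order per parent.
my %node_by_id;
my %children_of;

# Add a node; every Node-ref attribute must point at an existing node.
# All checks run before any state is changed.
sub add_node {
    my ($id, $node_type, $literals, $node_refs) = @_;
    die "id must be a positive integer\n"
        unless defined $id and $id =~ m/^[1-9][0-9]*$/;
    die "id $id is already in use\n" if exists $node_by_id{$id};
    for my $attr (sort keys %{$node_refs}) {
        my $target = $node_refs->{$attr};
        die "attribute '$attr' refers to missing node $target\n"
            unless exists $node_by_id{$target};
    }
    $node_by_id{$id} = { node_type => $node_type, %{$literals}, %{$node_refs} };
    # Children are remembered in the order the relationships were assigned.
    for my $attr (sort keys %{$node_refs}) {
        push @{ $children_of{ $node_refs->{$attr} } }, $id;
    }
    return;
}

add_node( 1, 'scalar_data_type', { si_name => 'entity_id' },     {} );
add_node( 2, 'routine',          { si_name => 'get_person' },    {} );
add_node( 3, 'routine_arg',      { si_name => 'arg_person_id' }, { pp => 2, scalar_data_type => 1 } );

# Adding a child before its parent exists fails, as with a foreign key.
my $ok = eval { add_node( 4, 'routine_var', { si_name => 'person_row' }, { pp => 99 } ); 1 };
print "rejected: $@" unless $ok;
```

Because a single hash holds every node, an id can never be reused by a node of another type, which mirrors the "distinct from every primary key value in every other table" rule.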

The order of child Nodes under a parent is often significant, so it is important to preserve this sequence explicitly if you store a Node set in an RDBMS, since databases do not consider record order to be significant or worth remembering; you would add extra columns to store sequence numbers. You do not have to do any extra work when storing Nodes in XML, however, because XML does consider node order to be significant and will preserve it.
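A minimal sketch of the "extra columns to store sequence numbers" idea, in plain Perl and independent of Rosetta::Model. The 'seq' field name and the tree_to_rows() function are assumptions made for illustration; the ids and node types are borrowed from the 'get_person' routine in the XML dump above.

```perl
use strict;
use warnings;

# A parent with an ordered list of children, as plain Perl data.
my $routine = {
    id => 87, node_type => 'routine',
    children => [
        { id => 88, node_type => 'routine_context', children => [] },
        { id => 89, node_type => 'routine_arg',     children => [] },
        { id => 90, node_type => 'routine_var',     children => [] },
    ],
};

# Flatten a tree into rows, recording each child's position explicitly
# in a 'seq' field, since a table will not remember record order.
sub tree_to_rows {
    my ($node, $parent_id, $seq) = @_;
    my @rows = ( {
        id => $node->{id}, node_type => $node->{node_type},
        pp => $parent_id, seq => $seq,
    } );
    my $child_seq = 0;
    for my $child (@{ $node->{children} }) {
        push @rows, tree_to_rows( $child, $node->{id}, ++$child_seq );
    }
    return @rows;
}

my @rows = tree_to_rows( $routine, undef, 1 );

# Rows may come back from a database in any order (simulated with reverse);
# sorting the children of node 87 on 'seq' restores the original sequence.
my @restored = sort { $a->{seq} <=> $b->{seq} }
    grep { defined $_->{pp} and $_->{pp} == 87 } reverse @rows;
print join( ',', map { $_->{id} } @restored ), "\n";
```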

When the terms "parent" and "child" are used by Rosetta::Model in regards to the relationships between Nodes, they are used within the context that a given Node can have multiple parent Nodes; a given Node X is the parent of another given Node Y because a Node-ref attribute of Node Y points to Node X. Another term, "primary parent", refers to the parent Node of a given Node Z that is referenced by Z's "pp" Node-ref attribute, which most Node types have. When Rosetta::Models are converted to a purely hierarchical or tree representation, such as to XML, the primary parent Node becomes the single parent XML node. For example, the XML parent of a 'routine_var' Node is always a 'routine' Node, even though a 'scalar_data_type' Node may also be referenced. Nodes of a few types, such as 'view_expr', can have primary parent Nodes that are of the same Node type, and can form trees of their own type; however, Nodes of most types can only have Nodes of other types as their primary parents.
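To illustrate how the primary parent alone decides the XML nesting, here is a small plain-Perl sketch. It is not how Rosetta::Model serializes anything; to_xml() and the flat @nodes list are hypothetical, only two attributes are emitted, and no XML escaping is done.

```perl
use strict;
use warnings;

# Flat node records; 'pp' names the primary parent by id. Values are taken
# from the 'get_person' routine in the XML dump earlier in this document.
my @nodes = (
    { id => 87, node_type => 'routine',     si_name => 'get_person' },
    { id => 89, node_type => 'routine_arg', si_name => 'arg_person_id', pp => 87 },
    { id => 90, node_type => 'routine_var', si_name => 'person_row',    pp => 87 },
);

# Render a node and, nested inside it, every node whose 'pp' points at it.
sub to_xml {
    my ($node, $indent) = @_;
    my @kids = grep { defined $_->{pp} and $_->{pp} == $node->{id} } @nodes;
    my $open = sprintf '%s<%s id="%d" si_name="%s"',
        $indent, $node->{node_type}, $node->{id}, $node->{si_name};
    return "$open />\n" unless @kids;
    return "$open>\n"
        . join( '', map { to_xml( $_, "$indent    " ) } @kids )
        . "$indent</$node->{node_type}>\n";
}

print to_xml( $nodes[0], '' );
```

Any other Node-ref attributes a node has would appear as ordinary XML attributes rather than as nesting, which is why only 'pp' is consulted when looking for children.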

Some Node types do not have a "pp" attribute and Nodes of those types never have primary parent Nodes; rather, all Nodes of those types will always have a specific pseudo-Node as their primary parents; pseudo-Node primary parents are not referenced in any attribute, and they can not be changed. All 6 pseudo-Nodes have no attributes, even 'id', and only one of each exists; they are created by default with the Container they are part of, forming the top 2 levels of the Node tree, and can not be removed. They are: 'root' (the single level-1 Node which is parent to the other pseudo-Nodes but no normal Nodes), 'elements' (parent to 'scalar_data_type' and 'row_data_type' and 'external_cursor' Nodes), 'blueprints' (parent to 'catalog' and 'application' Nodes), 'tools' (parent to 'data_storage_product' and 'data_link_product' Nodes), 'sites' (parent to 'catalog_instance' and 'application_instance' Nodes), and 'circumventions' (parent to 'sql_fragment' nodes). All other Node types have normal Nodes as primary parents.
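The pseudo-Node arrangement just described can be written down as a plain Perl lookup table. The table contents are transcribed from the paragraph above; the names %PSEUDO_NODE_CHILD_TYPES and pseudo_parent_of() are hypothetical and exist only for this sketch.

```perl
use strict;
use warnings;

# The six pseudo-Nodes and the Node types each one is primary parent to.
my %PSEUDO_NODE_CHILD_TYPES = (
    root           => [ qw( elements blueprints tools sites circumventions ) ],
    elements       => [ qw( scalar_data_type row_data_type external_cursor ) ],
    blueprints     => [ qw( catalog application ) ],
    tools          => [ qw( data_storage_product data_link_product ) ],
    sites          => [ qw( catalog_instance application_instance ) ],
    circumventions => [ qw( sql_fragment ) ],
);

# Return the pseudo-Node that is primary parent of a node type, or undef
# if Nodes of that type have a normal Node as their primary parent.
sub pseudo_parent_of {
    my ($node_type) = @_;
    for my $pseudo (sort keys %PSEUDO_NODE_CHILD_TYPES) {
        return $pseudo
            if grep { $_ eq $node_type } @{ $PSEUDO_NODE_CHILD_TYPES{$pseudo} };
    }
    return;
}

for my $type (qw( catalog data_link_product routine )) {
    my $parent = pseudo_parent_of( $type );
    print "$type: ", ( defined $parent ? $parent : 'normal Node parent' ), "\n";
}
```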

You should look at the POD-only file named Rosetta::Language, which comes with this distribution. It serves to document all of the possible Node types, with attributes, constraints, and allowed relationships with other Node types. As the Rosetta::Model class itself has very few properties and methods, all being highly generic (much akin to an XML DOM), the POD of this PM file will only describe how to use said methods, and will not list all the allowed inputs or constraints to said methods. With only simple guidance in Model.pm, you should be able to interpret Language.pod to get all the nitty gritty details. You should also look at the tutorial or example files which will be in the distribution when ready. You could also learn something from the code in or with other modules which sub-class or use this one.

RELATING INTERFACE AND STORAGE CLASSES

The outwardly visible structure of Rosetta::Model models has them composed of just 2 classes, called Container and Node; all Nodes meant to be used together are arranged in the same tree and that tree lives in a single Container. It is possible and normal for external code such as other modules or applications to hold references to any Container or Node object at any given time, and query or manipulate the Node tree by way of that held object; these references are always one-way; no Container or Node will hold a reciprocal reference to something external at any time. When at least one strong, direct external reference to a Container or to one of its Nodes exists, both the Container and all of its Nodes are stable and survive; when the last such external reference goes away, both the Container and all Nodes within are garbage collected. It is as if there are circular strong references between a Container and its Nodes, though in reality that isn't the case, so no explicit model destructor is provided.

In order for Rosetta::Model to provide some features in the most elegant fashion possible, the conceptual Container and Node objects are each composed of 2 actual objects, 1 being visible and 1 being hidden. The visible portion is referred to as the conceptual object's "interface", and the hidden portion its "Storage".

The implementation of the ContainerStorage and NodeStorage objects exactly matches the outwardly visible and conceptual structure insofar as its purpose, function, and arrangement, but differs by survivability requirements. The destiny of a NodeStorage is tied to its ContainerStorage. The Perl references from ContainerStorages to their NodeStorages are strong, but those from NodeStorages to their ContainerStorages are weak; when the last Container interface's reference to a ContainerStorage goes away, both the Container and all Nodes within are garbage collected.

The Container interface and Node interface objects are implemented as wrappers over their Storage counterparts; each 'interface' object primarily has a single 'storage' property that is a Storage object reference. It is mandatory that external code can only hold references to interface objects, and not to Storage objects.

A Container interface object's reference to its Storage is a strong Perl reference, and any reciprocal references are weak; hence, the survivability of a ContainerStorage, and its NodeStorage tree, is wholly dependent on the Container interface that references it; similarly, the survivability of said Container interface depends on strong external references to either it or one of its Node interfaces. There can be any number of Container interface objects that share the same ContainerStorage object; certain Rosetta::Model features apply to individual Container interface objects in isolation from each other. Each such Container interface is blind to the individuals among its sharing peers, and you can not use a method on one to obtain any of its peers. You create the first Container interface in tandem with its ContainerStorage using the new_container() function; additional peer interfaces can be created by invoking the new_interface() method on an existing Container interface that is to be shared with. Any Container interface will disappear when external refs go away, but the ContainerStorage persists as long as the last interface.

Paralleling the Storage side, every Node interface belongs to just a single Container interface. Node interfaces are the most transient of the 4 object types; all Perl references between them and the other 3 object types are weak, except that the Perl reference from a Node interface to a Container interface is strong, so they last only as long as external strong references to them. Node interfaces do not necessarily exist for every NodeStorage (a Container interface first comes into existence with zero of them), and are created only on demand when external code wishes to hold on to a Node; moreover, new ones are created every time a public Rosetta::Model method returns Nodes, even when Node interfaces had already been created under the same Container interface for the same NodeStorages. An alternate way to create a new Container interface is to invoke the new_interface() method on a Node interface; when you do that, a peer to the invocant Node interface's Container interface is created, as well as a single Node interface within it that references the same NodeStorage as the invocant Node interface.

To repeat, while the strong Perl references on the Storage side point only from the Container to each Node, it is the exact opposite on the interface side, where they point only from each Node to their Container. However, the net effect is that an entire conceptual Container plus Node tree will survive regardless of whether external code holds on to just a conceptual Container or a conceptual Node.
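The direction of the strong and weak references can be tried out with a toy model built on the core Scalar::Util module. Toy::Tracked and its 'label', 'nodes', 'storage' and 'container' fields are hypothetical stand-ins for the four object types; this is a sketch of the reference layout described above, not Rosetta::Model's actual internals.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

my @destroyed;    # labels of toy objects that have been garbage collected

package Toy::Tracked;
sub new { my ($class, %args) = @_; return bless { %args }, $class; }
sub DESTROY { push @destroyed, $_[0]{label}; }

package main;

# Storage side: strong refs point from the container to each node;
# the back-reference from node to container is weak.
my $c_storage = Toy::Tracked->new( label => 'ContainerStorage', nodes => [] );
my $n_storage = Toy::Tracked->new( label => 'NodeStorage', container => $c_storage );
weaken( $n_storage->{container} );
push @{ $c_storage->{nodes} }, $n_storage;

# Interface side: the container interface strongly holds its storage; the
# node interface strongly holds its container interface, weakly its storage.
my $c_iface = Toy::Tracked->new( label => 'Container interface', storage => $c_storage );
my $n_iface = Toy::Tracked->new(
    label => 'Node interface', storage => $n_storage, container => $c_iface,
);
weaken( $n_iface->{storage} );

# External code lets go of everything except the one Node interface.
undef $c_storage;
undef $n_storage;
undef $c_iface;
my $alive_with_node_only = ( @destroyed == 0 ) ? 1 : 0;

# Dropping the last external reference releases the whole structure.
undef $n_iface;
print scalar(@destroyed), " objects collected\n";
```

No reference cycle made of strong references exists in this layout, so ordinary reference counting reclaims everything without an explicit destructor call.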

Due to the fact that a single conceptual Container or Node is represented by Rosetta::Model's public API as a multitude of Container and Node objects, it is invalid for you to do a straight compare of the objects you hold to test for their equality, eg, "$foo_node eq $bar_node". Instead, you will need to use the get_self_id() method on any Container or Node objects you wish to compare in order for the comparison to be reliable. The same rule applies any time you use a Container or Node object as a Hash key.
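The pitfall can be reproduced with a toy wrapper class in plain Perl. Toy::NodeInterface is hypothetical, and its get_self_id() is only a stand-in that derives an identity from the shared storage; what the real method returns is not stated here, so treat that detail as an assumption.

```perl
use strict;
use warnings;

# Toy wrapper: several interface objects may wrap one storage object.
package Toy::NodeInterface;
use Scalar::Util qw( refaddr );
sub new { my ($class, $storage) = @_; return bless { storage => $storage }, $class; }
# Stand-in identity method: based on the shared storage, not on the wrapper.
sub get_self_id { my ($self) = @_; return refaddr( $self->{storage} ); }

package main;

my $storage  = { si_name => 'person' };
my $foo_node = Toy::NodeInterface->new( $storage );
my $bar_node = Toy::NodeInterface->new( $storage );

# Comparing the wrapper objects themselves says "different" ...
my $same_object = ( $foo_node eq $bar_node ) ? 1 : 0;

# ... while comparing their ids says "same conceptual Node".
my $same_node = ( $foo_node->get_self_id() == $bar_node->get_self_id() ) ? 1 : 0;

# The same goes for hash keys: key on the id, not on the object.
my %seen = ( $foo_node->get_self_id() => 1 );
print "already seen\n" if $seen{ $bar_node->get_self_id() };
```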

ABOUT NODE GROUPS

CAVEAT: THE FEATURES DESCRIBED IN THIS SECTION ARE ONLY PARTLY IMPLEMENTED.

Rosetta::Model also has the concept of Node Groups (Groups), which are user defined arbitrary collections of Nodes, all in the same Container, on which certain batch activities can be performed. You access this concept using Rosetta::Model::Group (Group) objects, each of which is attached to a single Container interface (there is no GroupStorage class). Currently, the main use of a group is to associate the same access sanction with a group of Nodes, such as to declare that they are all in a read-only state. Any Node can belong to multiple Groups, both held by the same Container interface or different Container interfaces. Adding sanctions to or removing them from a Group will affect all Nodes in it simultaneously; also, if it is invalid to apply a sanction to any particular Node (usually due to competing sanctions), then applying it to the entire Group will fail with an exception. Adding a Node to a Group that already carries sanctions will try to apply those sanctions to the Node; the addition will fail if this isn't possible; likewise, the reverse for removing a Node from a Group. A Group's defined sanctions will persist until either they are explicitly removed by that Group or the Group object is garbage collected. See also the "Access Controls" documentation section below.

FAULT TOLERANCE, DATA INTEGRITY, AND CONCURRENCY

Disclaimer: The following claims assume that only this module's published API is used, and that you do not set object properties directly or call private methods, such as is possible using XS code or a debugger. They also assume that the module is bug free, and that any errors or warnings which appear while the code is running are thrown explicitly by this module as part of its normal functioning.

Rosetta::Model is designed to ensure that the objects it produces are always internally consistent, and that the data they contain is always well-formed, regardless of the circumstances in which it is used. You should be able to fetch data from the objects at any time and that data will be self-consistent and well-formed.

Rosetta::Model also has several features to improve the reliability of its data models when those models are used concurrently (in a non-threaded fashion) by multiple program components and modules that may not be considerate of each other, or that may run into trouble part way through a multi-part model update. By using these features, you can save yourself from the substantial hassle of implementing the same checks and balances externally; you can program more lazily without some of the consequences. These features are intended primarily to stop accidental interference between program components, usually the result of programmer oversights or short cuts. To a lesser extent, the features are also designed to prevent intentional disruption, by giving an exclusive capability key of sorts to the program component that added a protection, which is required for removing it.

Structural Matters

This module does not use package variables at all, besides Perl 5 constants like $VERSION, and all symbols ($@%) declared at file level are strictly constant value declarations. No object should ever step on another.

Also, the separation of conceptual objects into Storage and interface components helps to prevent certain accidents resulting from bad code.

Function or Method Types and Exceptions

All Rosetta::Model functions and methods, except a handful of Container object boolean property accessors, are either "getters" (which read and return or generate values but do not change the state of anything) or "setters" (which change the state of something but do not return anything on success); none do getting or setting conditionally based on their arguments. While this means there are more methods in total, I see this arrangement as being more stable and reliable, plus each method is simpler and easier to understand or use; argument lists and possible return values are also less variable and more predictable.

All "setter" functions or methods which are supposed to change the state of something will throw an exception on failure (usually from being given bad arguments); on success, they officially have no return values. A thrown exception will always include details of what went wrong (and where and how) in a machine-readable (and generally human readable) format, so that calling code which catches them can recover gracefully. The methods are all structured so that, at least per individual Node attribute, they check all preconditions prior to changing any state information. So one can assume that upon throwing an exception, the Node and Container objects are in a consistent or recoverable state at worst, and are completely unchanged at best.

All "getter" functions or methods will officially return the value or construct that was asked for; if said value doesn't (yet or ever) exist, then this means the Perl "undefined" value. When given bad arguments, generally this module's "information" functions will return the undefined value, and all the other functions/methods will throw an exception like the "setter" functions do.
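
The conventions above can be illustrated with a minimal, self-contained sketch. The package Sketch::Node and its two methods are hypothetical stand-ins invented for this example and are not part of Rosetta::Model; the sketch only shows a "setter" that checks every precondition before changing any state and returns nothing on success, and a "getter" that returns the undefined value when nothing is set.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT part of Rosetta::Model.
package Sketch::Node;

sub new {
    my ($class) = @_;
    return bless { attrs => {} }, $class;
}

# "Setter": validates all preconditions first, then changes state;
# throws on failure and returns nothing on success.
sub set_attribute {
    my ($self, $name, $value) = @_;
    die "set_attribute(): missing ATTR_NAME argument\n"
        if !defined $name;
    die "set_attribute(): '$name' must be a positive integer\n"
        if $name eq 'id' and (!defined $value or $value !~ m/^[1-9][0-9]*$/);
    $self->{attrs}{$name} = $value;
    return;
}

# "Getter": reads state only; returns undef if the value is not set.
sub get_attribute {
    my ($self, $name) = @_;
    return $self->{attrs}{$name};
}

package main;

my $node = Sketch::Node->new();
$node->set_attribute( 'id', 7 );

# Bad input throws; catching it leaves the object exactly as it was.
my $ok = eval { $node->set_attribute( 'id', 'seven' ); 1 };
my $error = $@;
```

Because the validation happens before the assignment, the failed call cannot leave a half-applied change behind; the caller can report $error and carry on with the same object.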

Input Validation

Generally speaking, if Rosetta::Model throws an exception, it means one of two things: 1. Your own code is not invoking it correctly, meaning you have something to fix; 2. You have decided to let it validate some of your input data for you (which is quite appropriate).

Rosetta::Model objects will not lose their well-formedness regardless of what kind of bad input data you provide to object methods or module functions. Providing bad input data will cause the module to throw an exception; if you catch this and the program continues running (such as to chide the user and have them try entering correct input), then the objects will remain un-corrupted and able to accept new input or give proper output. In all cases, the object will be in the same state as it was before the public method was called with the bad input.

Note that, while Rosetta::Model objects are always internally consistent, that doesn't mean that the data they store is always correct. Only the most critical kinds of input validation, constantly applied constraints, are done prior to storing or altering the data. Many kinds of data validation, deferrable constraints, are only done as a separate stage following its input; you apply those explicitly by invoking assert_deferrable_constraints(). One reason for the separation is to avoid spurious exceptions such as 'mandatory attribute not set' while a Node is in the middle of being created piecemeal, such as when it is storing details gathered one at a time from the user; another reason is performance efficiency, to save on a lot of redundant computation.
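
The difference between the two kinds of constraints can be sketched in a few lines of plain Perl. Everything here is an assumption made for illustration: a plain hash stands in for a Node, and set_attr() and assert_deferred() are hypothetical helpers, not this module's methods. The attribute names and values are borrowed from the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical sketch; a plain hash stands in for a Node built piecemeal.
my @mandatory = qw( si_name base_type );
my %node;

# Constantly applied constraint: checked on every single change.
sub set_attr {
    my ($name, $value) = @_;
    die "attribute name must be a non-empty string\n"
        if !defined $name or $name eq '';
    $node{$name} = $value;
    return;
}

# Deferrable constraint: checked only when explicitly asked for.
sub assert_deferred {
    my @missing = grep { !defined $node{$_} } @mandatory;
    die "mandatory attribute(s) not set: @missing\n" if @missing;
    return;
}

set_attr( 'si_name', 'entity_id' );            # no complaint about base_type yet
my $early_ok = eval { assert_deferred(); 1 };  # fails: the Node is incomplete
my $early_error = $@;
set_attr( 'base_type', 'NUM_INT' );
my $late_ok = eval { assert_deferred(); 1 };   # passes: the Node is complete
```

The first set_attr() call succeeds even though the stand-in Node is still incomplete; only the explicit assertion reports the missing attribute.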

Note also that Rosetta::Model is quite strict in its own argument checking, both for internal simplicity and robustness, and so that code which *reads* data from it can be simpler. If you want your own program to be more liberal in what input it accepts, then you will have to bear the burden of cleaning up or interpreting that input, or delegating such work elsewhere. (Or perhaps someone may want to make a wrapper module to do this?)

ACID Compliance and Transactions

CAVEAT: THE FEATURES DESCRIBED IN THIS SECTION ARE ONLY PARTLY IMPLEMENTED.

Rosetta::Model objects natively have all aspects of ACID (Atomicity, Consistency, Isolation, Durability) compliance that are possible for something that exists solely in volatile RAM; that is, nothing aside from a power outage, or violation of its API (by employing XS code or a debugger), or access from multiple threads that is not externally synchronized, should nullify its being ACID compliant.

Rosetta::Model has a transactional interface, with a by-the-Node locking granularity. Multiple transactions can be active concurrently, and they all have serializable isolation from each other; no one transaction will see the changes made to a model by another until they are committed.

Each Rosetta::Model Container interface represents one distinct, concurrent transaction against a Container and its Nodes. To start a new transaction, which happens to be nested, invoke new_interface(). Note that this will only succeed if invoked on a Container object or on a Node that is committed; invoking it on an un-committed Node will throw an exception, since the new transaction won't be allowed to see the Node interface it is supposed to return.

Invoking commit_transaction() on a Container or Node will preserve any yet un-committed changes made by it, so they become visible to all transactions; invoking rollback_transaction() on the same will undo all un-committed changes; invoking set_transaction_savepoint() will divide the previous and subsequent un-committed changes so that the latter can be rolled back independently of the former. Note that, if any Container interface is garbage collected while it has un-committed changes, those changes will all be implicitly rolled back.

Every externally invoked function and method is implicitly atomic; it will either succeed completely, or leave no changes behind when it throws an error exception, regardless of the complexity of the task. For simpler cases, such as single-attribute changes, this is implemented by simply validating all preconditions before making any changes. For more complex cases, such as multiple-attribute changes, this is implemented by setting a new savepoint prior to attempting any of the component changes, and rolling back to it if any of the component changes failed.
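
The savepoint-and-rollback approach to multiple-attribute changes can be sketched as follows. This is not the module's implementation; a plain hash stands in for a Node's attributes, a shallow copy stands in for a savepoint, and set_one() and set_many_atomically() are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical sketch; a plain hash stands in for a Node's attributes.
my %node_attrs = ( si_name => 'person', row_count => 0 );

# Validate and apply one change; throws on bad input.
sub set_one {
    my ($attrs, $name, $value) = @_;
    die "attribute '$name' may not be undefined\n" if !defined $value;
    $attrs->{$name} = $value;
    return;
}

# Apply several changes as a unit: take a savepoint first, and restore
# it if any component change fails, then re-throw the error.
sub set_many_atomically {
    my ($attrs, %changes) = @_;
    my %savepoint = %{$attrs};    # shallow copy is enough for scalars
    my $ok = eval {
        set_one( $attrs, $_, $changes{$_} ) for sort keys %changes;
        1;
    };
    if (!$ok) {
        my $error = $@;
        %{$attrs} = %savepoint;   # roll back to the savepoint
        die $error;
    }
    return;
}

# 'row_count' is applied before 'si_name' fails; both are rolled back.
my $succeeded = eval {
    set_many_atomically( \%node_attrs, row_count => 5, si_name => undef );
    1;
};
my $failure = $@;
```

After the failed call, %node_attrs holds exactly the values it started with, which is the "succeed completely or leave no changes behind" behaviour described above.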

Note that, if you only ever have a single Container interface for a given ContainerStorage, then you don't have to issue commits for visibility, because all Nodes will be visible and editable to you anyway.

Access Controls and Reentrancy

CAVEAT: THE FEATURES DESCRIBED IN THIS SECTION ARE ONLY PARTLY IMPLEMENTED.

By default, all Nodes in a Container are visible and editable through all Container interfaces, which works well when your program is simple and/or well implemented so that no part of it or modules that it uses step on each other.

For other situations, Rosetta::Model provides several types of access controls that permit user programs and modules to get some protection against interference from each other, whether due to accidents of lazy programming or to bad policies; they can force arbitrarily large parts of a Rosetta::Model model to be reentrant for as long as needed. These access control features are provided by Group objects, and apply to member Nodes of those Groups.

You use the write-prevention access controls when you want to guarantee read consistency for yourself, or have made other assumptions based on a Node that you expect to stay true for a while, such as when you have cached some derived data from a Node group that you don't want to go stale. In other words, this feature lets you avoid "dirty reads" of the model with the least trouble.

Invoking impose_write_block() on a Group will prevent anyone from editing or deleting its member Nodes, whether by way of the Group's own Container interface or any other Container interface. This block will not prevent the editing or deletion of any member Nodes' non-member ancestor or descendent Nodes; if you need that protection, such as because most Nodes are partly defined by their relatives, add them to this group too, or to another write-blocked group. As a complement to this first block type, invoking impose_child_addition_block() or impose_reference_addition_block() on a Group will prevent anyone from adding, respectively, new primary-child Nodes or new link-child Nodes, to its member Nodes, whether by way of the Group's own Container interface or any other Container interface. Between the 3 block types, you can block additions, edits, and deletes by anyone. A Node can belong to any number of Groups at the same time which impose any combination of these 3 block types, and the Node can be read by all Container interfaces. You can remove blocks by invoking the respective Group methods remove_write_block(), remove_child_addition_block(), or remove_reference_addition_block().

The Node method move_before_sibling() is different in nature than any of the above circumstances; for now it will be blocked if the SELF or SIBLING Node has a 'write block', or if the PARENT has a 'child/reference addition block' (whichever is appropriate for the type of PARENT given). This is because that method is conceptually used just after a parent-child connection is made.

The mutex access control is for when you are editing some Nodes and you don't want others to either read them in an inconsistent state or to attempt editing them at the same time.

Invoking impose_mutex() on a Group will prevent all Container interfaces except for the Group's own from seeing its member Nodes, or editing or deleting them, or adding child Nodes to them. This access control can not be used in concert with the previous 3 write-prevention access controls; a Group must disable any of those 3 controls that it holds before it can enable the mutex. A Node can also only belong to a single Group when that group is imposing a mutex. You can remove the mutex by invoking the Group method remove_mutex().

Mutex controls are used implicitly by transactions. Any Node that you create or edit or delete must be held by a mutex-imposing Group belonging to the same Container interface as the transaction. If the Node is held by any other Group, the write action will fail with an exception. If the Node is held by no Group at all, it will automatically be added to a default mutex-imposing Group that is held by the Container interface. Any Node that has un-committed changes may not be removed from its Group, nor may the Group have its mutex-imposition disabled; you must either commit or rollback those changes first. Note that, following a commit or rollback, all Nodes in a Container interface's default mutex-imposing Group will be removed from it, so they are visible to anyone. By contrast, any other mutex-imposing Groups will retain their members following a commit or rollback, so they remain hidden from everyone else. Note that any transaction commit/rollback/savepoint activity will affect all Groups under it.

Multiple Thread Concurrency

Rosetta::Model is explicitly not thread-aware (thread-safe); it contains no code to synchronize access to its objects' properties, such as semaphores or locks or mutexes. To internalize the details of such things in an effective manner would have made the code a lot more complex than it is now, with few clear benefits. However, this module can be used in multi-threaded environments where the application/caller code takes care of synchronizing access to its objects, especially if the application uses coarse-grained read or write locks, by locking an entire Container at once.

But be aware that the current structure of Rosetta::Model objects, where the ContainerStorage you actually want to synchronize on isn't accessible to you, prevents you from using the Rosetta::Model object itself as the semaphore. Therefore, you have to maintain a separate single variable of your own, used in conjunction with all related Container interfaces, which you use as a semaphore or monitor or some such. Even then, you may encounter problems, such as that Perl doesn't actually share all Nodes in a Container with multiple threads when their Container is shared; or maybe this problem won't happen. Still, the author has never actually tried to use multiple threads.

The author's expectation is that this module will be mainly used in circumstances where the majority of actions are reads, and there are very few writes, such as with a data dictionary; perhaps all the writes on an object may be when it is first created. An application thread would obtain a read lock/semaphore on a Container object during the period for which it needs to ensure read consistency; it would block write lock attempts but not other read locks. It would obtain a write lock during the (usually short) period it needs to change something, which blocks all other lock attempts (for read or write).
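
The compatibility rules between read and write locks described above can be written down as a small bookkeeping sketch. The class Sketch::RWLock is hypothetical and does no real thread synchronization; an application would have to build the actual blocking on top of whatever threading facilities it uses.

```perl
use strict;
use warnings;

# Hypothetical bookkeeping-only sketch of the lock compatibility rules;
# it does no real thread synchronization.
package Sketch::RWLock;

sub new {
    my ($class) = @_;
    return bless { readers => 0, writer => 0 }, $class;
}

# A read lock is compatible with other read locks, but not a write lock.
sub try_read_lock {
    my ($self) = @_;
    return 0 if $self->{writer};
    $self->{readers}++;
    return 1;
}

# A write lock is compatible with nothing else.
sub try_write_lock {
    my ($self) = @_;
    return 0 if $self->{writer} or $self->{readers};
    $self->{writer} = 1;
    return 1;
}

sub unlock_read {
    my ($self) = @_;
    $self->{readers}-- if $self->{readers};
    return;
}

sub unlock_write {
    my ($self) = @_;
    $self->{writer} = 0;
    return;
}

package main;

my $lock = Sketch::RWLock->new();
my $r1 = $lock->try_read_lock();    # granted
my $r2 = $lock->try_read_lock();    # granted: reads do not block reads
my $w1 = $lock->try_write_lock();   # refused: readers are active
$lock->unlock_read() for 1 .. 2;
my $w2 = $lock->try_write_lock();   # granted: nothing else holds the lock
my $r3 = $lock->try_read_lock();    # refused: a writer is active
```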

An example of this is a web server environment where each page request is being handled by a distinct thread, and all the threads share one Rosetta::Model object; normally the object is instantiated when the server starts, and the worker threads then read from it for guidance in using a common database. Occasionally a thread will want to change the object, such as to correspond to a simultaneous change to the database schema, or to the web application's data dictionary that maps the database to application screens. Under this situation, the application's definitive data dictionary (stored partly or wholly in a Rosetta::Model) can occupy one place in RAM visible to all threads, and each thread won't have to keep looking somewhere else such as in the database or a file to keep up with the definitive copy. (Of course, any *changes* to the in-memory data dictionary should see a corresponding update to a non-volatile copy, like in an on-disk database or file.)

Note that, while a nice thing to do may be to manage a coarse-grained lock in Rosetta::Model, with the caller invoking lock_to_read() or lock_to_write() or unlock() methods on it, Perl's thread->lock() mechanism is purely context based; the moment lock_to_...() returns, the object has unlocked again. Of course, if you know a clean way around this, I would be happy to hear it.

NODE IDENTITY ATTRIBUTES

Every Node has a positive integer 'id' attribute whose value is distinct among every Node in the same Container, without regard for the Node type. These 'id' attributes are used internally by Rosetta::Model when linking child and parent Nodes, but they have no use at all in any SQL statement strings that are generated for a typical database engine. Rather, Rosetta::Model defines a second, "surrogate id" ("SI") attribute for most Node types whose value actually is used in SQL statement strings, and corresponds to the SQL:2003 concept of a "SQL identifier" (such as a table name or a table column name). This attribute name varies by the Node type, but always looks like "si_*"; at least half of the time, the exact name is "si_name". If the SI attribute of a given Node is a literal or enumerated attribute, then its value is used directly for SQL strings; if it is a Node-ref attribute, then the SI attribute of the Node that it points to is used likewise for SQL strings in place of the given Node (this will recurse until we hold a Node whose SI attribute is not a Node-ref). In this way, Rosetta::Model stores the "SQL identifier" value of each Node exactly once, and if it is edited then all SQL references to it automatically update.
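
The recursive lookup of a surrogate id through Node-ref attributes can be sketched with plain hashes. This is only an illustration of the idea: the hashes, the 'si' key, and resolve_si() are hypothetical and do not reflect how the module stores Nodes, and the sketch has no protection against reference cycles.

```perl
use strict;
use warnings;

# Hypothetical sketch; plain hashes stand in for Nodes.  Each has an 'si'
# entry that is either a literal value or a reference to another Node.
my %table      = ( id => 1, si => 'person' );
my %view_src   = ( id => 2, si => \%table );     # Node-ref SI
my %view_field = ( id => 3, si => \%view_src );  # Node-ref to a Node-ref

# Follow Node-ref SI attributes until a literal value is reached.
sub resolve_si {
    my ($node) = @_;
    my $si = $node->{si};
    while (ref $si eq 'HASH') {
        $si = $si->{si};
    }
    return $si;
}

my $before = resolve_si( \%view_field );   # 'person'

# The identifier is stored exactly once, so one edit updates every reference.
$table{si} = 'people';
my $after = resolve_si( \%view_field );    # 'people'
```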

To make Rosetta::Model easier to use in the common cases, this module will treat all Node types as having a surrogate id attribute. In the few cases where a Node type has no distinct attribute for it (such as "view_expr" Nodes), this module will automatically use the 'id' (Node Id) attribute instead.

When the surrogate identifier attribute of a single Node is referenced on its own, this corresponds to the SQL:2003 concept of an "unqualified identifier". Under trivial circumstances, a Node can only be referenced by another Node using the former's unqualified identifier if either the first Node is a member of the second Node's primary-parent chain, or the first Node is a sibling of any member of said pp chain, or the first Node has a pseudo-Node primary parent.

Rosetta::Model implements context sensitive constraints for what values a given Node's surrogate identifier can have, and in what contexts other Nodes must be located wherein it is valid for them to link to the given Node and be its children; these go beyond the earlier-applied, more basic constraints that restrict parent-child connections based on the Node types and/or presence in the same Container of the Nodes involved. The rules for SI values are similar to the rules in typical programming languages concerning the declaration of variables and the scope in which they are visible to be referenced. One constraint is that all Nodes which share the same primary parent Node or pseudo-Node must have distinct SI values with respect to each other, regardless of Node type (eg, all columns and indexes in a table must have distinct names). Another constraint, for general circumstances, is that in order for one given Node to reference another as its parent Node, that other Node must either belong to the given Node's primary-parent chain, or it must be a direct sibling of either the given Node or a Node in its primary-parent chain, or it must be a primary-child of one of the primary parent Node's parent Nodes.

NOTE THAT FURTHER DOCUMENTATION ABOUT LINKING BY SURROGATE IDS IS PENDING.

CONSTRUCTOR WRAPPER FUNCTIONS

These functions are stateless and can be invoked off of either the module name, or any package name in this module, or any object created by this module; they are thin wrappers over other methods and exist strictly for convenience.

new_container()

    my $model = Rosetta::Model->new_container();
    my $model2 = Rosetta::Model::Container->new_container();
    my $model3 = Rosetta::Model::Node->new_container();
    my $model4 = Rosetta::Model::Group->new_container();
    my $model5 = $model->new_container();
    my $model6 = $node->new_container();
    my $model7 = $group->new_container();

This function wraps Rosetta::Model::Container->new().

new_node( CONTAINER, NODE_TYPE[, NODE_ID] )

    my $node = Rosetta::Model->new_node( $model, 'table' );
    my $node2 = Rosetta::Model::Container->new_node( $model, 'table' );
    my $node3 = Rosetta::Model::Node->new_node( $model, 'table', 32 );
    my $node4 = Rosetta::Model::Group->new_node( $model, 'table_field', 6 );
    my $node5 = $model->new_node( $model, 'table', 45 );
    my $node6 = $node->new_node( $model, 'table' );
    my $node7 = $group->new_node( $model, 'view' );

This function wraps Rosetta::Model::Node->new( CONTAINER, NODE_TYPE, NODE_ID ).

new_group()

    my $group = Rosetta::Model->new_group();
    my $group2 = Rosetta::Model::Container->new_group();
    my $group3 = Rosetta::Model::Node->new_group();
    my $group4 = Rosetta::Model::Group->new_group();
    my $group5 = $model->new_group();
    my $group6 = $node->new_group();
    my $group7 = $group->new_group();

This function wraps Rosetta::Model::Group->new().

CONTAINER CONSTRUCTOR FUNCTIONS

This function is stateless and can be invoked off of either the Container class name or an existing Container object, with the same result.

new()

    my $model = Rosetta::Model::Container->new();
    my $model2 = $model->new();

This "getter" function will create and return a single Container object.

CONTAINER OBJECT METHODS

These methods are stateful and may only be invoked off of Container objects.

new_interface()

This method creates and returns a new Container interface object that is a ContainerStorage sharing peer of the invocant Container interface object.

get_self_id()

This method returns a character string value that distinctly represents this Container interface object's inner ContainerStorage object; you can use it to see if 2 Container interface objects have the same ContainerStorage object common to them, which conceptually means that the two Container interface objects are in fact one and the same.

auto_assert_deferrable_constraints([ NEW_VALUE ])

This method returns this Container interface's "auto assert deferrable constraints" boolean property; if NEW_VALUE is defined, it will first set that property to it. When this flag is true, Rosetta::Model's build_*() methods will automatically invoke assert_deferrable_constraints() on each Node newly created by way of this Container interface (or child Node interfaces), prior to returning it. The use of this method helps isolate bad input bugs faster by flagging them closer to when they were created; it is especially useful with the build*tree() methods.

auto_set_node_ids([ NEW_VALUE ])

This method returns this Container interface's "auto set node ids" boolean property; if NEW_VALUE is defined, it will first set that property to it. When this flag is true, Rosetta::Model will automatically generate and set a Node Id for a Node being created for this Container interface when there is no explicit Id given as a Node.new() argument. When this flag is false, a missing Node Id argument will cause an exception to be raised instead.

may_match_surrogate_node_ids([ NEW_VALUE ])

This method returns this Container interface's "may match surrogate node ids" boolean property; if NEW_VALUE is defined, it will first set that property to it. When this flag is true, Rosetta::Model will accept a wider range of input values when setting Node ref attribute values, beyond Node object references and integers representing Node ids to look up; if other types of values are provided, Rosetta::Model will try to look up Nodes based on their Surrogate Id attribute, usually 'si_name', before giving up on finding a Node to link.

delete_node_tree()

This "setter" method will delete all of the Nodes in this Container. The semantics are like invoking delete_node() on every Node in the appropriate order, except for being a lot faster.

get_child_nodes([ NODE_TYPE ])

    my $ra_node_list = $model->get_child_nodes();
    my $ra_node_list2 = $model->get_child_nodes( 'catalog' );

This "getter" method returns a list of this Container's primary-child Nodes, in a new array ref. A Container's primary-child Nodes are defined as being all Nodes in the Container whose Node Type defines them as always having a pseudo-Node parent. If the optional argument NODE_TYPE is defined, then only child Nodes of that Node Type are returned; otherwise, all child Nodes are returned. All Nodes are returned in the same order they were added.

find_node_by_id( NODE_ID )

    my $node = $model->find_node_by_id( 1 );

This "getter" method searches for a member Node of this Container whose Node Id matches the NODE_ID argument; if one is found then it is returned by reference; if none is found, then undef is returned. Since Rosetta::Model guarantees that all Nodes in a Container have a distinct Node Id, there will never be more than one Node returned. The search is also very fast, and takes the same amount of time regardless of how many Nodes are in the Container.

find_child_node_by_surrogate_id( TARGET_ATTR_VALUE )

    my $data_type_node = $model->find_child_node_by_surrogate_id( 'str100' );

This "getter" method searches for a Node in this Container whose Surrogate Node Id matches the TARGET_ATTR_VALUE argument; if one is found then it is returned by reference; if none is found, then undef is returned. The TARGET_ATTR_VALUE argument is treated as an array-ref; if it is in fact not one, that single value is used as a single-element array. The search is multi-generational, one generation more childwards per array element. If the first array element is undef, then we assume it follows the format that Node.get_surrogate_id_chain() outputs; the first 3 elements are [undef,'root',<l2-psn>] and the 4th element matches a Node that has a pseudo-Node parent. If the first array element is not undef, we treat it as the aforementioned 4th element. If a valid <l2-psn> is extracted, then only child Nodes of that pseudo-Node are searched; otherwise, the child Nodes of all pseudo-Nodes are searched.
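
The way the TARGET_ATTR_VALUE argument is interpreted can be sketched as below. The helper split_target() is hypothetical and is not a method of this module, and the strings 'some_l2_psn' and 'person' are placeholder values; the sketch only mirrors the rules stated above for turning a scalar into a single-element array and for recognising the [undef,'root',<l2-psn>] prefix.

```perl
use strict;
use warnings;

# Hypothetical helper; mirrors only the argument rules described above.
# Returns the <l2-psn> name (or undef) and the remaining search chain.
sub split_target {
    my ($target) = @_;

    # A non-array value is treated as a single-element array.
    my @chain = ref $target eq 'ARRAY' ? @{$target} : ($target);

    my $l2_psn;
    if (@chain and !defined $chain[0]) {
        # Chain format: the first 3 elements are [undef,'root',<l2-psn>].
        (undef, undef, $l2_psn) = splice @chain, 0, 3;
    }
    return ($l2_psn, \@chain);
}

# A bare scalar: no pseudo-Node named, one generation to search.
my ($psn_a, $chain_a) = split_target( 'str100' );

# A full chain: pseudo-Node named, two generations to search.
my ($psn_b, $chain_b) = split_target(
    [ undef, 'root', 'some_l2_psn', 'person', 'person_id' ] );
```

In the first call no pseudo-Node is identified, which corresponds to searching the child Nodes of all pseudo-Nodes; in the second, the search would be limited to the named pseudo-Node and would descend one generation per remaining element.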

get_next_free_node_id()

    my $node_id = $model->get_next_free_node_id();

This "getter" method returns an integer which is valid for use as the Node ID of a new Node that is going to be put in this Container. Its value is 1 higher than the highest Node ID for any Node that is already in the Container, or had been before; in a brand new Container, its value is 1. You can use this method like a sequence generator to produce Node Ids for you rather than you producing them in some other way. An example situation when this method might be useful is if you are building a Rosetta::Model by scanning the schema of an existing database. This property will never decrease in value during the life of the Container, and it can not be externally edited.

get_edit_count()

    my $count_sample = $model->get_edit_count();

This "getter" method will return the integral "edit count" property of this Container, which counts the changes to this Container's Node set. Its value starts at zero in a brand new Container, and is incremented by 1 each time a Node is edited in, added to, or deleted from the Container. The actual value of this property isn't important; rather the fact that it did or didn't change between two arbitrary samplings is what's important; it lets the sampler know that something changed since the previous sample was taken. This property will never decrease in value during the life of the Container, and it can not be externally edited. This feature is designed to assist external code that caches information derived from a Rosetta::Model model, such as generated SQL strings or Perl closures, so that it can easily tell when the cache may have become stale (leading to a cache flush).
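
A cache that relies on the edit count might look like the following sketch. Only the method name get_edit_count() comes from this documentation; Sketch::Container is a hypothetical stand-in object used so the example runs on its own, and its touch() method exists only to simulate an edit.

```perl
use strict;
use warnings;

# Stand-in for a Container; only get_edit_count() is taken from the text
# above, and touch() is a hypothetical way to simulate an edit.
package Sketch::Container;

sub new            { my ($class) = @_; return bless { edit_count => 0 }, $class; }
sub get_edit_count { my ($self) = @_; return $self->{edit_count}; }
sub touch          { my ($self) = @_; $self->{edit_count}++; return; }

package main;

my $model    = Sketch::Container->new();
my %cache    = ( edit_count => -1, value => undef );
my $rebuilds = 0;

# Return the cached value, rebuilding it only when the sample differs
# from the one taken when the cache was last filled.
sub cached_sql {
    my ($container) = @_;
    my $sample = $container->get_edit_count();
    if ($sample != $cache{edit_count}) {
        $rebuilds++;
        $cache{value}      = "-- SQL derived at edit count $sample";
        $cache{edit_count} = $sample;
    }
    return $cache{value};
}

cached_sql( $model );   # first call builds the cache
cached_sql( $model );   # unchanged edit count: cache is reused
$model->touch();        # simulate a Node being edited
cached_sql( $model );   # edit count changed: cache is rebuilt
```

The comparison uses inequality rather than "greater than" because, as noted above, only the fact of a change matters, not the actual value.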

deferrable_constraints_are_tested()

    my $is_all_ok = $model->deferrable_constraints_are_tested();

This "getter" method returns true if, following the last time this Container's Node set was edited, the Container.assert_deferrable_constraints() method had completed all of its tests without finding any problems (meaning that all Nodes in this Container are known to be free of all data errors, both individually and collectively); this method returns false if any of the tests failed, or the test suite was never run (including with a new empty Container, since a Container completely devoid of Nodes may violate a deferrable constraint).

assert_deferrable_constraints()

    $model->assert_deferrable_constraints();

This "getter" method implements several types of deferrable data validation, to make sure that every Node in this Container is ready to be used, both individually and collectively; it throws an exception if it finds anything wrong. Note that a failure with any one Node will cause the testing of the whole set to abort, as the offending Node throws an exception which this method doesn't catch; any untested Nodes could also have failed, so you will have to re-run this method after fixing the problem. This method will short-circuit and not perform any tests if this Container's "deferrable constraints are tested" property is equal to its "edit count" property, so as to avoid unnecessary repeated tests due to redundant external invocations; this allows you to put validation checks for safety everywhere in your program while avoiding a corresponding performance hit. This method will update "deferrable constraints are tested" to match "edit count" when all tests pass; it isn't updated at any other time.

NODE CONSTRUCTOR FUNCTIONS

This function is stateless and can be invoked off of either the Node class name or an existing Node object, with the same result.

new( CONTAINER, NODE_TYPE[, NODE_ID] )

    my $node = Rosetta::Model::Node->new( $model, 'table' );
    my $node2 = Rosetta::Model::Node->new( $model, 'table', 27 );
    my $node3 = $node->new( $model, 'table' );
    my $node4 = $node->new( $model, 'table', 42 );

This "getter" function will create and return a single Node object that lives in the Container object given in the CONTAINER argument; the new Node will live in that Container for its whole life; if you want to conceptually move it to a different Container, you must clone the Node and then delete the old one. The Node Type of the new Node is given in the NODE_TYPE (enum) argument, and it can not be changed for this Node later; only specific values are allowed, which you can see in the Rosetta::Language documentation file. The Node Id can be explicitly given in the NODE_ID (uint) argument, which is a mandatory argument unless the host Container has a true "auto set node ids" property, in which case a Node Id will be generated from a sequence if the argument is not given. All of the Node's other properties are defaulted to an "empty" state.

NODE OBJECT METHODS

These methods are stateful and may only be invoked off of Node objects.

new_interface()

This method creates and returns a new Node interface object that is a NodeStorage sharing peer of the invocant Node interface object, and it also lives in a new Container interface object.

get_self_id()

This method returns a character string value that distinctly represents this Node interface object's inner NodeStorage object; you can use it to see if 2 Node interface objects have the same NodeStorage object common to them, which conceptually means that the two Node interface objects are in fact one and the same.

delete_node()

This "setter" method will destroy the Node object that it is invoked from, if it can; it does this by clearing all references between this Node and its parent Nodes and its Container, whereby it can then be garbage collected. You are only allowed to delete Nodes that don't have child Nodes; failing this, you must unlink the children from it or delete them first. After invoking this method you should let your external reference to the Node expire so the remnants are garbage collected.

delete_node_tree()

This "setter" method is like delete_node() except that it will also delete all of the invocant's primary descendant Nodes. Prior to deleting any Nodes, this method will first assert that every deletion candidate does not have any child (referencing) Nodes which live outside the tree being deleted. So this method will either succeed entirely or have no lasting effects. Note that this method, as well as delete_node(), is mainly intended for use by long-lived applications that continuously generate database commands at run-time and only use each one once, such as a naive interactive SQL shell, so as to save on memory use; by contrast, many other kinds of applications have a limited and pre-determined set of database commands that they use, and they may not need this method at all.

get_container()

    my $model = $node->get_container();

This "getter" method returns the Container object which this Node lives in.

get_node_type()

    my $type = $node->get_node_type();

This "getter" method returns the Node Type scalar (enum) property of this Node. You can not change this property on an existing Node, but you can set it on a new one.

get_node_id()

This "getter" method will return the integral Node Id property of this Node.

set_node_id( NEW_ID )

This "setter" method will replace this Node's Id property with the new value given in NEW_ID if it can; the replacement will fail if some other Node with the same Node Id already exists in the same Container.

get_primary_parent_attribute()

    my $parent = $node->get_primary_parent_attribute();

This "getter" method returns the primary parent Node of the current Node, if there is one.

clear_primary_parent_attribute()

This "setter" method will clear this Node's primary parent attribute value, if it has one.

set_primary_parent_attribute( ATTR_VALUE )

This "setter" method will set or replace this Node's primary parent attribute value, if it has one, giving it the new value specified in ATTR_VALUE.

get_surrogate_id_attribute([ GET_TARGET_SI ])

This "getter" method will return the value for this Node's surrogate id attribute. The GET_TARGET_SI argument is relevant only for Node-ref attributes; its effects are explained by get_attribute().

clear_surrogate_id_attribute()

This "setter" method will clear this Node's surrogate id attribute value, unless it is 'id', in which case it throws an exception instead.

set_surrogate_id_attribute( ATTR_VALUE )

This "setter" method will set or replace this Node's surrogate id attribute value, giving it the new value specified in ATTR_VALUE.

get_attribute( ATTR_NAME[, GET_TARGET_SI] )

    my $curr_val = $node->get_attribute( 'si_name' );

This "getter" method will return the value for this Node's attribute named in the ATTR_NAME argument, if it is set, or undef if it isn't. With Node-ref attributes, the returned value is a Node ref by default; if the optional boolean argument GET_TARGET_SI is true, then this method will instead look up (recursively) and return the target Node's literal or enumerated surrogate id value (a single value, not a chain).

get_attributes([ GET_TARGET_SI ])

    my $rh_attrs = $node->get_attributes();

This "getter" method will fetch all of this Node's attributes, returning them in a Hash ref. Each attribute value is returned in the format specified by get_attribute( <attr-name>, GET_TARGET_SI ).

clear_attribute( ATTR_NAME )

This "setter" method will clear this Node's attribute named in the ATTR_NAME argument, unless ATTR_NAME is 'id'; since the Node Id attribute has a constantly applied mandatory constraint, this method will throw an exception if you try. With Node-ref attributes, the other Node being referred to will also have its child list reciprocal link to the current Node cleared.

clear_attributes()

This "setter" method will clear all of this Node's attributes, except 'id'; see the clear_attribute() documentation for the semantics.

set_attribute( ATTR_NAME, ATTR_VALUE )

This "setter" method will set or replace this Node's attribute named in the ATTR_NAME argument, giving it the new value specified in ATTR_VALUE; in the case of a replacement, the semantics of this method are like invoking clear_attribute( ATTR_NAME ) before setting the new value. With Node-ref attributes, ATTR_VALUE may either be a perl reference to a Node, or a Node Id value, or a relative Surrogate Node Id value (scalar or array ref); the last one will only work if this Node is in a Container that matches Nodes by surrogate ids. With Node-ref attributes, when setting a new value this method will also add the current Node to the other Node's child list.

set_attributes( ATTRS )

    $node->set_attributes( $rh_attrs );

This "setter" method will set or replace multiple Node attributes, whose names and values are specified by keys and values of the ATTRS hash ref argument; this method behaves like invoking set_attribute() for each key/value pair.

move_before_sibling( SIBLING[, PARENT] )

This "setter" method allows you to change the order of child Nodes under a common parent Node; specifically, it moves the current Node to a position just above/before the sibling Node specified in the SIBLING Node ref argument, if it can. Since a Node can have multiple parent Nodes (and the sibling likewise), the optional PARENT argument lets you specify which parent's child list you want to move within; if you do not provide a PARENT value, then the current Node's primary parent Node (or pseudo-Node) is used, if possible. This method will throw an exception if the current Node and the specified sibling or parent Nodes are not appropriately related to each other (parent <-> child). If you want to move the current Node to follow the sibling instead, then invoke this method on the sibling.

get_child_nodes([ NODE_TYPE ])

    my $ra_node_list = $table_node->get_child_nodes();
    my $ra_node_list = $table_node->get_child_nodes( 'table_field' );

This "getter" method returns a list of this Node's primary-child Nodes, in a new array ref. If the optional argument NODE_TYPE is defined, then only child Nodes of that Node Type are returned; otherwise, all child Nodes are returned. All Nodes are returned in the same order they were added.

add_child_node( CHILD )

    $node->add_child_node( $child );

This "setter" method allows you to add a new primary-child Node to this Node, which is provided as the single CHILD Node ref argument. The new child Node is appended to the list of existing child Nodes, and the current Node becomes the new or first primary parent Node of CHILD.

add_child_nodes( CHILDREN )

    $model->add_child_nodes( [$child1,$child2] );

This "setter" method takes an array ref in its single CHILDREN argument, and calls add_child_node() for each element found in it. This method does not return anything.

get_referencing_nodes([ NODE_TYPE ])

    my $ra_node_list = $row_data_type_node->get_referencing_nodes();
    my $ra_node_list = $row_data_type_node->get_referencing_nodes( 'table' );

This "getter" method returns a list of this Node's link-child Nodes (which are other Nodes that refer to this one in a non-PP nref attribute) in a new array ref. If the optional argument NODE_TYPE is defined, then only child Nodes of that Node Type are returned; otherwise, all child Nodes are returned. All Nodes are returned in the same order they were added.

get_surrogate_id_chain()

This "getter" method returns the current Node's surrogate id chain as an array ref. This method's return value is conceptually like what get_relative_surrogate_id() returns except that it is defined in an absolute context rather than a relative context, making it less suitable for serializing models that are subsequently reorganized; it is also much longer.

find_node_by_surrogate_id( SELF_ATTR_NAME, TARGET_ATTR_VALUE )

    my $table_node = $table_index_node->find_node_by_surrogate_id( 'f_table', 'person' );

This "getter" method's first argument SELF_ATTR_NAME must match a valid Node-ref attribute name of the current Node (but not the 'pp' attribute); it throws an exception otherwise. This method searches for member Nodes of this Node's host Container whose type makes them legally linkable by the SELF_ATTR_NAME attribute and whose Surrogate Node Id matches the TARGET_ATTR_VALUE argument (which may be either a scalar or a Perl array ref); if one or more are found in the same scope then they are all returned by reference, within an array ref; if none are found, then undef is returned. The search is context sensitive and scoped; it starts by examining the Nodes that are closest to this Node's location in its Container's Node tree and then spreads outwards, looking within just the locations that are supposed to be visible to this Node. Variations on this search method come into play for Node types that are connected to the concepts of "wrapper attribute", "ancestor attribute correlation", and "remotely addressable types". Note that it is illegal to use the return value of get_surrogate_id_chain() as a TARGET_ATTR_VALUE with this method; you can give the return value of get_relative_surrogate_id(), however. Note that counting the number of return values from find_node_by_surrogate_id() is an accurate way to tell whether set_attribute() with the same 2 arguments will succeed or not; it will only succeed if this method returns exactly 1 matching Node.

find_child_node_by_surrogate_id( TARGET_ATTR_VALUE )

    my $table_index_node = $table_node->find_child_node_by_surrogate_id( 'fk_father' );

This "getter" method searches for a primary-child Node of the current Node whose Surrogate Node Id matches the TARGET_ATTR_VALUE argument; if one is found then it is returned by reference; if none is found, then undef is returned. The TARGET_ATTR_VALUE argument is treated as an array-ref; if it is in fact not one, that single value is used as a single-element array. The search is multi-generational, one generation more childwards per array element; the first element matches a child of the invoked Node, the next element the child of the first matched child, and so on. If said array ref has undef as its first element, then this method behaves the same as Container.find_child_node_by_surrogate_id( TARGET_ATTR_VALUE ).
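The "one generation per array element" descent can be pictured with a stand-alone sketch over plain hashes. This is only an illustration of the search shape, not the module's code: the 'si' and 'children' keys and the find_child_by_chain() subroutine are hypothetical, and the special case of an undef first element (which hands the search to the Container) is not modelled.

```perl
use strict;
use warnings;

# Plain-hash stand-ins for Nodes; 'si' plays the role of the surrogate id.
my $root = { 'si' => 'person', 'children' => [
    { 'si' => 'fk_father', 'children' => [
        { 'si' => 'father_id', 'children' => [] },
    ] },
    { 'si' => 'fk_mother', 'children' => [] },
] };

# Descend one generation per chain element; return the final match or undef.
sub find_child_by_chain {
    my ($node, $target) = @_;
    # A lone scalar is treated as a single-element chain.
    my @chain = ref($target) eq 'ARRAY' ? @{$target} : ($target);
    for my $si (@chain) {
        ($node) = grep { $_->{'si'} eq $si } @{ $node->{'children'} };
        return undef if !defined $node;
    }
    return $node;
}

my $hit = find_child_by_chain( $root, [ 'fk_father', 'father_id' ] );
print "found: $hit->{'si'}\n" if defined $hit;
```

Passing 'fk_mother' as a bare scalar and passing [ 'fk_mother' ] behave identically in this sketch, matching the argument handling described above.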

get_relative_surrogate_id( SELF_ATTR_NAME[, WANT_SHORTEST] )

    my $longer_table_si_value = $table_index_node->get_relative_surrogate_id( 'f_table' );
    my $shorter_table_si_value = $table_index_node->get_relative_surrogate_id( 'f_table', 1 );
    my $longer_view_src_col_si_value = $view_expr_node->get_relative_surrogate_id( 'name' );
    my $shorter_view_src_col_si_value = $view_expr_node->get_relative_surrogate_id( 'name', 1 );

This "getter" method's first argument SELF_ATTR_NAME must match a valid Node-ref attribute name of the current Node (but not the 'pp' attribute); it throws an exception otherwise. This method returns a Surrogate Node Id value (which may be either a scalar or a Perl array ref) which is appropriate for passing to find_node_by_surrogate_id() as its TARGET_ATTR_VALUE such that the current attribute value would be "found" by it, assuming the current attribute value is valid; if none can be determined, then undef is returned. This method's return value is conceptually like what get_surrogate_id_chain() returns except that it is defined in a relative context rather than an absolute context, making it more suitable for serializing models that are subsequently reorganized; it is also much shorter. Depending on context circumstances, get_relative_surrogate_id() can possibly return either fully qualified (longest), partly qualified, or unqualified (shortest) Surrogate Id values for certain Node-ref attributes. By default, it will return the fully qualified Surrogate Id in every case, which is the fastest to determine and use, and which has the most resiliency against becoming an ambiguous reference when other parts of the ROS M model are changed to have the same surrogate id attribute value as the target. If the optional boolean argument WANT_SHORTEST is set true, then this method will produce the most unqualified Surrogate Id possible that is fully unambiguous within the current state of the ROS M model; this version can be more resilient against breaking when parts of the ROS M model are moved around relative to each other, as long as there are no duplicate surrogate id attributes in the model, at the cost of being slower to determine and use. WANT_SHORTEST has no effect with the large fraction of Node refs whose fully qualified name is a single element.

assert_deferrable_constraints()

This "getter" method implements several types of deferrable data validation, to make sure that this Node is ready to be used; it throws an exception if it can find anything wrong.

GROUP CONSTRUCTOR FUNCTIONS

This function is stateless and can be invoked off of either the Group class name or an existing Group object, with the same result.

new( CONTAINER )

    my $group = Rosetta::Model::Group->new( $model );
    my $group2 = $group->new( $model );

This "getter" function will create and return a single Group object that is associated with the Container interface object given in the CONTAINER argument. The new Group has no member Nodes, and does not impose any sanctions.

GROUP OBJECT METHODS

These methods are stateful and may only be invoked off of Group objects.

TODO: ADD THESE METHODS.

CONTAINER OR NODE METHODS FOR DEBUGGING

The following 3 "getter" methods can be invoked either on Container or Node objects, and will return a tree-arranged structure having the contents of a Node and all its children (to the Nth generation). If you invoke the 3 methods on a Node, then that Node will be the root of the returned structure. If you invoke them on a Container, then a few pseudo-Nodes will be output with all the normal Nodes in the Container as their children.

    $rh_node_properties = $container->get_all_properties();
    $rh_node_properties = $container->get_all_properties( 1 );
    $rh_node_properties = $container->get_all_properties( 1, 1 );
    $rh_node_properties = $node->get_all_properties();
    $rh_node_properties = $node->get_all_properties( 1 );
    $rh_node_properties = $node->get_all_properties( 1, 1 );

This method returns a deep copy of all of the properties of this object as non-blessed Perl data structures. These data structures are also arranged in a tree, but they do not have any circular references. Each node in the returned tree, which represents a single Node or pseudo-Node, consists of an array-ref having 3 elements: a scalar having the Node type, a hash-ref having the Node attributes, and an array-ref having the child nodes (one per array element). For each Node, all attributes are output, including 'id', except for the 'pp' attribute, which is redundant; the value of the 'pp' attribute can be determined from an output Node's context, as it is equal to the 'id' of the parent Node. The main purpose, currently, of get_all_properties() is to make it easier to debug or test this class; it makes it easier to see at a glance whether the other class methods are doing what you expect. The output of this method should also be easy to serialize or deserialize to strings of Perl code or XML or other things, should you want to compare your results easily by string compare; see "get_all_properties_as_perl_str()" and "get_all_properties_as_xml_str()". If the optional boolean argument LINKS_AS_SI is true, then each Node ref attribute will be output as the target Node's surrogate id value, as returned by get_relative_surrogate_id( <attr-name>, WANT_SHORTEST ), if it has a valued surrogate id attribute; if the argument is false, or a Node doesn't have a valued surrogate id attribute, then its Node id will be output by default. The output can alternately be passed as the CHILDREN parameter (when first wrapped in an array-ref) to build_container() so that the method creates a clone of the original Container, where applicable; iff the output was generated using LINKS_AS_SI, then build_container() will need to be given a true MATCH_SURR_IDS argument (its other boolean args can all be false).
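As a rough illustration of the 3-element layout described above, the following stand-alone sketch walks a hand-written structure of that shape and recovers each Node's omitted 'pp' value from its position in the tree. The $tree literal is not real get_all_properties() output; its Node types and attribute values are made up for the example, and flatten_with_pp() is a hypothetical helper rather than part of Rosetta::Model.

```perl
use strict;
use warnings;

# Hand-written stand-in for one Node's output: [ type, attributes, children ].
my $tree = [ 'table', { 'id' => 1, 'si_name' => 'person' }, [
    [ 'table_field', { 'id' => 2, 'si_name' => 'person_id' }, [] ],
    [ 'table_field', { 'id' => 3, 'si_name' => 'name' }, [] ],
    [ 'table_index', { 'id' => 4, 'si_name' => 'primary' }, [] ],
] ];

# Flatten the tree into [ type, id, pp ] rows; each child's 'pp' is simply
# the 'id' of the Node it is nested under.
sub flatten_with_pp {
    my ($node, $parent_id) = @_;
    my ($type, $attrs, $children) = @{$node};
    my @rows = ( [ $type, $attrs->{'id'}, $parent_id ] );
    push @rows, flatten_with_pp( $_, $attrs->{'id'} ) for @{$children};
    return @rows;
}

for my $row ( flatten_with_pp( $tree ) ) {
    my ($type, $id, $pp) = @{$row};
    printf "%-12s id=%d pp=%s\n", $type, $id, defined $pp ? $pp : '(none)';
}
```

The same recursive shape works for any processing of the output, such as counting Nodes per type or comparing two trees element by element.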

    $perl_code_str = $container->get_all_properties_as_perl_str();
    $perl_code_str = $container->get_all_properties_as_perl_str( 1 );
    $perl_code_str = $container->get_all_properties_as_perl_str( 1, 1 );
    $perl_code_str = $node->get_all_properties_as_perl_str();
    $perl_code_str = $node->get_all_properties_as_perl_str( 1 );
    $perl_code_str = $node->get_all_properties_as_perl_str( 1, 1 );

This method is a wrapper for get_all_properties( LINKS_AS_SI ) that serializes its output into a pretty-printed string of Perl code, suitable for humans to read. You should be able to eval this string and produce the original structure.

    $xml_doc_str = $container->get_all_properties_as_xml_str();
    $xml_doc_str = $container->get_all_properties_as_xml_str( 1 );
    $xml_doc_str = $container->get_all_properties_as_xml_str( 1, 1 );
    $xml_doc_str = $node->get_all_properties_as_xml_str();
    $xml_doc_str = $node->get_all_properties_as_xml_str( 1 );
    $xml_doc_str = $node->get_all_properties_as_xml_str( 1, 1 );

This method is a wrapper for get_all_properties( LINKS_AS_SI ) that serializes its output into a pretty-printed string of XML, suitable for humans to read.

CONTAINER OR NODE FUNCTIONS AND METHODS FOR RAPID DEVELOPMENT

The following 6 "setter" functions and methods should assist more rapid development of code that uses Rosetta::Model, at the cost that the code would run a bit slower (Rosetta::Model has to search for info behind the scenes that it would otherwise get from you). These methods are implemented as wrappers over other Rosetta::Model methods, and allow you to accomplish with one method call what otherwise requires about 4-10 method calls, meaning your code base is significantly smaller (unless you implement your own simplifying wrapper functions, which is recommended in some situations).

Note that when a subroutine is referred to as a "function", it is stateless and can be invoked off of either a class name or class object; when a subroutine is called a "method", it can only be invoked off of Container or Node objects.

build_node( NODE_TYPE[, ATTRS] )

    my $nodeP = $model->build_node( 'catalog', { 'id' => 1, } );
    my $nodeN = $node->build_node( 'catalog', { 'id' => 1, } );

This method will create and return a new Node, that lives in the invocant Container or in the same Container as the invocant Node, whose type is specified in NODE_TYPE, and also set its attributes. The ATTRS argument is mainly processed by Node.set_attributes() if it is provided; a Node id ('id') can also be provided this way, or it will be generated if that is allowed. ATTRS may also contain a 'pp' if applicable. If ATTRS is defined but not a Hash ref, then this method will build a new one having a single element, where the value is ATTRS and the key is either 'id' or the new Node's surrogate id attribute name, depending on whether the value looks like a valid Node id.

build_child_node( NODE_TYPE[, ATTRS] )

This method is like build_node() except that it will set the new Node's primary parent to be the Node that this method was invoked on, using add_child_node(); if this method was invoked on a Container, then it will work only for new Nodes that would have a pseudo-Node as their primary parent. When creating a Node with this method, you do not set any PP candidates in ATTRS (if a 'pp' attribute is given in ATTRS, it will be ignored).

build_child_nodes( CHILDREN )

This method takes an array ref in its single CHILDREN argument, and calls build_child_node() for each element found in it; each element of CHILDREN must be an array ref whose elements correspond to the arguments of build_child_node(). This method does not return anything.

build_child_node_tree( NODE_TYPE[, ATTRS][, CHILDREN] )

This method is like build_child_node() except that it will recursively create all of the child Nodes of the new Node as well; build_child_node_trees( CHILDREN ) will be invoked following the first Node's creation (if applicable). In the context of Rosetta::Model, a "Node tree" or "tree" consists of one arbitrary Node and all of its "descendants". If invoked on a Container object, this method will recognize any pseudo-Node names given in 'NODE_TYPE' and simply move on to creating the child Nodes of that pseudo-Node, rather than throwing an error exception for an invalid Node type. Therefore, you can populate a whole Container with one call to this method. This method returns the root Node that it creates, if NODE_TYPE was a valid Node type; it returns the Container instead if NODE_TYPE is a pseudo-Node name.

build_child_node_trees( CHILDREN )

This method takes an array ref in its single CHILDREN argument, and calls build_child_node_tree() for each element found in it; each element of CHILDREN must be an array ref whose elements correspond to the arguments of build_child_node_tree(). This method does not return anything.

build_container([ CHILDREN[, AUTO_ASSERT[, AUTO_IDS[, MATCH_SURR_IDS]]] ])

When called with no arguments, this function is like new_container(), in that it will create and return a new Container object; if the array ref argument CHILDREN is set, then this function also behaves like build_child_node_trees( CHILDREN ) except that all of the newly built Nodes are held in the newly built Container. If any of the optional boolean arguments [AUTO_ASSERT, AUTO_IDS, MATCH_SURR_IDS] are true, then the corresponding flag properties of the new Container will be set to true prior to creating any Nodes. This function is almost the exact opposite of Container.get_all_properties(); you should be able to take the Array-ref output of Container.get_all_properties(), wrap that in another Array-ref (to produce a single-element list of Array-refs), give it to build_container(), and end up with a clone of the original Container.

INFORMATION FUNCTIONS

These "getter" functions are all intended for use by programs that want to dynamically interface with Rosetta::Model, especially those programs that will generate a user interface for manual editing of data stored in or accessed through Rosetta::Model constructs. They will allow such programs to continue working without many changes while Rosetta::Model itself continues to evolve. In a manner of speaking, these functions/methods let a caller program query as to what 'schema' or 'business logic' drive this class. These functions/methods are all deterministic and stateless; they can be used in any context and will always give the same answers from the same arguments, and no object properties are used. You can invoke them from any kind of object that Rosetta::Model implements, or straight off of the class name itself, like a 'static' method. All of these functions return the undefined value if they match nothing.

valid_enumerated_types([ ENUM_TYPE ])

This function by default returns a list of the valid enumerated types that Rosetta::Model recognizes; if the optional ENUM_TYPE argument is given, it just returns true if that matches a valid type, and false otherwise.

valid_enumerated_type_values( ENUM_TYPE[, ENUM_VALUE] )

This function by default returns a list of the values that Rosetta::Model recognizes for the enumerated type given in the ENUM_TYPE argument; if the optional ENUM_VALUE argument is given, it just returns true if that matches an allowed value, and false otherwise.

valid_node_types([ NODE_TYPE ])

This function by default returns a list of the valid Node Types that Rosetta::Model recognizes; if the optional NODE_TYPE argument is given, it just returns true if that matches a valid type, and false otherwise.

node_types_with_pseudonode_parents([ NODE_TYPE ])

This function by default returns a Hash ref where the keys are the names of the Node Types whose primary parents can only be pseudo-Nodes, and where the values name the pseudo-Nodes they are the children of; if the optional NODE_TYPE argument is given, it just returns the pseudo-Node for that Node Type.

node_types_with_primary_parent_attributes([ NODE_TYPE ])

This function by default returns a Hash ref where the keys are the names of the Node Types that have a primary parent ("pp") attribute, and where the values are the Node Types that values for that attribute must be; if the optional NODE_TYPE argument is given, it just returns the valid Node Types for the primary parent attribute of that Node Type. Since there may be multiple valid primary parent Node Types for the same Node Type, an array ref is returned that enumerates the valid types; often there is just one element; if the array ref has no elements, then all Node Types are valid.

valid_node_type_literal_attributes( NODE_TYPE[, ATTR_NAME] )

This function by default returns a Hash ref where the keys are the names of the literal attributes that Rosetta::Model recognizes for the Node Type given in the NODE_TYPE argument, and where the values are the literal data types that values for those attributes must be; if the optional ATTR_NAME argument is given, it just returns the literal data type for the named attribute.

valid_node_type_enumerated_attributes( NODE_TYPE[, ATTR_NAME] )

This function by default returns a Hash ref where the keys are the names of the enumerated attributes that Rosetta::Model recognizes for the Node Type given in the NODE_TYPE argument, and where the values are the enumerated data types that values for those attributes must be; if the optional ATTR_NAME argument is given, it just returns the enumerated data type for the named attribute.

valid_node_type_node_ref_attributes( NODE_TYPE[, ATTR_NAME] )

This function by default returns a Hash ref where the keys are the names of the Node-ref attributes that Rosetta::Model recognizes for the Node Type given in the NODE_TYPE argument, and where the values are the Node Types that values for those attributes must be; if the optional ATTR_NAME argument is given, it just returns the Node Type for the named attribute. Since there may be multiple valid Node Types for the same attribute, an array ref is returned that enumerates the valid types; often there is just one element; if the array ref has no elements, then all Node Types are valid.

valid_node_type_surrogate_id_attributes([ NODE_TYPE ])

This function by default returns a Hash ref where the keys are the names of the Node Types that have a surrogate id attribute, and where the values are the names of that attribute; if the optional NODE_TYPE argument is given, it just returns the surrogate id attribute for that Node Type. Note that a few Node types don't have distinct surrogate id attributes; for those, this method will return 'id' as the surrogate id attribute name.

DEPENDENCIES

This module requires any version of Perl 5.x.y that is at least 5.8.1.

It also requires the Perl modules version and only, which would conceptually be built-in to Perl, but aren't, so they are on CPAN instead.

It also requires the Perl modules Scalar::Util and List::Util, which would conceptually be built-in to Perl, but are bundled with it instead.

It also requires the Perl module List::MoreUtils '0.12-', which would conceptually be built-in to Perl, but isn't, so it is on CPAN instead.

It also requires these modules that are on CPAN: Locale::KeyedText '1.6.0-' (for error messages).

INCOMPATIBILITIES

None reported.

SEE ALSO

perl(1), Rosetta::Model::L::en, Rosetta::Language, Rosetta::EnumTypes, Rosetta::NodeTypes, Locale::KeyedText, Rosetta, Rosetta::Utility::SQLBuilder, Rosetta::Utility::SQLParser, Rosetta::Engine::Generic, Rosetta::Emulator::DBI, DBI, SQL::Statement, SQL::Parser, SQL::Translator, SQL::YASP, SQL::Generator, SQL::Schema, SQL::Abstract, SQL::Snippet, SQL::Catalog, DB::Ent, DBIx::Abstract, DBIx::AnyDBD, DBIx::DBSchema, DBIx::Namespace, DBIx::SearchBuilder, TripleStore, Data::Table, and various other modules.

BUGS AND LIMITATIONS

This module is currently in alpha development status, meaning that some parts of it will be changed in the near future, some perhaps in incompatible ways; however, I believe that any further incompatible changes will be small. The current state is analogous to 'developer releases' of operating systems; it is reasonable to begin writing code that uses this module now, but you should be prepared to maintain it later in keeping with API changes. This module also does not yet have full code coverage in its tests, though the most commonly used areas are covered.

You cannot use surrogate id values that look like valid Node ids (that are positive integers) since some methods won't do what you expect when given such values. Nodes having such surrogate id values won't be matched by values passed to set_attribute(), directly or indirectly. That method only tries to look up a Node by its surrogate id if its argument doesn't look like a Node ref or a Node id. Similarly, the build*() methods will decide whether to interpret a defined but non-Node-ref ATTRS argument as a Node id or a surrogate id based on its looking like a valid Node id or not. You should rarely encounter this caveat, though, since you would never use a number as a "SQL identifier" in normal cases, and that is only technically possible with a "delimited SQL identifier".
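If you want to guard against this caveat in your own code, a check along the following lines may help. It is a minimal sketch that assumes "looks like a valid Node id" means "is a plain positive integer", as the paragraph above suggests; the module's actual test may differ in edge cases, and looks_like_node_id() is a hypothetical helper, not a Rosetta::Model function.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Rosetta::Model: returns 1 if VALUE is a
# plain positive integer (and so could be mistaken for a Node id), else 0.
sub looks_like_node_id {
    my ($value) = @_;
    return defined($value) && !ref($value) && $value =~ m/\A[1-9][0-9]*\z/ ? 1 : 0;
}

for my $candidate ( '42', 'person', 'fk_father' ) {
    printf "%-10s %s\n", $candidate,
        looks_like_node_id( $candidate )
            ? 'would be read as a Node id'
            : 'safe to use as a surrogate id';
}
```

Running such a check over proposed surrogate id values before building Nodes lets you reject or rename the problematic ones early.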

AUTHOR

Darren R. Duncan (perl@DarrenDuncan.net)

LICENCE AND COPYRIGHT

This file is part of the Rosetta database portability library.

Rosetta is Copyright (c) 2002-2005, Darren R. Duncan. All rights reserved. Address comments, suggestions, and bug reports to perl@DarrenDuncan.net, or visit http://www.DarrenDuncan.net/ for more information.

Rosetta is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (GPL) as published by the Free Software Foundation (http://www.fsf.org/); either version 2 of the License, or (at your option) any later version. You should have received a copy of the GPL as part of the Rosetta distribution, in the file named "GPL"; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.

Linking Rosetta statically or dynamically with other modules is making a combined work based on Rosetta. Thus, the terms and conditions of the GPL cover the whole combination. As a special exception, the copyright holders of Rosetta give you permission to link Rosetta with independent modules, regardless of the license terms of these independent modules, and to copy and distribute the resulting combined work under terms of your choice, provided that every copy of the combined work is accompanied by a complete copy of the source code of Rosetta (the version of Rosetta used to produce the combined work), being distributed under the terms of the GPL plus this exception. An independent module is a module which is not derived from or based on Rosetta, and which is fully useable when not linked to Rosetta in any form.

Any versions of Rosetta that you modify and distribute must carry prominent notices stating that you changed the files and the date of any changes, in addition to preserving this original copyright notice and other credits. Rosetta is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

While it is by no means required, the copyright holders of Rosetta would appreciate being informed any time you create a modified version of Rosetta that you are willing to distribute, because that is a practical way of suggesting improvements to the standard version.

ACKNOWLEDGEMENTS

Besides myself as the creator ...

* 2004.05.20 - Thanks to Jarrell Dunson (jarrell_dunson@asburyseminary.edu) for inspiring me to add some concrete SYNOPSIS documentation examples to Rosetta::Model, which demonstrate actual SQL statements that can be generated from parts of a model, when he wrote me asking for examples of how to use Rosetta::Model.

* 2005.03.21 - Thanks to Stevan Little (stevan@iinteractive.com) for feedback towards improving Rosetta::Model's documentation, particularly towards using a much shorter SYNOPSIS, so that it is easier for newcomers to understand the module at a glance, and not be intimidated by large amounts of detailed information. Also thanks to Stevan for introducing me to Scalar::Util::weaken(); by using it, Rosetta::Model objects can be garbage collected normally despite containing circular references, and users no longer need to invoke destructor methods.