
Rosetta::Overview - Rosetta compared to DBI

CAVEAT: This document's contents are old, and some details out of date or even woefully incorrect, but it will only be replaced in the indetermined future; meanwhile, many parts and details are still just fine.
This document is an overview of Rosetta DBMS framework, mainly done in terms of comparing and contrasting it with the Perl DBI.
Details of the architecture, such as data structures, allowed inputs, and functional APIs, are discussed in other documents.
Rosetta provides a common API for access to relational databases of all shapes and sizes, including those implemented with libraries embedded in your application (such as SQLite) and those implemented with a client-server architecture (such as Postgres, MySQL or Oracle).
The API is rigorously defined, such that an application written to it should easily port to alternative relational database engines with minimal effort.
This might seem strange to somebody who has not tried to port between databases before, especially given that the Perl DBI purports to provide "Database Independence". However, the level of DBI's provided independence is Database Driver Independence, and not Database Language Independence. To further demonstrate the difference, it is useful to compare the DBI and Rosetta.
Rosetta and the Perl DBI have a lot in common:
Each back-end would usually map the framework API to a pre-existing native API for a particular database manager (such as with PostgreSQL), or it would sometimes implement the framework API itself (such as for CSV files).
DBI calls these back-ends Drivers, since they sit on top of and control the database manager (following a hardware device driver analogy), while Rosetta calls its back-ends Engines, to emphasize that they are effectively its implementations, and be more abstract that they may not be controlling something else.
An application using either to access a database should be able to do all the same useful things they can do using the native API for that database.
Both take care of lower level transport matters and some cross-database portability issues, so that alternate APIs layered on top of them can focus on their added value instead.
With Rosetta,
any meta-data is stored using a database's native mechanisms (such as what CREATE TABLE can specify).
A point of difference here is that if you prefer the database schema to be the "slave",
the object version of the database schema can be stored anywhere else at all - including another database or space in the database.
Having described how similar DBI and Rosetta are, we should now talk about the differences.
Rosetta has a much higher emphasis on cross-database portability than DBI does. It achieves this through a more highly abstracted API, which also allows for a greater level of transformations of queries and schemas.
The most prominant example of this is that DBI does transparent pass-thru of SQL statements, such that the database receives the literal SQL string that the application provides, perhaps with a few limited string substitutions as provided for by the Driver. The application must know about and code to each database manager's proprietary SQL dialect, not the least of which includes the names of data types, formats of dates, and the syntax of outer joins.
By contrast, Rosetta will take its inputs as or process its inputs into an abstract syntax tree (AST) that represents the meaning (and has corresponding structure) to one or more standard SQL:2003 statements, and each Rosetta back-end will translate this into the database manager's proprietary SQL version (assuming a SQL engine is being used). In this way, the application only has to know *what* it wants to tell the database, and not how to express that in the database's native dialect.
That being the summary difference, other differences follow. Rosetta:
These might be converted to positional placeholders when queries are run - it really doesn't matter. This abstraction alone saves a common DBI abstraction implemented by many DBI programmers from being necessary.
This is meant to be a fall-back "circumvention" option for savvy developers that understand the consequences, so that no one actually loses functionality they had with DBI when using Rosetta. This is akin to putting CPU-specific assembly language in your code, and is better as a short term solution rather than a long-term one.
For example, strings are passed around in Unicode, and dates are dealt with in the standard ISO-8601 format natively supported by most Perl Date & Time modules.
Rosetta internally uses native format numbers, such as string coded decimals, which have exact precision and are effectively of unlimited length. They won't lose accuracy in conversion for large values like floating point numbers. Rosetta will return numbers as Perl strings and coerce any input to such as well.
Rosetta natively supports binary data, which it receives and returns exactly as is.
By contrast, DBI passes-thru all data in the way the database sent it, which the application has to know about and specify, and doesn't guarantee particular formats between databases such that would help with portability.
By contrast, the DBI API has some native functions to return some schema info as hashes and arrays, and requires the application to encode SQL that requests the database send most other details in its native format, such as string SQL or information schema rows; the results are heterogeneous.
Rosetta can be implemented as an alternate API over DBI (and the reverse is also true) whose main contribution is SQL generation and parsing, and limited data munging, but Rosetta can also be used independently of DBI; the choice is up to the Rosetta Engine implementer.
The choice between using DBI and using Rosetta seems to be analogous to the choice between the C and Ruby programming languages, respectively, where each database product is analogous to a hardware CPU architecture or wider hardware platform.
The DBI is great for people who like working as close to the metal as possible, with much closer access to (or less abstraction from) each database product's native way of doing things, those who want to talk to their database in its native SQL dialect. Rosetta is more high level, for those who want the write-once run-anywhere experience, less of a burden on their creativity, more development time saving features.
Rosetta has several algorithmic differences from the DBI that can impact performance, so the net performance of your program depends on how you use each of them. Assuming it is used properly (which shouldn't be too difficult), Rosetta can deliver a net performance that is similar to the DBI, so your programs should not become slower when using it, and they may even become faster. For example:

Go to Rosetta for the majority of distribution-internal references, and Rosetta::SeeAlso for the majority of distribution-external references.

Darren R.
Duncan (perl@DarrenDuncan.net)
Some editorial assistance from Sam Vilain (samv@cpan.org).

This file is part of the Rosetta DBMS framework.
Rosetta is Copyright (c) 2002-2006, Darren R. Duncan.
See the LICENCE AND COPYRIGHT of Rosetta for details.

The ACKNOWLEDGEMENTS in Rosetta apply to this file too.