Version 0.1019.
DBD::File is an abstract general purpose driver for treating files as database tables and designed to be subclassed rather than used directly. DBD::CSV is a subclass of DBD::File for working with files that typically store one row per text line with fields separated with a comma, semicolon or tab character, typically called CSV files.
DBD::File
DBD::CSV
The driver author is Jochen Wiedmann. He can be contacted via the dbi-users mailing list.
The DBD::File driver works with the SQL::Statement module, version 0.1011 or later. This module is a simple SQL parser and evaluator. In particular it is restricted to single table queries. Table joins are not supported.
SQL::Statement
The DBD::CSV driver internally uses the Text::CSV_XS module, version 0.16 or later, for reading and writing CSV files.
Text::CSV_XS
It's important to note that while just about everyone thinks they know what the CSV file format is, there is actually no formal definition of the format and there are many subtle differences.
The DBI->connect() Data Source Name, or DSN, can be one of the following:
DBI->connect()
dbi:CSV: dbi:CSV:attrs
where attrs is an optional semicolon-separated list of key=value pairs. Note that you must not use dbi:File:, as the DBD::File driver is an abstract superclass and not usable by itself.
attrs
dbi:File:
Known attributes include:
By default files in the current directory are treated as tables. The attribute f_dir makes the module open files in the given directory.
These attributes are used for describing the CSV file format in use. For example, to open /etc/passwd, which is colon-separated and line-feed terminated, as a table, one would use:
csv_eol=\n;csv_sep_char=:
The defaults are \r\n, comma (,), double-quote ("), and double-quote (") respectively. All of these attributes and defaults are inherited from the Text::CSV_XS module.
\r\n
,
"
Without question, the main disadvantage of the DBD::CSV module is the lack of appropriate type handling. While reading a CSV table you have no way to reliably determine the correct data type of the fields. All fields look like strings and are treated as such by default.
The SQL::Statement module, and hence the DBD::CSV driver, accepts the numeric types INTEGER and REAL in CREATE TABLE statements, but they are always stored as strings and, by default, retrieved as strings.
CREATE TABLE
It is possible to read individual columns as integers or doubles, in which case they are converted to Perl's internal data types IV and NV, integer and numeric value respectively. Unsigned values are not supported.
To assign certain data types to columns, you have to create metadata definitions. The following example reads a table table_name with columns I, N, and P of type integer, double, and string, respectively:
my $dbh = DBI->connect("DBI:CSV:"); $dbh->{csv_tables}->{table_name}->{types} = [ C<Text::CSV_XS>::IV(), C<Text::CSV_XS>::NV(), C<Text::CSV_XS>::PV() ]; # Note, we assume a certain order of I, N and P! my $sth = $dbh->prepare("SELECT * FROM foo");
Similar to numeric values, DBD::CSV accepts more data types in CREATE TABLE statements than it really supports. You can use CHAR(n) and VARCHAR(n) with arbitrary numbers n, BLOB, or TEXT, but in fact these are always BLOBs, in a loose kind of way.
The one underlying string type can store any binary data including embedded NUL characters. However, many other CSV tools may choke if given such data.
No date or time types are directly supported.
BLOBs are equivalent to strings. They are only limited in size by available memory.
The type_info_all() method is supported and returns the types VARCHAR, CHAR, INTEGER, REAL, BLOB and TEXT.
type_info_all()
The driver doesn't support transactions.
No explicit locks are supported. Tables are locked while statements are executed, but the lock is immediately released once the statement is completed.
You can only retrieve table data. It is not possible to select constants.
Table joins are not supported.
Table and column names are case sensitive. However, you should consider that table names are in fact file names, so tables Foo and foo may both be present with the same data. However, they may be subject to different metadata definitions in $dbh->{'csv_tables'}.
$dbh->{'csv_tables'}
See "Table Metadata" for more details on table and column names.
Two different LIKE operators are supported. LIKE is case sensitive, whereas CLIKE is not.
Row IDs are not supported.
Neither automatic keys nor sequences are supported.
Neither automatic row numbering nor row count limitations are supported.
Question marks are supported as placeholders, as in:
$dbh->do("INSERT INTO A VALUES (?, ?)", undef, $id, $name);
The :1 placeholder style is not supported.
:1
Stored procedures are not supported.
By default the driver expects the column names being stored in the tables first row, as in:
login:password:uid:gid:comment:shell:homedir root:s34hj34n34jh:0:0:Superuser:/bin/bash:/root
If column names are not present, you may specifiy column names via:
$dbh->{csv_tables}->{$table}->{skip_rows} = 0; $dbh->{csv_tables}->{$table}->{col_names} = [qw(login password uid gid comment shell homedir)];
in which case the first row is handled as a data row.
If column names are not supplied and not read from the first row, the names col0, col1, ... are generated automatically. Column names can be retrieved via the standard $sth->{NAME} attribute.
$sth->{NAME}
The NULLABLE attribute returns an array of all ones. Other metadata attributes are not supported.
The table names, or file names, can be read via $dbh->table_info() or $dbh->tables() as usual.
$dbh->table_info()
$dbh->tables()
Besides the attributes f_dir, csv_eol, csv_sep_char, csv_quote_char and csv_sep_char that have already been discussed above, the most important database handle attribute is:
$dbh->{csv_tables}
csv_tables is used for specifying table metadata. It is an hash ref with table names as keys, the values being hash refs with the following attributes:
csv_tables
The file name being associated to the table. By default, the file name is $dbh->{f_dir}/$table.
$dbh->{f_dir}/$table
An array ref of column names.
This number of rows will be read from the top of the file before reading the table data, and the first of those will be treated as an array of column names. However, the col_names attribute takes precedence.
This is an array ref of the Text::CSV_XS type values for the corresponding columns. Three types are supported and their values are defined by the IV(), NV(), and PV() functions in the Text::CSV_XS package.
IV()
NV()
PV()
There are no driver specific statement handle attributes and no private methods for either type of handle.
Positioned updates and deletes are not supported.
The statement handle attributes PRECISION, SCALE, and TYPE are not supported.
Also note that many statement attributes cease to be available after fetching all the result rows or calling the finish() method.
finish()
http://www.whatis.com/csvfile.htm
The number of database handles is limited by memory only. It is recommended to use multiple database handles for different table formats.
There are no limitations on the use of multiple statement handles from the same $dbh.
$dbh
The driver is believed to be completely thread safe.
To install dbd-csv, copy and paste the appropriate command in to your terminal.
cpanm
cpanm dbd-csv
CPAN shell
perl -MCPAN -e shell install dbd-csv
For more information on module installation, please visit the detailed CPAN module installation guide.