The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=head1 GenBank HOWTO 

This is a quick synopsis of the steps needed to initialize a GBrowse
database from a genbank record.  For the purposes of illustration, we
will use the RefSeq record for M. bovis, accession NC_002945.

=head1 Using the GBrowse in-memory database 

=head2 1. Convert from Genbank format into GFF format

First download the Genbank record. Then you can create a GFF version
of the file easily using the bp_genbank2gff3.pl script, which is part of
bioperl:

   bp_genbank2gff3.pl NC_002945

This command will create a file called NC_002945.gff.

The newly-converted file will be in GFF3 format, which combines feature data
with sequence/DNA data.  This means that you do not need a separate
FASTA file for the sequence.

=head2 2. Install the GFF file into the databases directory

Copy this file into your in-memory GFF databases directory, as
described in the tutorial.  We will assume
/usr/local/apache/htdocs/gbrowse/databases.

  mkdir /usr/local/apache/htdocs/gbrowse/databases/mbovis
  chmod o+rwx /usr/local/apache/htdocs/gbrowse/databases/mbovis
  cp NC_002945.gff /usr/local/apache/htdocs/gbrowse/databases/mbovis

=head2 3. Set up the configuration file

Use the configuration file 08.genbank.conf as your starting template.
This is located in contrib/conf_files:

  cp contrib/conf_files/08.genbank.conf /usr/local/apache/conf/gbrowse.conf/mb.conf

=head2 4. Edit the configuration file as appropriate

You will need to change the [GENERAL] section to use the in-memory
adaptor and to point to the location of the M. bovis GFF file:

 [GENERAL]
 description   = Mycobacterium Bovis In-Memory
 db_adaptor    = Bio::DB::GFF
 db_args       = -adaptor memory
	              -dir /usr/local/apache/htdocs/gbrowse/databases/mbovis

You might also want to change the "examples" tag to introduce the
accession number for the whole genome, and a few choice gene names and
search terms:

  examples = NC_002945 Mb1800 galT glucose

That is all there is to it, but since this is a pretty big chunk of DNA
(> 4 Mbp), it uses a considerable amount of memory and performance
will be sluggish unless you have a fast machine with lots of memory.
So you might wish to view it using a MySQL, PostgreSQL or Oracle
database.  The following are instructions for doing this.

=head1 Using the GBrowse in-memory database

We will assume that you are using a MySQL database.

=head2 1. Create the database

Create the database using mysqladmin:

  mysqladmin create mbovis

As described in the tutorial, give yourself write permission for the
database, and give the web server user (e.g. "nobody") select
permission.

=head2 2. Load the GFF3 into the database

You can load the GFF3 into your Mysql database using the bp_bulk_load_gff.pl
script from Bioperl:

 bp_bulk_load_gff.pl -d mbovis NC_002945.gff

=head2 3. Set up the configuration file

Use the configuration file 08.genbank.conf as your starting template.
This is located in contrib/conf_files:

  cp contrib/conf_files/08.genbank.conf /usr/local/apache/conf/gbrowse.conf/mb.conf

=head2 4. Edit the configuration file as appropriate

You will need to change the [GENERAL] section to use the appropriate
database adaptor:

 [GENERAL]
 description   = Mycobacterium Bovis Database
 db_adaptor    = Bio::DB::GFF
 db_args       = -adaptor dbi::mysql
	              -dsn     dbi:mysql:database=mbovis;host=localhost
                 -user    nobody
		           -passwd  ""

You might also want to change the "examples" tag to introduce the
accession number for the whole genome, and a few choice gene names and
search terms:

  examples = NC_002945 Mb1800 galT glucose

That should be it!

=head2 NOTE

You can load as many accessions into the database as you like.
Each one will appear as a "chromosome" named after the accession
number of the entry.