$ perl hivq.PL
hivq> query C[subtype] SI[phenotype]
hivq> prerun
80 sequences returned
Query: C[subtype] SI[phenotype]
hivq> outfile csi.fas
hivq> run
Download complete.
hivq> outfile dsi.fas
hivq> run D[subtype] SI[phenotype]
Download complete.
hivq> count
25 sequences returned
Query: D[subtype] SI[phenotype]
hivq> exit
$
The BioPerl modules Bio::DB::HIV and Bio::DB::Query::HIVQuery together allow batch queries against the Los Alamos National Laboratory (LANL) HIV Sequence Database using a simple query language.
hivq.PL serves both as an example of how to use these modules and as a standalone interactive command-line interface to the LANL HIV DB. Simple commands allow the user to retrieve HIV sequences and annotations using the query language implemented in Bio::DB::Query::HIVQuery. See the man pages for those modules for more details.
Run the script using perl hivq.PL or, in Unix, ./hivq.PL. You will see the hivq> prompt. Type commands with queries to retrieve sequence and annotation data. See the SYNOPSIS for a sample session. Available commands are described below.
The LANL database is pretty complex and extensive. Use the find facility to explore the available database tables and fields. To identify aliases for a particular field, use find alias [fieldname]. For example, to find a short alias for the weirdly named field seq_sample.ssam_second_receptor, enter
hivq> find alias seq_sample.ssam_second_receptor
The aliases reported include coreceptor. Now, instead of the following query
hivq> run C[subtype] CCR5[seq_sample.ssam_second_receptor]
you know you can do
hivq> run C[subtype] CCR5[coreceptor]
Use the outfile command to set the file that receives the retrieved sequences. You can change the current output file simply by issuing a new outfile command during the session. The output file defaults to standard output.
Use the query command to validate a query without hitting the database. Use the prerun or count commands to get a count of sequence hits for a query without retrieving the data. Use run or do to perform a complete query, retrieving sequence data into the currently set output file.
To run hivq.PL commands in batch, create a text file (hivq.cmd, for example) containing the desired commands, one per line. Then execute the following from the shell:
$ cat hivq.cmd | perl hivq.PL
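As a rough sketch of the batch workflow described above, the command file can itself be generated from the shell. The example below reuses the queries and file names from the sample session (csi.fas, dsi.fas). The subtype list is only illustrative, and the sketch assumes that hivq.PL does not stop to ask for confirmation when it reads commands from standard input.

```shell
#!/bin/sh
# Build a command file that sends each subtype's SI sequences to its own file.
# The subtype list and file naming are illustrative, taken from the sample session.
cmdfile=hivq.cmd
: > "$cmdfile"                                  # start with an empty command file

for subtype in C D; do
    lc=$(printf '%s' "$subtype" | tr 'A-Z' 'a-z')   # C -> c, D -> d
    printf 'outfile %ssi.fas\n' "$lc" >> "$cmdfile"
    printf 'run %s[subtype] SI[phenotype]\n' "$subtype" >> "$cmdfile"
done
printf 'exit\n' >> "$cmdfile"

# Feed the commands to hivq.PL only if the script is in the current directory.
# Redirecting standard input is equivalent to the "cat hivq.cmd | perl hivq.PL" form.
if [ -f hivq.PL ]; then
    perl hivq.PL < "$cmdfile"
fi
```

Generating the file in a loop keeps each outfile command next to the run command it applies to, which matters because the output file stays in effect until the next outfile command.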
Here is a complete list of commands. Options in single brackets ([req_option]) are required; options in double brackets ([[opt_option]]) are optional.
 confirm            : Toggle interactive confirmation before
                      executing queries
 exit               : Quit script
 find               : Explore database schema
    find tables          Display all database tables
    find fields          Display all database fields (columns)
    find fields [table]  Display all fields in [table]
    find alias [field]   Display valid aliases for [field]
 help [[command]]   : Show command help
                      if [[command]] not specified, list all
                      available commands
 id                 : Display current session id
 outfile [filename] : Set file for collecting retrieved data
 ping               : Check if LANL DB is available
 prerun [[query]]   : Execute query but retrieve hit count only
                      if [[query]] not specified, use current query
 query [query]      : Validate and set current query
 run [[query]]      : Execute query and retrieve data
                      if [[query]] not specified, use current query
 state              : Display current state of the script

 bye                : Alias for 'exit'
 config             : Alias for 'state'
 count              : Alias for 'prerun'
 do                 : Alias for 'run'
 out                : Alias for 'outfile'
 quit               : Alias for 'exit'
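When commands are kept in a file for batch use, a quick check against the list above can catch typos before anything is sent to the database. The helper below is a hypothetical sketch, not part of hivq.PL: the function name check_hivq_cmds is made up for illustration, and it assumes that command names are lowercase and that the list above is exhaustive.

```shell
#!/bin/sh
# check_hivq_cmds FILE
# Hypothetical helper: report lines whose first word is not a listed hivq
# command or alias. Returns 0 if every non-blank line starts with a known one.
check_hivq_cmds() {
    file=$1
    # Commands and aliases copied from the command list above.
    known=" confirm exit find help id outfile ping prerun query run state bye config count do out quit "
    status=0
    lineno=0
    # read splits off the first word; the || clause handles a final line
    # that lacks a trailing newline.
    while read -r cmd rest || [ -n "$cmd" ]; do
        lineno=$((lineno + 1))
        [ -z "$cmd" ] && continue              # skip blank lines
        case "$known" in
            *" $cmd "*) ;;                     # listed command or alias
            *) echo "line $lineno: unknown command '$cmd'" >&2
               status=1 ;;
        esac
    done < "$file"
    return "$status"
}
```

A possible use is check_hivq_cmds hivq.cmd && perl hivq.PL < hivq.cmd, so that the script only runs when the file passes the check. The helper checks command names only; query syntax is still validated by hivq.PL itself.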
-v : verbose; turns on the internal debug() function
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.
email@example.com                     - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
Mark A. Jensen <firstname.lastname@example.org>