NAME

Bio::Grep::Backend::RE - Perl Regular Expression back-end

SYNOPSIS

  use Bio::Grep;
  
  my $sbe = Bio::Grep->new('RE');
  
  $sbe->settings->datapath('data');
  
  # Generate a database. You only have to do this once.
  $sbe->generate_database({ 
    file        => 'ATH1.cdna', 
    description => 'AGI Transcripts',
    datapath    => 'data',
  });
  
  # search on both strands  
  # retrieve up- and downstream regions of size 30
  
  $sbe->search({
    query              => 'GAGCCCTT',
    direct_and_rev_com => 1, 
    upstream           => 30,
    downstream         => 30,
    database           => 'ATH1.cdna',
  });
  
  my @internal_ids;
  
  # output the search results with nice alignments
  while ( my $res = $sbe->next_res) {
     print $res->sequence->id . "\n";
     print $res->mark_subject_uppercase() . "\n";
     print $res->alignment_string() . "\n\n";
     push @internal_ids, $res->sequence_id;
  }
  
  # get the complete sequences as a Bio::SeqIO object
  my $seq_io = $sbe->get_sequences(\@internal_ids);

  # sequences with at least 10 As
  $sbe->search({ query => '[A]{10,}' });
 
  # some SNPs
  $sbe->search({query => '[CG]TGC[AT]CTCTTCT[CG]TCA'});

DESCRIPTION

Bio::Grep::Backend::RE searches for a query with a Perl Regular Expression.

Internally, it pre-compiles the specified regex (with the modifiers i, m, s and x appended), matches it against every line in the database using the g modifier in a loop, and reads the match positions from the offset arrays @- and @+ (that is, $-[0] and $+[0]). The substr function is then used to extract the sequences.
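
The matching loop described above can be sketched in plain Perl as follows. This is a minimal illustration, not the module's actual code: find_hits is a hypothetical helper, and the subject string is made up.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Grep: find every match of $query in
# $subject and return the start offset, end offset and matched text per hit.
sub find_hits {
    my ($query, $subject) = @_;

    # Pre-compile once, with the i, m, s and x modifiers appended.
    my $regex = qr/$query/imsx;

    my @hits;
    while ( $subject =~ /$regex/g ) {
        # $-[0] and $+[0] hold the start and end offsets of the last match.
        my ($start, $end) = ($-[0], $+[0]);
        push @hits, {
            start => $start,
            end   => $end,
            match => substr($subject, $start, $end - $start),
        };
    }
    return @hits;
}

my @hits = find_hits('[CG]TGC', 'aaCTGCttGTGCaa');
printf "%d-%d %s\n", @{$_}{qw(start end match)} for @hits;
```

The offsets are zero-based and the end offset points one past the last matched character, which is why the substr length is end minus start.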

This back-end does not perform any sanity checks on the regular expressions, so do NOT expose this back-end through a web service.

METHODS

See Bio::Grep::Backend::BackendI for inherited methods.

CONSTRUCTOR

Bio::Grep::Backend::RE->new()

This method constructs an RE back-end object and should not be used directly. Rather, a back-end should be constructed by the main class Bio::Grep:

  my $sbe = Bio::Grep->new('RE');

PACKAGE METHODS

$sbe->available_sort_modes()

Returns all available sort modes as a hash. Keys are sort modes, values are short descriptions.

   $sbe->sort('ga');

Available sort modes in RE:

                ga  : 'ascending order of dG'
                gd  : 'descending order of dG'

Note that 'ga' and 'gd' require that search results have dG set. Bio::Grep::RNA ships with filters for free energy calculation. Also note that these two sort options require that we load all results in memory.
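
To illustrate why both modes need every result in memory, here is a small sketch of sorting by dG. It does not use the Bio::Grep API: plain hashes stand in for search results, sort_results is a hypothetical helper, and the ids and dG values are invented.

```perl
use strict;
use warnings;

# Plain hashes stand in for search results here; the ids and dG values are
# made up for illustration.
my @results = (
    { id => 'seq1', dG => -12.4 },
    { id => 'seq2', dG => -20.1 },
    { id => 'seq3', dG =>  -3.7 },
);

# 'ga': ascending dG; 'gd': descending dG.
my %sorter = (
    ga => sub { $_[0]{dG} <=> $_[1]{dG} },
    gd => sub { $_[1]{dG} <=> $_[0]{dG} },
);

# Hypothetical helper: the whole list must be available before sorting.
sub sort_results {
    my ($mode, @res) = @_;
    my $cmp = $sorter{$mode} or die "unknown sort mode '$mode'";
    return sort { $cmp->($a, $b) } @res;
}

print join(' ', map { $_->{id} } sort_results('ga', @results)), "\n";
print join(' ', map { $_->{id} } sort_results('gd', @results)), "\n";
```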

IMPORTANT NOTES

Code Quality

BETA RELEASE!

reverse_complement

reverse_complement (and direct_and_rev_com) are supported, but only for plain DNA/RNA queries, not for regular expressions.

Regular Expression modifiers

The i, m, s and x modifiers are added to the regex.
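
The x modifier in particular changes how a query is read: unescaped whitespace is ignored and # starts a comment. The following sketch shows the effect with plain Perl regexes; the subject strings are invented for the example.

```perl
use strict;
use warnings;

my $subject = 'ttgagcccttaa';

# With x, the blank in the query is ignored; with i, case is ignored too.
my $spaced = 'GAG CCC';
print "imsx:  ", ( $subject =~ qr/$spaced/imsx ? "match" : "no match" ), "\n";

# Without x and i, the same string has to match literally, blank included.
print "plain: ", ( $subject =~ qr/$spaced/ ? "match" : "no match" ), "\n";

# To match a literal blank under x, escape it or put it in a character class.
my $literal = 'GAG[ ]CCC';
print "class: ", ( 'GAG CCC' =~ qr/$literal/imsx ? "match" : "no match" ), "\n";
```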

RNA

Be careful with RNA sequences: U is not the same as T in this back-end!
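
One possible workaround, assuming the database holds DNA and the query is a plain sequence rather than a regular expression, is to rewrite U as T before searching. rna_to_dna is a hypothetical helper and not part of Bio::Grep.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a plain RNA query for a database stored as DNA.
# Not safe for arbitrary regexes, where 'U' may be part of an escape sequence.
sub rna_to_dna {
    my ($query) = @_;
    (my $dna = $query) =~ tr/Uu/Tt/;
    return $dna;
}

my $subject = 'AAGAGCCCTTAA';    # made-up DNA sequence
my $rna     = 'GAGCCCUU';

print "RNA query: ", ( $subject =~ /$rna/i ? "hit" : "no hit" ), "\n";
my $dna = rna_to_dna($rna);
print "DNA query: ", ( $subject =~ /$dna/i ? "hit" : "no hit" ), "\n";
```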

maxhits

When maxhits is defined, the sliding window stops as soon as maxhits hits have been found.
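
A minimal sketch of such an early exit, for a single subject string. match_positions is a hypothetical helper; whether the real back-end counts hits per sequence or across the whole database is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper: collect match start offsets, stopping at $maxhits.
sub match_positions {
    my ($regex, $subject, $maxhits) = @_;
    my @starts;
    while ( $subject =~ /$regex/g ) {
        push @starts, $-[0];
        # Stop scanning as soon as the limit is reached.
        last if defined $maxhits && @starts >= $maxhits;
    }
    return @starts;
}

my $re = qr/A{3}/imsx;
print join(',', match_positions($re, 'AAATTAAATTAAA')),    "\n";
print join(',', match_positions($re, 'AAATTAAATTAAA', 2)), "\n";
```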

Database

Bio::Grep::Backend::RE databases are compatible with Bio::Grep::Backend::Agrep databases.

DIAGNOSTICS

See Bio::Grep::Backend::BackendI for other diagnostics.

While doing a reverse-complement the query does not look like a...

Either reverse_complement or direct_and_rev_com is set and the query does not match the regular expression m{\A [gactu]+ \z}xmsi. Bio::Root::BadParameter.
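
The check above can be reproduced in plain Perl. The sketch below validates a query with the same regular expression and then builds its reverse complement; reverse_complement here is a hypothetical stand-alone function, it dies with a plain string instead of throwing Bio::Root::BadParameter, and mapping u to a in the output is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse-complement a plain DNA/RNA query, or die if
# the query contains anything besides g, a, c, t or u.
sub reverse_complement {
    my ($query) = @_;
    die "Query does not look like a DNA/RNA sequence: $query\n"
        if $query !~ m{\A [gactu]+ \z}xmsi;

    my $rc = reverse $query;    # scalar context: reverses the string
    # Assumption: u is complemented to a, and a to t (DNA alphabet output).
    $rc =~ tr/ACGTUacgtu/TGCAAtgcaa/;
    return $rc;
}

print reverse_complement('GAGCCCTT'), "\n";

# A regular expression is rejected because of the brackets.
my $ok = eval { reverse_complement('[CG]TGC'); 1 };
print "rejected: $@" if !$ok;
```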

Query not defined.

You forgot to define $sbe->settings->query. Bio::Root::BadParameter.

SEE ALSO

Bio::Grep::Backend::BackendI, Bio::Grep::SearchSettings, Bio::SeqIO, Bio::Index::Fasta

AUTHOR

Markus Riester, <mriester@gmx.de>

LICENSE AND COPYRIGHT

Copyright (C) 2007-2009 by M. Riester.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENSE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.