
NAME

Bio::SAGE::DataProcessing::Filter - An abstract filter for determining whether a [di]tag is worth keeping.

SYNOPSIS

  use Bio::SAGE::DataProcessing::Filter;
  $filter = Bio::SAGE::DataProcessing::Filter->new();

DESCRIPTION

This module encapsulates an abstract filtering procedure that is used during library processing with Bio::SAGE::DataProcessing. For example, a concrete implementation might indicate a tag is not worth keeping because the Phred scores are too low.

INSTALLATION

Included with Bio::SAGE::DataProcessing.

PREREQUISITES

This module requires the Bio::SAGE::DataProcessing package.

CHANGES

  1.10 2004.06.19 - Initial release.
  0.01 2004.05.02 - prototype

VARIABLES

Globals

    $PROTOCOL_SAGE

      Hashref containing protocol parameters for the
      regular/original SAGE protocol (see set_protocol
      documentation for more information).

    $PROTOCOL_LONGSAGE

      Hashref containing protocol parameters for the
      LongSAGE protocol (see set_protocol documentation
      for more information).

Settings

    $DEBUG = 0

      Prints debugging output if value is >= 1.

CLASS METHODS

new [$arg1,$arg2,...]

Constructor for a new Bio::SAGE::DataProcessing::Filter object.

Arguments

$arg1,$arg2,... (optional)

  Any arguments can be specified.  These are stored in
  the 'args' hash element (i.e. $self->{'args'}).  Concrete
  subclasses must call this constructor explicitly from
  within their constructor.

    i.e. $class->SUPER::new( @_ );

  The required parameters are dependent on the
  concrete implementation of a Filter.

Usage

  Not explicitly called.

INSTANCE METHODS

is_valid $sequence, <\@scores>

This method must be implemented by the developer in a concrete subclass. The contract of this method is to return a boolean value indicating whether the tag is valid or not.

The subclass implementation should always work for cases where the \@scores argument is not provided (i.e. the argument is undefined).

Arguments

$sequence

  The tag sequence.

\@scores (optional)

  An arrayref to scores for this tag (it should be
  assumed that the quality scores for the leading
  anchoring enzyme site nucleotides are included).

Usage

  my $filter = Bio::SAGE::DataProcessing::Filter->new();
  if( $filter->is_valid( "AAAAAAAAAA" ) ) {
      print "VALID!\n";
  }
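
A concrete subclass might look something like the following minimal sketch. The package names Local::AbstractFilter and Local::MinPhredFilter, the min_phred field, and the default threshold of 20 are all hypothetical and not part of this distribution. Local::AbstractFilter is only a stand-in so the sketch runs without the module installed; the layout it gives 'args' is an assumption. In real code the subclass would inherit from Bio::SAGE::DataProcessing::Filter instead.

```perl
use strict;
use warnings;

# Stand-in for the abstract base class (assumption: it only needs to
# store the constructor arguments). Replace with
# Bio::SAGE::DataProcessing::Filter when the distribution is installed.
package Local::AbstractFilter;

sub new {
    my ( $class, @args ) = @_;
    return bless { args => [@args] }, $class;
}

# Hypothetical concrete filter: rejects a tag if any score is too low.
package Local::MinPhredFilter;
our @ISA = ('Local::AbstractFilter');

sub new {
    my ( $class, $min_phred ) = @_;
    # Call the base constructor explicitly, as the documentation requires.
    my $self = $class->SUPER::new($min_phred);
    $self->{min_phred} = defined $min_phred ? $min_phred : 20;
    return $self;
}

sub is_valid {
    my ( $self, $sequence, $scores ) = @_;
    return 0 unless defined $sequence && length $sequence;
    # The contract requires this to work when no scores are supplied.
    return 1 unless defined $scores;
    for my $q (@$scores) {
        return 0 if $q < $self->{min_phred};
    }
    return 1;
}

package main;

my $filter = Local::MinPhredFilter->new(20);
print $filter->is_valid("AAAAAAAAAA") ? "VALID\n" : "INVALID\n";
print $filter->is_valid( "AAAAAAAAAA", [ (30) x 9, 10 ] )
    ? "VALID\n"
    : "INVALID\n";
```

The first call passes no scores and is accepted; the second is rejected because one score (10) falls below the threshold.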

compare $scores1, $scores2

This method determines which set of scores is "better" (defined by the implementation).

This method can be overridden by the developer in a subclass. The default method chooses the scores that have the highest cumulative sum.

Arguments

$scores1,$scores2

  Space-separated strings of Phred scores (for example,
  "20 20 25 12 35").

Returns

  Returns <0 if the first scores are best, >0 if
  the second scores are best, and 0 if the two
  score sets are equivalent.

Usage

  my $filter = Bio::SAGE::DataProcessing::Filter->new();
  my $res = $filter->compare( "20 20 20", "40 40 40" );
  if( $res < 0 ) {
    print "First set is better.\n";
  }
  if( $res > 0 ) { # this would be the result in this example
    print "Second set is better.\n";
  }
  if( $res == 0 ) {
    print "Both sets are equivalent.\n";
  }
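
The documented default behaviour (prefer the score set with the highest cumulative sum) could be sketched as a standalone function. This is an illustration of the documented contract, not the module's actual source; the name compare_by_sum is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: compares two space-separated score strings by
# their cumulative sums. Returns <0 if the first set is better, >0 if
# the second set is better, and 0 if the sums are equal.
sub compare_by_sum {
    my ( $scores1, $scores2 ) = @_;
    # The leading 0 keeps sum() defined for an empty score string.
    my $sum1 = sum( 0, split ' ', $scores1 );
    my $sum2 = sum( 0, split ' ', $scores2 );
    return $sum2 <=> $sum1;
}

my $res = compare_by_sum( "20 20 20", "40 40 40" );    # sums: 60 vs 120
print $res < 0 ? "First set is better.\n"
    : $res > 0 ? "Second set is better.\n"
    :            "Both sets are equivalent.\n";
```

A subclass that overrides compare with a different notion of "better" (for example, the highest minimum score) would need to keep the same sign convention.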

COPYRIGHT

Copyright(c)2004 Scott Zuyderduyn <scottz@bccrc.ca>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

Scott Zuyderduyn <scottz@bccrc.ca> BC Cancer Research Centre

VERSION

  1.20

SEE ALSO

  Bio::SAGE::DataProcessing(1).

TODO

  Nothing yet.