NAME
    Bio::DB::Das::Chado - DAS-style access to a chado database

SYNOPSIS
      # Open up a feature database
      $db    = Bio::DB::Das::Chado->new(
                 -dsn  => 'dbi:Pg:dbname=gadfly;host=lajolla',
                 -user => 'jimbo',
                 -pass => 'supersecret',
               );

      @segments = $db->segment(-name  => '2L',
                               -start => 1,
                               -end   => 1000000);

      # segments are Bio::Das::SegmentI - compliant objects

      # fetch a list of features
      @features = $db->features(-type=>['type1','type2','type3']);

      # invoke a callback over features
      $db->features(-type=>['type1','type2','type3'],
                    -callback => sub { ... }
                    );

      # get all feature types
      @types   = $db->types;

      # count types
      %types   = $db->types(-enumerate=>1);

      @feature = $db->get_feature_by_name($class=>$name);
      @feature = $db->get_feature_by_target($target_name);
      @feature = $db->get_feature_by_attribute($att1=>$value1,$att2=>$value2);
      $feature = $db->get_feature_by_id($id);

      $error = $db->error;

DESCRIPTION
    Chado is the GMOD database schema, and a chado database is a specific
    instance of it. The schema is still somewhat of a moving target, so this
    package will probably require updates from time to time to keep it
    working.

FEEDBACK
  Mailing Lists
    User feedback is an integral part of the evolution of this and other
    GMOD modules. Send your comments and suggestions preferably to one of
    the GMOD mailing lists. Your participation is much appreciated.

      gmod-gbrowse@lists.sourceforge.net

  Reporting Bugs
    Report bugs to the GMOD bug tracking system at SourceForge to help us
    keep track of the bugs and their resolution.

      http://sourceforge.net/tracker/?group_id=27707&atid=391291

AUTHOR - Scott Cain
    Email scain@cpan.org

LICENSE
    This software may be redistributed under the same license as perl.

APPENDIX
    The rest of the documentation details each of the object methods.
    Internal methods are usually prefixed with an underscore (_).

  new
     Title   : new
     Usage   : $db    = Bio::DB::Das::Chado->new(
                                -dsn  => 'dbi:Pg:dbname=gadfly;host=lajolla',
                                -user => 'jimbo',
                                -pass => 'supersecret',
                                           );

     Function: Open up a Bio::DB::DasI interface to a Chado database
     Returns : a new Bio::DB::Das::Chado object
     Args    :

  use_all_feature_names
      Title   : use_all_feature_names
      Usage   : $obj->use_all_feature_names()
      Function: set or return flag indicating that all_feature_names view is present
      Returns : 1 if all_feature_names present, 0 if not
      Args    : to return the flag, none; to set, 1

  organism_id
      Title   : organism_id
      Usage   : $obj->organism_id()
      Function: set or return the organism_id
      Returns : the value of the id
      Args    : to return the id, none; to set, the common name of the organism

    If -organism is set when the Chado object is instantiated, this method
    queries the database with the common name to cache the organism_id.

  inferCDS
      Title   : inferCDS
      Usage   : $obj->inferCDS()
      Function: set or return the inferCDS flag
      Returns : the value of the inferCDS flag
      Args    : to return the flag, none; to set, 1

    Often, chado databases will be populated without CDS features, since
    they can be inferred from a union of exons and polypeptide features.
    Setting this flag tells the adaptor to do the inference to get those
    derived CDS features (at some small performance penalty).

  allow_obsolete
      Title   : allow_obsolete
      Usage   : $obj->allow_obsolete()
      Function: set or return the allow_obsolete flag
      Returns : the value of the allow_obsolete flag
      Args    : to return the flag, none; to set, 1

    The chado feature table has a flag column called 'is_obsolete'.
    Normally, features with this flag set should be ignored by GBrowse, but
    the allow_obsolete method is provided to allow displaying obsolete
    features.

  sofa_id
      Title   : sofa_id 
      Usage   : $obj->sofa_id()
      Function: determine or return the ID to use for SO terms
      Returns : the cv.cv_id for the SO ontology to use
      Args    : to return the id, none; to determine the id, 1

  recursivMapping
      Title   : recursivMapping
      Usage   : $obj->recursivMapping($newval)
      Function: Flag for activating the recursive mapping (deactivated by default)
      Returns : value of recursivMapping (a scalar)
      Args    : on set, new value (a scalar or undef, optional)

      Goal : When a clone is mapped onto a chromosome, recursive mapping maps the features of the clone onto the chromosome.

  srcfeatureslice
      Title   : srcfeatureslice
      Usage   : $obj->srcfeatureslice
      Function: Flag for activating the use of a feature slice
      Returns : value of srcfeatureslice
      Args    : on set, new value (a scalar or undef, optional)
      Desc    : Allows the use of a feature slice of the form featureloc_slice(srcfeat_id, int, int)
      Important : this and recursivMapping are mutually exclusive

  do2Level
      Title   : do2Level
      Usage   : $obj->do2Level
      Function: Flag for activating the fetching of two levels in segment->features
      Returns : value of do2Level
      Args    : on set, new value (a scalar or undef, optional)

  dbh
      Title   : dbh
      Usage   : $obj->dbh($newval)
      Function: get or set the database handle
      Returns : value of dbh (a scalar)
      Args    : on set, new value (a scalar or undef, optional)

  term2name
      Title   : term2name
      Usage   : $obj->term2name($newval)
      Function: When called with a hashref, sets cvterm.cvterm_id to cvterm.name 
                mapping hashref; when called with an int, returns the name
                corresponding to that cvterm_id; called with no arguments, returns
                the hashref.
      Returns : see above
      Args    : on set, a hashref; to retrieve a name, an int; to retrieve the
                hashref, none.

    Note: should be replaced by Bio::GMOD::Util->term2name
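
    The three call styles described above can be pictured with a small
    stand-alone sketch. It is not the module's actual code: the class name
    TermMapSketch and the cvterm_id values are made up for illustration,
    and only the dispatch on argument type is meant to mirror the
    description.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not the real Bio::DB::Das::Chado code.
package TermMapSketch;

sub new {
    my ($class) = @_;
    return bless { term2name => {} }, $class;
}

# Three call styles, mirroring the description above:
#   hashref  -> store the cvterm_id => name map
#   integer  -> look up the name for one cvterm_id
#   no args  -> return the whole map
sub term2name {
    my ($self, $arg) = @_;
    return $self->{term2name} unless defined $arg;
    if (ref $arg eq 'HASH') {
        $self->{term2name} = $arg;
        return $arg;
    }
    return $self->{term2name}{$arg};
}

package main;

my $map = TermMapSketch->new;
$map->term2name({ 219 => 'gene', 220 => 'mRNA' });    # made-up ids
print $map->term2name(219), "\n";                      # single lookup
print join(',', sort keys %{ $map->term2name }), "\n"; # whole map
```

    A name2term accessor would follow the same shape with keys and values
    swapped.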

  name2term
      Title   : name2term
      Usage   : $obj->name2term($newval)
      Function: When called with a hashref, sets cvterm.name to cvterm.cvterm_id
                mapping hashref; when called with a string, returns the cvterm_id
                corresponding to that name; called with no arguments, returns
                the hashref.
      Returns : see above
      Args    : on set, a hashref; to retrieve a cvterm_id, a string; to retrieve
                the hashref, none.

    Note: Should be replaced by Bio::GMOD::Util->name2term

  segment
     Title   : segment
     Usage   : $db->segment(@args);
     Function: create a segment object
     Returns : segment object(s)
     Args    : see below

    This method generates a Bio::Das::SegmentI object (see
    Bio::Das::SegmentI). The segment can be used to find overlapping
    features and the raw sequence.

    When making the segment() call, you specify the ID of a sequence
    landmark (e.g. an accession number, a clone or contig), and a positional
    range relative to the landmark. If no range is specified, then the
    entire region spanned by the landmark is used to generate the segment.

    Arguments are -option=>value pairs as follows:

     -name         ID of the landmark sequence.

     -class        A namespace qualifier.  It is not necessary for the
                   database to honor namespace qualifiers, but if it
                   does, this is where the qualifier is indicated.

     -version      Version number of the landmark.  It is not necessary for
                   the database to honor versions, but if it does, this is
                   where the version is indicated.

     -start        Start of the segment relative to landmark.  Positions
                   follow standard 1-based sequence rules.  If not specified,
                   defaults to the beginning of the landmark.

     -end          End of the segment relative to the landmark.  If not specified,
                   defaults to the end of the landmark.

    The return value is a list of Bio::Das::SegmentI objects. If the method
    is called in a scalar context and no more than one segment satisfies
    the request, then it is allowed to return the segment. Otherwise, the
    method must throw a "multiple segment exception".
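
    The context rule above could be sketched as follows. The helper name
    return_segments is hypothetical, plain strings stand in for segment
    objects, and a bare die stands in for whatever exception mechanism the
    module actually uses.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what to hand back for a list of matching
# segments, following the scalar/list rule described above.
sub return_segments {
    my @segments = @_;
    return @segments if wantarray;                 # list context: all matches
    die "multiple segment exception\n" if @segments > 1;
    return $segments[0];                           # undef when nothing matched
}

my @all = return_segments('seg_a', 'seg_b');       # list context
my $one = return_segments('seg_a');                # scalar context, one match

# Scalar context with two matches: the exception is trapped with eval.
my $ok    = eval { my $s = return_segments('seg_a', 'seg_b'); 1 };
my $error = $ok ? '' : $@;
```

    Callers that cannot rule out several matches may prefer list context,
    which never raises the exception in this sketch.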

  features
     Title   : features
     Usage   : $db->features(@args)
     Function: get all features, possibly filtered by type
     Returns : a list of Bio::SeqFeatureI objects
     Args    : see below
     Status  : public

    This routine will retrieve features in the database regardless of
    position. It can be used to return all features, or a subset based on
    their type.

    Arguments are -option=>value pairs as follows:

      -type       List of feature types to return.  Argument is an array
                  of Bio::Das::FeatureTypeI objects or a set of strings
                  that can be converted into FeatureTypeI objects.

      -callback   A callback to invoke on each feature.  The subroutine
                  will be passed each Bio::SeqFeatureI object in turn.

      -attributes A hash reference containing attributes to match.

    The -attributes argument is a hashref containing one or more attributes
    to match against:

      -attributes => { Gene => 'abc-1',
                       Note => 'confirmed' }

    Attribute matching is simple exact string matching, and multiple
    attributes are ANDed together.

    If one provides a callback, it will be invoked on each feature in turn.
    If the callback returns a false value, iteration will be interrupted.
    When a callback is provided, the method returns undef.
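
    The attribute matching and callback rules above can be illustrated
    without a database. In this sketch plain hashes stand in for
    Bio::SeqFeatureI objects, and the names features_sketch, matches and
    @pool are assumptions made for the example, not part of the module.

```perl
use strict;
use warnings;

# Plain hashes stand in for feature objects (an assumption that keeps
# the sketch self-contained).
my @pool = (
    { name => 'f1', attr => { Gene => 'abc-1', Note => 'confirmed' } },
    { name => 'f2', attr => { Gene => 'abc-1', Note => 'predicted' } },
    { name => 'f3', attr => { Gene => 'abc-1', Note => 'confirmed' } },
);

# Exact string match on every requested attribute (logical AND).
sub matches {
    my ($feature, $wanted) = @_;
    for my $key (keys %$wanted) {
        my $value = $feature->{attr}{$key};
        return 0 unless defined $value && $value eq $wanted->{$key};
    }
    return 1;
}

# Without a callback: return the matching features as a list.
# With a callback: call it per feature and stop on a false return.
sub features_sketch {
    my (%args) = @_;
    my @hits = grep { matches($_, $args{-attributes} || {}) } @pool;
    return @hits unless $args{-callback};
    for my $feature (@hits) {
        last unless $args{-callback}->($feature);
    }
    return;    # undef in scalar context
}

my @confirmed = features_sketch(
    -attributes => { Gene => 'abc-1', Note => 'confirmed' },
);

# The callback returns false after the second feature, ending iteration.
my @seen;
features_sketch(
    -attributes => { Gene => 'abc-1' },
    -callback   => sub { push @seen, $_[0]{name}; return @seen < 2 },
);
```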

  types
     Title   : types
     Usage   : $db->types(@args)
     Function: return list of feature types in database
     Returns : a list of Bio::Das::FeatureTypeI objects
     Args    : see below

    This routine returns a list of feature types known to the database. It
    is also possible to find out how many times each feature occurs.

    Arguments are -option=>value pairs as follows:

      -enumerate  if true, count the features

    The returned value will be a list of Bio::Das::FeatureTypeI objects (see
    Bio::Das::FeatureTypeI).

    If -enumerate is true, then the function returns a hash (not a hash
    reference) in which the keys are the stringified versions of
    Bio::Das::FeatureTypeI and the values are the number of times each
    feature appears in the database.
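
    The two return shapes described above look roughly like this. The
    sketch uses made-up type names and plain strings instead of
    Bio::Das::FeatureTypeI objects; types_sketch is a hypothetical helper,
    not the module's method.

```perl
use strict;
use warnings;

# Made-up type names, one entry per feature.
my @feature_types = qw(gene mRNA exon exon exon mRNA);

# Hypothetical helper: distinct types as a list, or type => count
# pairs (a hash, not a hash reference) when -enumerate is true.
sub types_sketch {
    my (%args) = @_;
    my %count;
    $count{$_}++ for @feature_types;
    return %count if $args{-enumerate};
    return sort keys %count;
}

my @types = types_sketch();
my %types = types_sketch(-enumerate => 1);
```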

    NOTE: This currently raises a "not-implemented" exception, as the BioSQL
    API does not appear to provide this functionality.

  get_feature_by_alias, get_features_by_alias
     Title   : get_features_by_alias
     Usage   : $db->get_feature_by_alias(@args)
     Function: return list of features whose name or synonyms match
     Returns : a list of Bio::Das::Chado::Segment::Feature objects
     Args    : See below

    This method finds features matching the criteria outlined by the
    supplied arguments. Wildcards (*) are allowed. Valid arguments are:

    -name
    -class
    -ref (reference sequence)
    -start
    -end

  get_feature_by_name, get_features_by_name
     Title   : get_features_by_name
     Usage   : $db->get_features_by_name(@args)
     Function: return list of features whose names match
     Returns : a list of Bio::Das::Chado::Segment::Feature objects
     Args    : See below

    This method finds features matching the criteria outlined by the
    supplied arguments. Wildcards (*) are allowed. Valid arguments are:

    -name
    -class
    -ref (reference sequence)
    -start
    -end

  _by_alias_by_name
     Title   : _by_alias_by_name
     Usage   : $db->_by_alias_by_name(@args)
     Function: return list of features whose names match
     Returns : a list of Bio::Das::Chado::Segment::Feature objects
     Args    : See below

    A private method that implements the get_features_by_name and
    get_features_by_alias methods. It accepts the same args as those
    methods, plus an additional one (-operation), which is either 'by_alias'
    or 'by_name' to indicate which rule to use for finding features.

  srcfeature2name
    returns a srcfeature name given a srcfeature_id

  gff_source_db_id
      Title   : gff_source_db_id
      Function: caches the chado db_id from the chado db table

  gff_source_dbxref_id
    Gets dbxref_id for features that have a gff source associated

  dbxref2source
    returns the source (string) when given a dbxref_id

  source_dbxref_list
     Title   : source_dbxref_list
     Usage   : $dbxref_id_list = $db->source_dbxref_list()
     Function: Gets a list of all dbxref_ids that are used for GFF sources
     Returns : a comma delimited string that is a list of dbxref_ids
     Args    : none
     Status  : public

    This method queries the database for all dbxref_ids that are used to
    store GFF source terms.

  search_notes
     Title   : search_notes
     Usage   : $db->search_notes($search_term,$max_results)
     Function: full-text search on features, ENSEMBL-style
     Returns : an array of [$name,$description,$score]
     Args    : see below
     Status  : public

    This routine performs a full-text search on feature attributes (which
    attributes depend on implementation) and returns a list of
    [$name,$description,$score], where $name is the feature ID (accession?),
    $description is a human-readable description such as a locus line, and
    $score is the match strength.

  ** NOT YET ACTIVE: search_notes IS IN TESTING STAGE **
    sub search_notes {
      my $self = shift;
      my ($search_string,$limit) = @_;
      my $limit_str;
      if (defined $limit) {
        $limit_str = " LIMIT $limit ";
      } else {
        $limit_str = "";
      }

      # so here's the plan:
      # if there is only 1 word, do 1-3
      #  1. search for accessions like $string.'%'--if any are found, quit and return them
      #  2. search for feature.name like $string.'%'--if found, keep and continue
      #  3. search somewhere in analysis like $string.'%'--if found, keep and continue
      # if there is more than one word, don't search accessions
      #  4. search each word anded together like '%'.$string.'%' --if found, keep and continue
      #  5. search somewhere in analysis like '%'.$string.'%'

      # $self->dbh->trace(1);

      my @search_str = split /\s+/, $search_string;
      my $qsearch_term = $self->dbh->quote($search_str[0]);
      my $like_str = "( (dbx.accession ~* $qsearch_term OR \n"
            ."           f.name        ~* $qsearch_term) ";
      for (my $i=1;$i<(scalar @search_str);$i++) {
        $qsearch_term = $self->dbh->quote($search_str[$i]);
        $like_str .= "and \n";
        $like_str .= "          (dbx.accession ~* $qsearch_term OR \n"
                    ."           f.name        ~* $qsearch_term) ";
      } 
      $like_str .= ")";

      my $sth = $self->dbh->prepare("
         select dbx.accession,f.name,0 
         from feature f, dbxref dbx, feature_dbxref fd
         where
            f.feature_id = fd.feature_id and
            fd.dbxref_id = dbx.dbxref_id and 
            $like_str 
         $limit_str
        ");
      $sth->execute or $self->throw("couldn't execute keyword query");

      my @results;
      while (my ($acc, $name, $score) = $sth->fetchrow_array) {
        $score = sprintf("%.2f",$score);
        push @results, [$acc, $name, $score];
      }
      $sth->finish;
      return @results;
    }

  attributes
     Title   : attributes
     Usage   : @attributes = $db->attributes($id,$name)
     Function: get the "attributes" on a particular feature
     Returns : an array of strings
     Args    : feature ID [, attribute name]
     Status  : public

    This method is intended as a "work-alike" to Bio::DB::GFF's attributes
    method, which returns the following:

    Called in list context, it returns a list. If called in a scalar
    context, it returns the first value of the attribute if an attribute
    name is provided, otherwise it returns a hash reference in which the
    keys are attribute names and the values are anonymous arrays containing
    the values.
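
    The context-dependent return values can be sketched as below. The
    attribute store and the name attributes_sketch are made up for the
    example, and the list-context result when no name is given (name and
    array reference pairs) is an assumption, since the text above only
    says that a list is returned.

```perl
use strict;
use warnings;

# Made-up attribute store for a single feature.
my %attr = ( Note => [ 'confirmed', 'curated' ], Gene => ['abc-1'] );

sub attributes_sketch {
    my ($name) = @_;
    if (defined $name) {
        my @values = @{ $attr{$name} || [] };
        # list context: all values; scalar context: the first value
        return wantarray ? @values : $values[0];
    }
    # No name: copy the store so callers cannot modify it in place.
    my %copy = map { ($_, [ @{ $attr{$_} } ]) } keys %attr;
    return wantarray ? %copy : \%copy;   # scalar context: hash reference
}

my @notes = attributes_sketch('Note');   # list context, named attribute
my $first = attributes_sketch('Note');   # scalar context, named attribute
my $all   = attributes_sketch();         # scalar context, no name
```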

  _segclass
     Title   : _segclass
     Usage   : $class = $db->_segclass
     Function: returns the perl class that we use for segment() calls
     Returns : a string containing the segment class
     Args    : none
     Status  : reserved for subclass use

  chado_reference_class
      Title   : chado_reference_class 
      Usage   : $obj->chado_reference_class()
      Function: determine or return the ID to use for the GBrowse map reference
                class, using the cvtermprop table (value = MAP_REFERENCE_TYPE)
      Returns : the cvterm.name 
      Args    : to return the id, none; to determine the id, 1
      See also: default_class, refclass_feature_id

      Optionally test that user/config supplied ref class is indeed a proper
      chado feature type.
  
  refclass_feature_id
     Title   : refclass_feature_id
     Usage   : $self->refclass_feature_id()
     Function: Stores the feature_id of the reference class feature being worked on (e.g. contig, supercontig).
               With this feature_id, requests can be filtered so that only features located on
               the reference class feature are extracted.
     Returns : A scalar
     Args    : The feature_id on setting

LEFTOVERS FROM BIO::DB::GFF NEEDED FOR DAS
    These methods should probably be declared in an interface class that
    Bio::DB::GFF implements. For instance, the aggregator methods could be
    described in Bio::SeqFeature::AggregatorI.

END LEFTOVERS