Bio::Polloc::GroupCriteria - Rules to group loci
Takes loci and returns groups of loci based on certain rules.
If created via .bme (.cfg) files,
it is defined in the
[ RuleGroup ] and
[ GroupExtension ] namespaces.
Email lmrodriguezr at gmail dot com
This package is licensed under the Artistic License - see LICENSE.txt
Methods provided by the package
Bio::Polloc::Polloc::Error if unexpected input or undefined condition, source or target
The stored loci can also be obtained with
but this function ensures a consistent order in the loci for its evaluation.
The index (int, mandatory).
A Bio::Polloc::LocusI object or undef.
This is a lazy method,
and should be used ONLY after
get_loci() has been called at least once.
Otherwise, the order might not be the expected one,
and unexpected results may appear.
hash or string with
-key => value pairs.
Supported values are:
Searches the flanking regions in the target sequence.
Extension in number of residues upstream of the feature.
Extension in number of residues downstream of the feature.
Should I detect the proper strand? Otherwise, the stored strand is trusted. This is useful for non-directed features like repeats, whose context is actually directed.
Include all detected features (even those overlapping with input features).
Should I include the feature region in the search? 0 by default.
Number of Standard Deviations (SD) tolerated as half of the range of lengths for a feature.
The average (Avg) and the standard deviation of the length are calculated from all the stored features,
and Avg+(SD*lensd) is taken as the largest possible length of a new feature.
No minimum length constraint is given,
unless explicitly set with -minlen.
This argument is ignored if
-maxlen is explicitly set.
Default is 1.5.
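The length ceiling described above can be sketched as follows. This is a minimal Python illustration of the arithmetic only, not the module's Perl code. The function name max_feature_length is hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator is used), and so is the rule that a non-zero maxlen overrides lensd.

```python
import statistics


def max_feature_length(lengths, lensd=1.5, maxlen=0):
    """Return the largest accepted length for a new feature.

    Hypothetical helper: an explicit, non-zero maxlen takes precedence,
    mirroring the statement that -lensd is ignored if -maxlen is set.
    """
    if maxlen:
        return maxlen
    avg = statistics.mean(lengths)
    sd = statistics.pstdev(lengths)  # assumption: population SD
    return avg + sd * lensd


# Lengths of the stored features (made-up numbers for illustration).
stored = [100, 110, 90]
limit = max_feature_length(stored)            # Avg + SD * 1.5
capped = max_feature_length(stored, maxlen=500)  # explicit ceiling wins
```

With a tight set of stored lengths the ceiling stays close to the average; a larger lensd widens the tolerated range.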
Maximum length of a new feature in number of residues.
If zero (0) evaluates
Default is 0.
Minimum length of a new feature in number of residues. Default is 0.
Minimum fraction of similarity to include a found region. 0.8 by default.
Should I consider features with only one of the sides? Takes effect only if both -upstream and -downstream are defined. 0 by default.
Minimum score for either algorithm (blast or hmmer). 20 by default.
Minimum percentage a residue must appear in order to include it in the consensus used as query. 60 by default. Only if -algorithm blast.
0.1 by default.
program used (
blastn by default.
Bio::Polloc::Polloc::Error if unexpected input,
A Bio::Polloc::LociGroup object containing the updated group, i.e. the original group PLUS the extended features.
Bio::Polloc::Polloc::Error if unexpected input or weird extension definition.
If true, calculates the complete matrix instead of only the bottom-left triangle.
A reference to a boolean 2-dimensional array (bottom-left triangle only)
WARNING! The order of the output is not always the same as that of the input.
as source features MUST be after target features in the array.
it is not possible to have the full picture without building the full matrix (instead of half).
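A bottom-left triangular boolean matrix of this kind can be sketched in Python as below. This is an illustration of the data layout under stated assumptions, not the module's implementation: the function name build_lower_triangle and the predicate related are hypothetical, the diagonal is assumed to be omitted, and row i is assumed to hold the comparisons of item i (as source) against the earlier items (as targets).

```python
def build_lower_triangle(items, related):
    """Build a boolean matrix holding only the bottom-left triangle.

    Row i contains one entry per earlier item j < i, so the source item
    always comes after the target item in the input list.
    """
    matrix = []
    for i in range(len(items)):
        # Compare item i only with the items that precede it.
        matrix.append([bool(related(items[i], items[j])) for j in range(i)])
    return matrix


# Toy data: two numbers are "related" when they differ by at most 1.
values = [1, 2, 10, 11]
triangle = build_lower_triangle(values, lambda a, b: abs(a - b) <= 1)
```

Because only half of the comparisons are stored, a predicate that is not symmetric would need the full matrix to be described completely.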
A matrix as returned by Bio::Polloc::GroupCriteria->build_bin
A 2-D arrayref.
This method is intended to build groups while providing information on all-vs-all comparisons. If you do not need this information, use the much more efficient Bio::Polloc::GroupCriteria->build_groups method, which relies on the transitive property of groups to avoid unnecessary comparisons. Please note that this function also relies on transitivity, but gives you the option to examine all the paired comparisons and even write your own grouping function.
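For readers who want to write their own grouping function over such a matrix, the sketch below shows one way to derive groups from a bottom-left triangular boolean matrix by assuming transitivity. It is written in Python for illustration and uses a union-find structure, which is a technique chosen here and not necessarily what the module does; the name groups_from_matrix and the matrix layout (row i holds comparisons against items 0..i-1) are assumptions.

```python
def groups_from_matrix(matrix):
    """Group item indices linked directly or transitively in the matrix."""
    n = len(matrix)
    parent = list(range(n))  # each item starts in its own group

    def find(x):
        # Follow parent links to the group representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, row in enumerate(matrix):
        for j, linked in enumerate(row):
            if linked:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Items 0 and 1 are linked, as are items 2 and 3.
example = [[], [True], [False, False], [False, False, True]]
result = groups_from_matrix(example)  # [[0, 1], [2, 3]]
```

Items end up in the same group whenever a chain of positive comparisons connects them, even if their direct comparison was negative.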
attempts to distribute the work among the specified number of cores.
Warning: This parameter is experimental,
and relies on
It can be used in production with some confidence,
but it will quite probably NOT run in parallel (to avoid errors,
this method ignores the request at ANY possible error).
Unimplemented: This argument is currently ignored. Some algorithmic considerations must be addressed before using it. TODO.
A reference to a function to call at every new pair. The function is called with three arguments, the first is the index of the first locus, the second is the index of the second locus and the third is the total number of loci. Note that this function is called BEFORE running the comparison.
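The calling convention of such a callback can be sketched as follows. This Python illustration is not the module's code: the names compare_all_pairs, related and advance are hypothetical, and for simplicity the loop visits every pair, whereas a transitivity-based method would skip some of them. The index order passed to the callback (later locus first) is also an assumption.

```python
def compare_all_pairs(loci, related, advance=None):
    """Compare every pair of loci, reporting progress before each one."""
    n = len(loci)
    results = {}
    for i in range(1, n):
        for j in range(i):
            if advance is not None:
                # Called BEFORE the comparison: both indices and the total.
                advance(i, j, n)
            results[(i, j)] = bool(related(loci[i], loci[j]))
    return results


# A callback that simply records the pairs it was notified about.
seen = []
outcome = compare_all_pairs(
    [1, 2, 3],
    lambda a, b: abs(a - b) <= 1,
    advance=lambda i, j, n: seen.append((i, j, n)),
)
```

A callback of this shape is typically used to drive a progress indicator, since the total number of loci is available at every call.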
An arrayref of Bio::Polloc::LociGroup objects, each containing one consistent group of loci.
This method is faster than combining
and it should be used whenever transitivity can be freely assumed and you do not need the all-vs-all matrix for further evaluation (for example,
locigroup()->genomes(), but is read-only.
Methods intended to be used only within the scope of Bio::Polloc::*
extendfunction has been called.
All the following arguments are mandatory and must be passed in that order. The strand will be determined by the relative position of from/to:
A Bio::Seq object.
This method should be located at a higher hierarchy module (Root?).
This method is static.
A Bio::SimpleAlign object
A 2D arrayref,
where the first key is incremental and the second key preserves the order in the structure:
2D matrix of integers (arrayref)
2D matrix of Bio::Polloc::LocusI objects (ref)