Set::IntSpan::Island - extension for Set::IntSpan to handle islands, holes and covers
use Set::IntSpan::Island; # inherits normal behaviour from Set::IntSpan $set = Set::IntSpan::Island->new( $set_spec ); # special pair input creates a span a-b $set = Set::IntSpan::Island->new( $a,$b ); # equivalent to $set->cardinality($another_set)->size; if ($set->overlap( $another_set )) { ... } # distance between spans is negative if spans overlap, positive if not $distance = $set->distance( $another_set ); # remove islands whose size is smaller than $minsize $new_set = $set->excise( $minsize ); # remove islands whose size is found in the set $sizes_set, $new_set = $set->excise( $sizes_set ); # all islands sized <= 10 removed $new_set = $set->excise( Set::IntSpan( "(-10" ) ); # all islands sized >= 10 removed $new_set = $set->excise( Set::IntSpan( "10-)" ) ); # all islands of size between 2-5 removed $new_set = $set->excise( Set::IntSpan( "2-5" ) ); # remove islands larger than $maxlength $set = $set->excise_large( $minlength ); # fill holes up to $maxsize $set = $set->fill( $maxsize ); # fill holes whose size is found in the set $sizes_set $set = $set->fill( $sizes_set); # all holes sizes <= 10 filled $set = $set->fill( Set::IntSpan( "(-10" ) ); # all holes sizes >= 10 filled $set = $set->fill( Set::IntSpan( "10-)" ) ); # all holes sizes 2-5 filled $set = $set->fill( Set::IntSpan( "2-5" ) ); # return a set composed of islands of $set that overlap $another_set $set = $set->find_island( $another_set ); # return a set composed of the nearest non-overlapping island(s) to $another_set $set = $set->nearest_island( $another_set ); # construct a list of covers by exhaustively intersecting all sets @covers = Set::IntSpan::Island->extract_covers( { id1=>$set1, id2=>set2, ... } ); for $cover (@covers) { ($coverset,@ids) = ($cover->[0], @{$cover->[1]}); print "cover",$coverset->run_list,"contains sets",join(",",@ids); }
This module extends the Set::IntSpan
module by Steve McDougall. It implementing methods that are specific to islands, holes and covers. Set::IntSpan::Island
inherits from Set::IntSpan.
An integer set, as represented by Set::IntSpan
, is a collection of islands (or spans) on the number line
...-----xxxx----xxxxxxxx---xxxxxxxx---xx---x----....
Holes are regions not in the set that fall between adjacent spans. For example, the integer set above is composed of 5 islands and 4 holes. The two infinite regions on either side of the set are not counted as holes within the context of this module.
Constructs a set using the set specification as supported by Set::IntSpan
.
Extension to Set::IntSpan
new
method, this double-argument version creates a set formed by the range a-b. This is equivalent to
$set = Set::IntSpan::Island->new("$a-$b")
but permits initialization from a list instead of a string. The arguments $a and $b are expected to be integers - any decimal component will be truncated.
new(1.2,2.9) equivalent to new(1,2)
Creates a copy of $set
. Also accessible using $set-
duplicate()>;
Same as clone()
.
Returns the size of intersection of two sets. Equivalent to
$set->intersect( $another_set )->size;
The returned value is either 0 (if the sets do not overlap) or positive (if they do).
Returns the distance between sets, measured as follows. If the sets overlap, then the distance is negative and given by
$d = -$set->overlap( $another_set )
If the sets abut, $d
is 1. Here $d can be interpreted as the difference between the closest edges of the two sets.
The above generalizes to 1+size(hole) if the sets do not overlap and are composed of multiple islands. The hole used is the one between two closest islands of the sets.
Returns undef
if $another_set
is not defined, or either $set
or $another_set
is empty.
Here are some examples of how the distance is calculated.
A ----xxxx---xxx-----xx-- B ------xxx------xx--x--- !! ! d=-3 A ----xxxx---xxx-----xx-- B ----xxxx---xxx--------- !!!! !!! d=-7 A ----xxxx---xxx-----xx-- B --------------x-------- >< d=1 A ----xxxx---xxx-----xx-- B ---------------x------- > < d=2 A ----xxxx---xxx-----xx-- B ---------------xx------ > < d=2 A ----xxxx---xxx-----xx-- B ---------------xxxx---- >< d=1
Returns all spans in $set as Set::IntSpan::Island
objects. This method overrides the sets
method in Set::IntSpan
in order to return sets as Set::IntSpan::Island objects.
Removes all islands smaller than $minlength
. If $minlength
< 1 then no elements are removed and a copy of the set is returned. Since only islands smaller than $minlength
are removed, the smallest useful value for $minlength
is 2.
If passed a set $size_set
, removes all islands whose size is found in $size_set
. This extended functionality allows you to pass in arbitrary size cutoffs. For example, to remove islands of size <=10
$new_set = $set->excise( Set::IntSpan->( "(-10" ) )
or to remove islands of size 2-10
$new_set = $set->excise( Set::IntSpan->( "2-10" ) )
Since size of an island must be non-zero and positive, any negative elements in the size set will be ignored. The two are therefore equivalent
$new_set = $set->excise( Set::IntSpan->( "2-10" ) ) $new_set = $set->excise( Set::IntSpan->( "(--1,2-10" ) )
Using a size set allows you to excise islands larger than a certain size. For example, to remove all islands 10 or bigger,
$new_set = $set->excise( Set::IntSpan->( "10-)" ) )
Regardless of input, if all islands are excised (i.e. all elements from $set are removed), this function will return an empty set.
Contrast excise()
to keep()
. Use excise()
when you have a set of island sizes you want to remove. Use keep()
when you have a set of island sizes you want to keep. In other words, these are equivalent:
$set->excise( $size_set ) $set->keep( $size_set->complement )
Strictly speaking, you can pass in any object as a size limiter, as long as it implements a member()
function which returns 1 if the size is in the cutoff set and 0 otherwise.
$filter = Some::Other::Module->new(); # set $filter parameters according to Some::Other::Module API... ... # $filter must implement "member" function $filter->can("member") if($filter->member(10)) { print "islands of size 10 will be removed"; } else { print "islands of size 10 will be kept"; } $set->excise($filter);
If passed an integer $maxlength
, removes all islands larger than $maxlength
.
If passed a set $size_set
, removes all islands whose size is not found in $size_set
. For example, to keep all islands sized 10 or larger,
$new_set = $set->keep( Set::IntSpan->( "10-)" ) )
or keep all islands sized 2-10
$new_set = $set->excise( Set::IntSpan->( "2-10" ) )
Returns an empty set if no islands are kept.
Since size of an island must be non-zero and positive, any negative elements in the size set will be ignored. The two are therefore equivalent
$new_set = $set->keep( Set::IntSpan->( "2-10" ) ) $new_set = $set->keep( Set::IntSpan->( "(--1,2-10" ) )
Contrast keep()
to excise()
. Use keep()
when you have a set of island sizes you want to keep. Use excise()
when you have a set of island sizes you want to remove. In other words, these are equivalent:
$set->keep( $size_set ) $set->excise( $size_set->complement )
Strictly speaking, you can pass in any object as a size limiter, as long as it implements a member()
function which returns 1 if the size is in the cutoff set and 0 otherwise. See the description of excise()
for details.
If passed an integer $maxsize
, fills in all holes in $set smaller than $maxsize
.
If passed a set $size_set
, fills in all holes whose size appears in $size_set
.
Strictly speaking, you can pass in any object as a size limiter, as long as it implements a member()
function which returns 1 if the size is in the cutoff set and 0 otherwise. See the description of excise()
for details.
Returns a set composed of islands from $set that overlap with $integer
or $another_set
.
If an integer is passed and $integer
is not in $set
, an empty set is returned.
If a set is passed and $set
and $another_set
have an empty intersection, an empty set is returned.
set ----xxxx---xxx-----xx-- another_set ------------x---------- island_set -----------xxx--------- set ----xxxx---xxx-----xx-- another_set ------------xxxxx------ island_set -----------xxx--------- set ----xxxx---xxx-----xx-- another_set ------------xxxxx---xx- island_set -----------xxx-----xx--
Contrast this to nearest_island() which returns the closest island(s) that do not overlap with $integer
or $another_set
.
Returns the island(s) in $set
closest (but not overlapping) to $integer
or $another_set
. If $integer
or $another_set
lie exactly between two islands, then the returned set contains these two islands.
If no non-overlapping islands in $set are found, an empty set is returned.
set ----xxxx---xxx-----xx-- another_set ------------x---------- island_set ----xxxx--------------- set ----xxxx---xxx-----xx-- another_set ------------xxxxx------ island_set -------------------xx-- set ----xxxx---xxx-----xx-- another_set ----------xxxxxxx------ island_set ----xxxx-----------xx--
If $another_set contains multiple islands, such as below, $island_set may also contain multiple islands.
set ----xxxx---xxx-----xx-- another_set ---x----xxx------------ island_set ----xxxx---xxx---------
Contrast this to find_islands()
which returns the island(s) that overlap with $integer
or $another_set
.
Returns the number of islands in the set. If the set is empty, 0 is returned.
Returns the island indexed by $island_index. Islands are 0-indexed. For a set with N islands, the first island (ordered left-to-right) has index 0 and the last island has index N-1.
If $island_index is negative, counting is done back from the last island.
If $island_index is beyond the last island, undef is returned.
Returns the first island of the set. As a side-effect, sets the iterator to the first island.
If the set is empty, returns undef.
Returns the last island of the set. As a side-effect, sets the iterator to the last island.
If the set is empty, returns undef.
Advances the iterator forward by one island, and returns the next island. If the iterator is undefined, the first island is returned.
Returns undef if the set is empty or if no more islands are available.
Reverses the iterator backward by one island, and returns the previous island. If the iterator is undefined, the last island is returned.
Returns undef if the set is empty or if no more islands are available.
Returns the island at the current iterator position.
Returns undef if the set is empty or if the iterator is not defined.
Given a $set_hash
reference
{ id1=>$set1, id2=>$set2, ..., idn=>$setn}
where $setj
is a finite Set::IntSpan::Island
object and idN
is a unique key, extract_covers
performs an exhaustive intersection of all sets and returns a list of all covers and set memberships. For example, given the id/runlist combination
a 10-15 b 12 c 14-20 d 25
The covers are
10-11 a 12 a b 13 a 14-15 a c 16-20 c 21-24 - 25 d
The cover data is returned as an array reference and its structure is
[ [ $cover_set1, [ id11, id12, id13, ... ] ], [ $cover_set2, [ id21, id22, id23, ... ] ], ... ]
If a cover contains no elements, then its entry is
[ $cover_set, [ ] ]
Martin Krzywinski <martink@bcgsc.ca>
Set::IntSpan
)Now inherits from Set::IntSpan vis parent
pragma.
Added clone() as an alias to duplicate().
On error, now confess
is used instead of croak
.
Expanded testing, now with Test::More.
Minor style adjustments in documentation.
Minor cosmetic fixes.
Modified excise(), distance() and fill(). Added keep().
More comprehensive extract_cover testing after bug in v0.01 was reported.
Added island iterator.
Release.
Set::IntSpan
by Steven McDougall
Copyright (c) 2007 by Martin Krzywinski. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.