The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Set::Scalar - basic set operations

SYNOPSIS

    use Set::Scalar;
    $s = Set::Scalar->new;
    $s->insert('a', 'b');
    $s->delete('b');
    $t = Set::Scalar->new('x', 'y', $z);

DESCRIPTION

Creating

    $s = Set::Scalar->new;
    $s = Set::Scalar->new(@members); 

    $t = $s->clone;
    $t = $s->copy; # clone of clone

Modifying

    $s->insert(@members);
    $s->delete(@members);
    $s->invert(@members); # insert if hasn't, delete if has

    $s->clear; # removes all the elements

Note that clear() only releases the memory used by the set to be reused by Perl; it will not reduce the overall memory use.

Displaying

    print $s, "\n";

The display format of a set is the members of the set separated by spaces and enclosed in parentheses ().

You can even display recursive sets.

See "Customising Display" for customising the set display.

Querying

    @members  = $s->members;
    @elements = $s->elements; # alias for members

    $size = $s->size; # the number of members

    $s->has($m)       # return true if has that member
    $s->contains($m)  # alias for has

    if ($s->has($member)) { ... }

    $s->member($m)    # returns the member if has that member
    $s->element($m)   # alias for member

    $s->is_null       # returns true if the set is empty
    $s->is_empty      # alias for is_null
    $s->is_universal  # returns true if the set is universal

    $s->null          # the null set
    $s->empty         # alias for null
    $s->universe      # the universe of the set

Deriving

    $u = $s->union($t);
    $i = $s->intersection($t);
    $d = $s->difference($t);
    $e = $s->symmetric_difference($t);
    $v = $s->unique($t);
    $c = $s->complement;

These methods have operator overloads:

    $u = $s + $t; # union
    $i = $s * $t; # intersection
    $d = $s - $t; # difference
    $e = $s % $t; # symmetric_difference
    $v = $s / $t; # unique
    $c = -$s;     # complement

Both the symmetric_difference and unique are symmetric on all their arguments. For two sets they are identical but for more than two sets beware: symmetric_difference returns true for elements that are in an odd number (1, 3, 5, ...) of sets, unique returns true for elements that are in one set.

Some examples of the various set differences:

    set or difference                   value

    $a                                  (a b c d e)
    $b                                  (c d e f g)
    $c                                  (e f g h i)

    $a->difference($b)                  (a b)
    $a->symmetric_difference($b)        (a b f g)
    $a->unique($b)                      (a b f g)

    $b->difference($a)                  (f g)
    $b->symmetric_difference($a)        (a b f g)
    $b->unique($a)                      (a b f g)

    $a->difference($b, $c)              (a b)
    $a->symmetric_difference($b, $c)    (a b e h i)
    $a->unique($b, $c)                  (a b h i)

Comparing

    $eq = $s->is_equal($t);
    $dj = $s->is_disjoint($t);
    $pi = $s->is_properly_intersecting($t);
    $ps = $s->is_proper_subset($t);
    $pS = $s->is_proper_superset($t);
    $is = $s->is_subset($t);
    $iS = $s->is_superset($t);

    $cmp = $s->compare($t);

The compare method returns a string from the following list: "equal", "disjoint", "proper subset", "proper superset", "proper intersect", and in future (once I get around implementing it), "disjoint universes".

These methods have operator overloads:

    $eq = $s == $t; # is_equal
    $dj = $s != $t; # is_disjoint
    # No operator overload for is_properly_intersecting.
    $ps = $s < $t;  # is_proper_subset
    $pS = $s > $t;  # is_proper_superset
    $is = $s <= $t; # is_subset
    $iS = $s >= $t; # is_superset

    $cmp = $s <=> $t;

Boolean contexts

In Boolean contexts such as

    if ($set) { ... }
    while ($set1 && $set2) { ... }

the size of the $set is tested, so empty sets test as false, and non-empty sets as true.

Iterating

    while (defined(my $e = $s->each)) { ... }

This is more memory-friendly than

    for my $e ($s->elements) { ... }

which would first construct the full list of elements and then walk through it: the $s->each handles one element at a time.

Analogously to using normal each(%hash) in scalar context, using $s->each has the following caveats:

  • The elements are returned in (apparently) random order. So don't expect any particular order.

  • When no more elements remain undef is returned. Since you may one day have elements named 0 don't test just like this

        while (my $e = $s->each) { ... }          # WRONG

    but instead like this

        while (defined(my $e = $s->each)) { ... } # right

    (An undef as a set element doesn't really work, you get "".)

  • There is one iterator per one set which is shared by many element-accessing interfaces-- using the following will reset the iterator: elements(), insert(), members(), size(), unique(). insert() causes the iterator of the set being inserted (not the set being the target of insertion) becoming reset. unique() causes the iterators of all the participant sets becoming reset. The iterator getting reset most probably causes an endless loop. So avoid doing that.

  • Modifying the set during the iteration may cause elements to be missed or duplicated, or in the worst case, an endless loop; so don't do that, either.

Customising Display

If you want to customise the display routine you will have to modify the as_string callback. You can modify it either for all sets by using as_string_callback() as a class method:

    my $class_callback = sub { ... };

    Set::Scalar->as_string_callback($class_callback);

or for specific sets by using as_string_callback() as an object method:

    my $callback = sub  { ... };

    $s1->as_string_callback($callback);
    $s2->as_string_callback($callback);

The anonymous subroutine gets as its first (and only) argument the set to display as a string. For example to display the set $s as a-b-c-d-e instead of (a b c d e)

    $s->as_string_callback(sub{join("-",sort $_[0]->elements)});

If called without an argument, the current callback is returned.

If called as a class method with undef as the only argument, the original callback (the one returning (a b c d e)) for all the sets is restored, or if called for a single set the callback is removed (and the callback for all the sets will be used).

CAVEATS

The first priority of Set::Scalar is to be a convenient interface to sets. While not designed to be slow or big, neither has it been designed to be fast or compact.

SEE ALSO

Set::Bag for bags (multisets, counted sets), and Bit::Vector for fast set operations (you have to take care of the element name to bit number and back mappings yourself), or Set::Infinite for sets of intervals, and many more. CPAN is your friend.

AUTHOR

Jarkko Hietaniemi <jhi@iki.fi>

COPYRIGHT AND LICENSE

Copyright 2001,2002,2003,2004 by Jarkko Hietaniemi

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.