Frank J Wojcik > Heap-Priority > Heap::Priority

Download:
Heap-Priority-0.11.tar.gz

Dependencies

Annotate this POD

CPAN RT

Open  0
Report a bug
Module Version: 0.11   Source  

NAME ^

Heap::Priority - Implements a priority queue or stack

SYNOPSIS ^

    use Heap::Priority;
    my $h = new Heap::Priority;

    $h->add($item,[$priority]); # add an item to the list
    $item = $h->pop;            # get an item back from the list

    $h->fifo;                   # set first in first out (ie. a queue) (default)
    $h->lilo;                   # synonym for fifo()
    $h->lifo;                   # set last in first out (ie. a stack)
    $h->filo;                   # synonym for lifo()
    $h->highest_first;          # set pop() in high to low priority order (default)
    $h->lowest_first;           # set pop() in low to high priority order

    $numitems  = $h->count;     # get the number of items in the list
    $priority  = $h->next_level;# get the next item's priority that would be popped
    $item      = $h->next_item; # get the next item that would be popped
    @levels    = $h->get_priority_levels;  # get the existing priority levels
    @items     = $h->get_level($priority); # get the items at a specific level
    @all_items = $h->get_list;  # get all the items on the list
    @all_items = $h->get_heap;  # synonym for get_list()

    $h->delete_item($item,[$priority]);    # delete an item from the list
    $h->delete_priority_level($priority);  # delete all items at a priority level

    $h->raise_error($err_level);# change the severity of errors
    $error_string = $h->err_str;# get the error strings
    $h->reset_err_str;          # clear the error strings

DESCRIPTION ^

This module implements a priority queue or stack. It will be referred to here as a 'priority list' or 'priority heap'. The main functions are add() and pop() which add and remove from the priority list according to the rules you choose. When you add() an item to the priority list you can assign a priority level to the item or let the priority level default to 0. Multiple instances of the same item are permitted in the priority list, either on the same or different priority levels. A priority level must be a number, positive or negative. Basically, any number you can fit into a scalar. If you use floats, though, watch out for stringification issues.

What happens when you call pop() depends on the configuration you choose. By default the highest priority items will be popped off in First-In-First-Out order. fifo() and lifo() set First-in-First-Out and Last-In-First-Out respectively. lilo() and filo() are synonyms for fifo() and lifo() respectively. Calling fifo(), filo(), lifo(), or lilo() at any time is legal, and will not change the contents of the list.

highest_first() and lowest_first() allow you to choose to pop() the highest priority items first or the lowest priority items first. In the current implementation, this operation can be relatively slow, depending on the number of different priority levels in the priority list.

There are several functions to inspect the priority list. None of these will change the priority list in any way. count() will return the number of items remaining on the priority list. Information about the next item that pop() will pull off of the priority list can be gotten via next_level() and next_item(). get_priority_levels() will return the list of priority levels which exist (have items assigned to them) in the priority list. get_level() will return the list of items at a given priority level. get_list() will return the complete set of items in the priority list. get_heap() is a synonym for get_list(); see "INTERFACE COMPATIBILITY". Items gotten via get_level(), get_list(), or get_heap() will be returned in the order they will/would be returned via pop().

There are two ways to manipulate the priority list, outside of the usual pop(). delete_item() will delete all instances of an item at a given priority level. If a priority level is not specified, then all instances of the item at every priority level will be deleted. This can be very slow. See "EFFICIENCY". delete_priority_level() will delete all items at a given priority level. These methods are not deprecated, but work "outside" of the usual purpose of a priority list. If you find your program using them, perhaps you should consider using a different data structure.

Error handling is modified via raise_error(). When an error is generated, the default is to record it, and continue on silently. This can be changed by calling raise_error() with an error level. If the error level is set to 1, errors will be warned about (via carp()). If the error level is set to 2, errors will be fatal (via croak()). Default behavior can be reset by setting the error level to 0. Error messages produced can be accessed through err_str(). Error messages can be cleared by calling reset_err_str().

INTERFACE COMPATIBILITY ^

This interface is designed to be almost 100% drop-in compatible with Heap::Priority 0.01 by Dr James Freeman <jfreeman@tassie.net.au>. About 20-30% of the code and documentation is extremely similar to the original package.

The major exception is that the modify_priority() function is not implemented. It is not currently possible to change the priority of an item. To workaround this, you can explicitly do what the previous implementation implicitly did, which is to delete all instances of the item and add in a new one (via delete_item() and add()).

Further, the terminology of that package was somewhat looser than I would have liked, hence the synonyms of 'list' and 'heap'. The error messages returned by err_str() are almost identical; one "on heap!" was changed to an "in heap!". Exact error texts should not be relied upon to stay the same in future revisions.

OBJECT INTERFACE ^

This is an OO module. You begin by creating a new heap object

    use Heap::Priority;
    my $h = new Heap::Priority;

You then simply call methods on your heap object:

    $h->lowest_first();         # set pop() in low to high priority order
    $h->add($item,$priority);   # add an item to the heap
    $h->lifo;                   # set last in first out (ie. a stack)
    my $item = $h->pop;         # get an item back from heap

METHODS ^

new()

    my $h = new Heap::Priority;

The constructor takes no arguments and simply returns an empty default list. The default configuration is FIFO (ie. a queue) with highest priority items popped first

add($item,[$priority])

    $h->add($item, [$priority]);

add() will add $item to the heap. Optionally, it may be assigned a $priority level (default priority level is 0).

pop()

    my $item = $h->pop;

pop() takes no arguments. In default configuration pop() will return those items having the highest priority level first in FIFO order. This behavior can be modified using the methods outlined below.

fifo()

    $h->fifo;

Set pop() to work on a First-In-First-Out basis, otherwise known as a queue. This is the default configuration.

lilo()

    $h->lilo;

This is a synonym for fifo().

lifo()

    $h->lifo;

Set pop() to work on a Last-In-First-Out basis, otherwise known as a stack.

filo()

    $h->filo;

This is a synonym for lifo().

highest_first()

    $h->highest_first;

Set pop() to retrieve items in highest to lowest priority order. This is the default configuration.

lowest_first()

    $h->lowest_first;

Set pop() to retrieve items in lowest to highest priority order.

count()

    my $numitems = $h->count;

Returns the number of items in the priority list. Much lighter weight than counting the items returned by get_list().

next_level()

    my $priority = $h->next_level;

Returns the priority level assigned to the next item that pop() would return. Does not modify the priority list.

next_item()

    my $item = $h->next_item;

Returns the next item that pop() would return. Does not modify the priority list. (Otherwise, it would be identical to pop())

get_priority_levels()

    my @levels = $h->get_priority_levels;

Returns the list of priority levels in current pop() order.

get_level($priority)

    my @items = $h->get_level($priority);

Returns the entire list of items assigned to the specified priority level in current pop() order.

get_list()

    my @all_items = $h->get_list;

Returns entire list of items in current pop() order. If you only want the number of items on the list, count() is probably faster.

get_heap()

    my @all_items = $h->get_heap;

Synonym for get_list(). See "INTERFACE COMPATIBILITY".

delete_item($item,[$priority])

    $h->delete_item($item,[$priority]);

This method will delete $item from the heap. If the optional $priority is not supplied all instances of $item will be removed from the heap. If $priority is supplied then only instances of $item at that priority level will be removed.

delete_priority_level($priority)

    $h->delete_priority_level($priority);

Delete all items assigned to $priority level.

raise_error($n)

    $h->raise_error(1);

Set error level $n => 2 = croak, 1 = carp, 0 = silent (default)

err_str()

    my $error_string = $h->err_str;

Return error string, if any.

reset_err_str()

    $h->reset_err_str;

Reset any errors that err_str() would have reported.

EXPORT ^

Nothing: it's an OO module.

IMPLEMENTATION DETAILS ^

Items are stored in lists, one list per priority level. These lists are stored in a hash, with the priority level as the key. The priority levels are stored in a true heap (via Heap::Fibonacci, Heap::Num, and Heap::NumRev). Therefore, theoretically, arbitrary scalars could be used as priority levels with a minimal amount of code change.

Currently, only one Heap::Fibonacci instance is stored persistently. The behavior when highest_first and lowest_first are called on non-empty priority lists is to create a new instance, and move the elements to the new heap, converting their type as appropriate (Heap::Num to/from Heap::NumRev). It would also be possible to concurrently store two heaps, one for highest_first and one for lowest_first. I think I've decided that the first option is better, because changing that particular horse in mid-stream is uncommon enough to justify the memory savings in the more usual cases. I could be convinced this is not true. As it stands, you should probably decide on priority list behavior before adding items.

The reason that modify_priority() is not implemented is that an item's position relative to the others' is not recorded, so strict fifo/lifo behavior could not be guaranteed. This is completely aside from the problem of selecting and specifying which instance of the item to modify, though "all of them" seems a not unreasonable behavior.

There is no test for item existence, no way to count the instances of a particular item on the list, no way to list the priorities an item is assigned to, no way delete specific instances of an item that is on the list multiple times, and no way to query the priority list about its current pop() ordering. Implementing these would require some combination of changing the interface, using a much more memory-hungry implementation, or making those calls very, very slow. I am considering writing a heavier-weight (more memory intensive) version of this called Heap::Priority::Super (or some-such) which does support those features. Another way to solve this problem would be to pass an optimization parameter on object instantiation. This may be simpler from a user point of view, but I think the code would become a mess of if statements. Comments are very welcome.

EFFICIENCY ^

The internal data storage representation was chosen to maximize speed for "usual" operations, while minimizing memory usage. List modification outside of add() and pop() might be slow; in particular, deleting an item completely from the list will require visiting every item on the list.

The representation is efficient for very large numbers of items, regardless of the number of distinct priority levels.

Benchmarks (on a 750 MHz Pentium III, running FreeBSD 4.5 and perl 5.00503) indicate that add() and pop() take about 25 usec, best-case. Real-world testing on the same machine, with approximately 1,000,000 items 10,000 priority levels shows add() taking about 55 usec, and pop() taking about 50 usec, on average. It turns out that perl subroutine/method overhead is substantial compared to the rest of the program, and I was able to obtain significant speedups by 'reaching in' to the objects, bypassing accessor methods. This is not done in this release of the code. A future implementation may not use Heap::Fibonacci, but may hard-code a number-based heap implementation internally.

A complexity analysis of this code, assuming n items evenly distributed over m levels:

O(1)

new()
fifo()/lilo()
filo()/lifo()
count()
next_level()
next_item()

O(log m)

add()
pop()

O(m/n)

get_level()

O(m log m)

highest_first()
lowest_first()
get_priority_levels()
delete_priority_level()

O(m log m + n/m)

delete_item(), specific level

O(m log m + n)

get_list()/get_heap()

O(m^2 log m + n)

delete_item(), all levels

TODO ^

Add better tests for get_priority_levels() and get_level(). Add more tests with multiple items on the same level to test fifo/lifo correctness.

BUGS ^

There's always one bug. If you find it, please let me know.

SEE ALSO ^

Heap(3), Heap::Fibonacci(3), Heap::Elem::Num(3), Heap::Elem::NumRev(3)

AUTHOR ^

Frank J Wojcik <fwojcik+pri@besh.com>