Sort::Key::Merger - Perl extension for merging sorted things
    use Carp;    # provides croak
    use Sort::Key::Merger qw(keymerger);

    sub line_key_value {
        # $_[0] is available as a scratchpad that persists
        # between calls for the same $_;
        unless (defined $_[0]) {
            # so we use it to cache the file handle when we
            # open a file on the first read
            open $_[0], "<", $_
                or croak "unable to open $_";
        }

        # don't get confused by this while loop, it's only
        # used to ignore empty lines
        my $fh = $_[0];
        local $_; # break $_ aliasing;
        while (<$fh>) {
            next if /^\s*$/;
            chomp;
            if (my ($key, $value) = /^(\S+)\s+(.*)$/) {
                return ($value, $key)
            }
            warn "bad line $_"
        }

        # signals the end of the data by returning an
        # empty list
        ()
    }

    # create a merger object; the scratchpad is passed on
    # explicitly so that line_key_value sees it as its own $_[0]:
    my $merger = keymerger { line_key_value($_[0]) } @ARGV;

    # sort and write the values:
    my $value;
    while (defined($value = $merger->())) {
        print "value: $value\n"
    }
Several backward-incompatible changes were introduced in version 0.10:

- filekeymerger callbacks are now called in list context
- the order of the return values of keymerger callbacks has changed
- in list context only the next value is returned by default, instead of all the remaining ones
Sort::Key::Merger merges presorted collections of data based on some (calculated) keys.
Given
The following functions are available from this module:
creates a merger object for the given @sources collections.
Every item in @sources is aliased by $_ and then the user-defined subroutine GENERATE_VALUE_KEY_PAIR is called. The result from that callback should be a (value, key) pair. Keys are used to determine the order in which the values are sorted.
GENERATE_VALUE_KEY_PAIR can return an empty list to indicate that a source has become exhausted.
The result from keymerger is another subroutine that works as a generator. It can be called as:
    my $next = $merger->();
    my @next = $merger->($n);
In scalar context it returns the next value, or undef if all the sources have been exhausted. In list context it returns the next $n values (1 is used as the default value for $n).
If your data can contain undef values, you should iterate over the sorted values as follows:
    my $merger = keymerger ...;
    while (my ($next) = $merger->()) {
        # do whatever with $next
        # ...
    }
Passing -1 makes the function return all the remaining values:
my @remaining = $merger->(-1);
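The calling conventions described above can be tried out without the module installed. What follows is a minimal pure-Perl sketch of the same protocol, not the module's implementation: sketch_merger and $next_pair are hypothetical names, keys are compared as strings with cmp, and the smallest pending key is found with a plain sort rather than whatever structure the module uses internally.

```perl
use strict;
use warnings;

# Hypothetical stand-in for keymerger, for illustration only.
sub sketch_merger {
    my ($callback, @sources) = @_;
    my @pad = (undef) x @sources;    # one scratchpad per source
    my @head;                        # pending [value, key, source index]

    my $pull = sub {
        my ($i) = @_;
        for ($sources[$i]) {         # alias $_ to the source
            my @vk = $callback->($pad[$i]);
            push @head, [@vk[0, 1], $i] if @vk;   # empty list: exhausted
        }
    };
    $pull->($_) for 0 .. $#sources;

    return sub {
        my $n = @_ ? $_[0] : 1;      # default is one value; -1 means all
        my @out;
        while (@head and ($n < 0 or @out < $n)) {
            # index of the pending entry with the smallest key
            my ($min) = sort { $head[$a][1] cmp $head[$b][1] } 0 .. $#head;
            my ($value, undef, $i) = @{ splice @head, $min, 1 };
            push @out, $value;
            $pull->($i);             # refill from the source just consumed
        }
        return wantarray ? @out : $out[0];
    };
}

# Callback for array-reference sources: each element is its own key.
my $next_pair = sub {
    return () unless @$_;            # source exhausted
    my $v = shift @$_;
    return ($v, $v);                 # (value, key)
};

my $merger = sketch_merger($next_pair,
                           [qw(apple cherry fig)], [qw(banana date)]);

my $first = $merger->();             # scalar context: "apple"
my @two   = $merger->(2);            # ("banana", "cherry")
my @rest  = $merger->(-1);           # ("date", "fig")
print "$first | @two | @rest\n";     # prints "apple | banana cherry | date fig"
```

Once every source is exhausted the sketch returns undef in scalar context and an empty list in list context, which is why the list-assignment form of the while loop terminates correctly even when undef is a legitimate value.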
NOTE: an additional argument is passed to the GENERATE_VALUE_KEY_PAIR callback in $_[0]. It is meant to be used as a scratchpad: its value is associated with the current source and persists between calls from the same generator, e.g.:
    my $merger = keymerger {
        # use $_[0] to cache an open file handler:
        $_[0] or open $_[0], '<', $_
            or croak "unable to open $_";
        my $fh = $_[0];
        local $_;
        while (<$fh>) {
            chomp;
            return $_ => $_;
        }
        ();
    } ('/tmp/foo', '/tmp/bar');
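The scratchpad works because the elements of @_ are aliases to the caller's arguments, so assigning to $_[0] inside a callback changes the variable that was passed in. The following sketch shows only that core Perl mechanism; @pads and $counter are hypothetical names, and how the module stores its scratchpads internally is not shown here.

```perl
use strict;
use warnings;

# One slot per imaginary source; both start out undefined.
my @pads = (undef, undef);

# A callback that keeps a per-source call count in its scratchpad.
my $counter = sub {
    $_[0] = 0 unless defined $_[0];   # initialise on first use
    return ++$_[0];                   # writes through the alias
};

$counter->($pads[0]) for 1 .. 3;      # three calls for the first source
$counter->($pads[1]);                 # one call for the second source

print "@pads\n";                      # prints "3 1"
```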
This function honours the use locale pragma.
is like keymerger but compares the keys numerically.
This function honours the use integer pragma.
Similar to keymerger but compares the keys as integers.
Compares the keys as unsigned integers.
performs the sorting in reverse order.
returns a merger subroutine that returns lines read from @files sorted by the keys that generate_key generates.
@files can contain file names or handles for already open files.
generate_key is called with the line just read in $_ and has to return the sorting key for it. If its return value is undef, the line is ignored.
The line can be modified inside generate_key by changing $_, e.g.:
    my $merger = filekeymerger {
        chomp($_); # <== here
        return undef if /^\s*$/;
        substr($_, -1, 10)
    } @ARGV;
Finally, $/ can be changed from its default value to read the files in chunks other than lines.
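As an illustration of what changing $/ does at the Perl level, the following sketch reads an in-memory file handle in paragraph mode, so that each record is a block of lines separated by blank lines. It uses only core Perl and does not call filekeymerger; the sample text is made up, and the point at which the module consults $/ is not shown here.

```perl
use strict;
use warnings;

my $text = "alpha 1\nalpha 2\n\nbeta 1\n";

# Read from an in-memory file handle so that no files are needed.
open my $fh, '<', \$text
    or die "cannot open in-memory handle: $!";

my @records;
{
    local $/ = '';                   # paragraph mode
    while (my $record = <$fh>) {
        chomp $record;               # strips all trailing newlines in this mode
        push @records, $record;
    }
}
close $fh;

print scalar(@records), " records\n";    # prints "2 records"
```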
The return value from this function is a subroutine reference that on successive calls returns the sorted elements in the same fashion as the iterator returned from keymerger.
    my $merger = filekeymerger { (split)[0] } @ARGV;
    while (my ($next) = $merger->(1)) {
        ...
    }
is like filekeymerger but the keys are compared numerically.
similar to filekeymerger but compares the keys as integers.
similar to filekeymerger but compares the keys as unsigned integers.
performs the sorting in reverse order.
This function generates a multikey merger.
GENERATE_VALUE_KEYS_LIST should return a list with the next value from the source passed in $_ and the sorting keys.
@types is an array with the key sorting types (see the Sort::Key multikey sorting documentation for a discussion of the supported types).
For instance:
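    my $merger = multikeymerger {
        # $_ is the current source, taken here to be an array
        # reference; an empty list marks it as exhausted
        # (as with keymerger)
        my $v = shift @$_
            or return ();
        my $name = $v->name;
        my $age = $v->age;
        ($v, $age, $name)
    } [qw(-integer string)], @data_sources;

    while (my ($next) = $merger->()) {
        print "$next\n";
    }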
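To make the resulting order concrete, here is a sketch that uses the core sort function instead of the module. It assumes that the leading minus in -integer selects descending order, as the Sort::Key documentation describes for multikey types, so values are ordered by age from highest to lowest and then by name as a string. The @people data is made up for the example.

```perl
use strict;
use warnings;

my @people = (
    { name => 'carol', age => 30 },
    { name => 'alice', age => 30 },
    { name => 'bob',   age => 41 },
);

my @sorted = sort {
    $b->{age} <=> $a->{age}          # -integer: descending numeric (assumed)
        or
    $a->{name} cmp $b->{name}        # string: ascending
} @people;

print join(', ', map { "$_->{name} ($_->{age})" } @sorted), "\n";
# prints "bob (41), alice (30), carol (30)"
```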
Sort::Key, Sort::Key::External, locale, integer, perl core sort function.
Copyright (C) 2005, 2007 by Salvador Fandiño, <sfandino@yahoo.com>.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.