NAME

Search::Tools::HeatMap - locate the best matches in a snippet extract

SYNOPSIS

 use Search::Tools::Tokenizer;
 use Search::Tools::HeatMap;

 my $my_string = 'some interesting text to search';    # placeholder input
 my $max_snips = 5;    # placeholder cap; Snipper uses its own occur() value

 # tokenize, flagging tokens that match the regex as "hot"
 my $tokenizer = Search::Tools::Tokenizer->new();
 my $tokens = $tokenizer->tokenize( $my_string, qr/^(interesting)$/ );
 my $heatmap = Search::Tools::HeatMap->new(
     tokens         => $tokens,
     window_size    => 20,  # default
     as_sentences   => 0,   # default
 );

 if ( $heatmap->has_spans ) {

     my $tokens_arr = $tokens->as_array;

     # stringify positions
     my @snips;
     for my $span ( @{ $heatmap->spans } ) {
         push( @snips, $span->{str} );
     }

     # keep at most $max_snips snippets
     my $occur_index = $max_snips - 1;
     if ( $#snips > $occur_index ) {
         @snips = @snips[ 0 .. $occur_index ];
     }
     printf("%s\n", join( ' ... ', @snips ));

 }

DESCRIPTION

Search::Tools::HeatMap implements a simple algorithm for locating the densest clusters of unique, hot terms in a TokenList.
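
The general idea of looking for a dense cluster can be illustrated with a naive sliding-window scan. This is only a sketch of the concept, not the code HeatMap actually uses: the function name best_window(), its arguments, and the sample tokens are all hypothetical, and it works on a plain array of strings rather than a TokenList.

```perl
use strict;
use warnings;

# Illustrative only: a naive scan, NOT the module's implementation.
# best_window() and its arguments are hypothetical names.
sub best_window {
    my ( $tokens, $hot_re, $window_size ) = @_;

    my $last_start = @$tokens - $window_size;
    $last_start = 0 if $last_start < 0;

    my $first_end = $window_size - 1;
    $first_end = $#$tokens if $first_end > $#$tokens;
    my %best = ( start => 0, end => $first_end, unique => 0 );

    for my $start ( 0 .. $last_start ) {
        my $end = $start + $window_size - 1;
        $end = $#$tokens if $end > $#$tokens;

        # count the distinct hot terms inside this window
        my %seen;
        for my $tok ( @{$tokens}[ $start .. $end ] ) {
            $seen{ lc $tok }++ if $tok =~ $hot_re;
        }
        my $unique = scalar keys %seen;

        # strict comparison keeps the earliest window on ties
        if ( $unique > $best{unique} ) {
            %best = ( start => $start, end => $end, unique => $unique );
        }
    }
    return \%best;
}

my @tokens = qw(the quick brown fox saw a lazy dog and the fox ran from the dog);
my $best   = best_window( \@tokens, qr/^(?:fox|dog)$/i, 4 );
print join( ' ', @tokens[ $best->{start} .. $best->{end} ] ), "\n";
```

The sketch ranks windows by the number of distinct hot terms only. How HeatMap itself scores a cluster is not described in this document, so treat the ranking rule here as an assumption.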

HeatMap is used internally by Snipper but documented here in case someone wants to abuse and/or improve it.

METHODS

new( tokens => TokenList )

Create a new HeatMap. The TokenList object may be either a Search::Tools::TokenList or Search::Tools::TokenListPP object.

init

Builds the HeatMap object. Called internally by new().

window_size

The maximum width of a span, in tokens. Defaults to 20 tokens, including the matches.

Set this in new(). You can read it later if you need to, but the spans will have already been created by new().

as_sentences

Try to match clusters at sentence boundaries. Default is false.

Set this in new().

spans

Returns an array ref of matching clusters. Each span in the array is a hash ref with the following keys:

cluster
pos
heat
str
str_w_pos (available only if debug() is true)
unique
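
A minimal sketch of consuming the spans structure follows. It uses mock hash refs in place of a real $heatmap->spans return value, with invented values, and it assumes that heat is numeric and that a larger value means a denser cluster, which this document does not state. The $max cap is also hypothetical.

```perl
use strict;
use warnings;

# Mock data standing in for $heatmap->spans; the values are invented.
my $spans = [
    { heat => 2, str => 'an interesting aside' },
    { heat => 5, str => 'interesting and more interesting' },
    { heat => 3, str => 'interesting again' },
];

# Assumption: heat is numeric and larger means a denser cluster.
my @by_heat = sort { $b->{heat} <=> $a->{heat} } @$spans;

# keep at most $max snippets (hypothetical cap)
my $max = 2;
$max = @by_heat if $max > @by_heat;
my @snips = map { $_->{str} } @by_heat[ 0 .. $max - 1 ];

print join( ' ... ', @snips ), "\n";
```

Whether spans() already returns its clusters in any particular order is not documented here, so the explicit sort simply avoids relying on it.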

has_spans

Returns the number of spans found.

AUTHOR

Peter Karman <karman at cpan dot org>

ACKNOWLEDGEMENTS

The idea of the HeatMap comes from KinoSearch, though the implementation here is original.

BUGS

Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Search::Tools

COPYRIGHT

Copyright 2009 by Peter Karman.

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

KinoSearch