NAME

Search::FreeText::LexicalAnalysis::Stem - lexicon interface to Lingua::Stem

DESCRIPTION

A filter which uses Lingua::Stem to implement the Porter stemming algorithm. It can then be included in a search system as part of both indexing and query processing.

The filter wraps Lingua::Stem rather than calling it directly, because Lingua::Stem turns nonwords into absolutely nothing at all. To overcome this, we stem only the words, and merge the nonwords back in once the words have been stemmed.
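The wrapping described above can be sketched roughly as follows. This is an illustration only, not the module's actual code: the helper name stem_preserving_nonwords and the pattern used to decide what counts as a word are assumptions made for the example. It relies only on Lingua::Stem's documented new() constructor and stem() method.

```perl
use strict;
use warnings;
use Lingua::Stem;

# Hypothetical helper, not taken from the module source: stem only the
# word-like tokens and leave every other token in its original position.
sub stem_preserving_nonwords {
    my ($stemmer, $oldwords) = @_;

    # Positions of tokens that look like words. The pattern is an
    # assumption: a letter followed by letters or apostrophes.
    my @word_idx = grep { $oldwords->[$_] =~ /^[A-Za-z][A-Za-z']*$/ }
                   0 .. $#{$oldwords};

    # Stem just those tokens; stem() returns an array reference.
    my $stemmed = $stemmer->stem(@{$oldwords}[@word_idx]);

    # Start from a copy of the input, then overwrite the word positions.
    my @newwords = @{$oldwords};
    for my $i (0 .. $#word_idx) {
        my $stem = $stemmed->[$i];
        # Keep the original token if the stemmer returned nothing for it.
        $newwords[ $word_idx[$i] ] = $stem
            if defined $stem && length $stem;
    }
    return \@newwords;
}

my $stemmer = Lingua::Stem->new(-locale => 'EN');
my $words   = stem_preserving_nonwords(
    $stemmer, [ 'indexing', '42', 'queries', '...' ],
);
print join(' ', @{$words}), "\n";
```

In this sketch the tokens '42' and '...' never reach Lingua::Stem, so they come back unchanged and in their original positions, while the word tokens are replaced by their stems.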

SYNOPSIS

 use Search::FreeText::LexicalAnalysis::Stem;

 my $stemmer = Search::FreeText::LexicalAnalysis::Stem->new();
 # $oldwords is a reference to an array of words
 my $words = $stemmer->process($oldwords);

METHODS

$self->initialize();

Called when the lexicon system is initialised. This method creates and stores the Lingua::Stem stemmer, and can be overridden if needed.

$self->process($oldwords);

Takes a reference to an array of words and returns a reference to an array of stemmed words for further processing. Words that cannot be stemmed are left in place. Wrapping Lingua::Stem to achieve this costs a little performance, but these are real words for indexing, so they must not be lost.

AUTHOR

Stuart Watt <S.N.K.Watt@rgu.ac.uk>

Copyright (c) 2003 The Robert Gordon University. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.