NAME

Search::FreeText::LexicalAnalysis::Tokenize - lexicon tokenizer

DESCRIPTION

A pseudo-filter that should always be the first element in the lexical processing system. Like the other filters, it can be overridden. Called with a reference to an array holding the entire input string, it returns a new array containing the list of words.

SYNOPSIS

 my $tokenizer = Search::FreeText::LexicalAnalysis::Tokenize->new();
 my $words = $tokenizer->process($oldwords);

METHODS

$self->initialize();

Called when the lexicon system is initialised. This method currently does very little, although it could compile and cache anything needed later if that seemed appropriate.

$self->process($oldwords);

Called with a reference to an array of strings (in practice, a single string), which is tokenized into words for further lexical processing.
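
The calling convention described above can be illustrated with a minimal, self-contained sketch. It is not the module's actual implementation: the package name My::SketchTokenizer is hypothetical, and the word pattern (runs of \w characters) is an assumption chosen only to make the example concrete. It shows initialize() caching a compiled pattern and process() turning a reference to an array of strings into a reference to a new array of words.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tokenizer filter; NOT the code of
# Search::FreeText::LexicalAnalysis::Tokenize.
package My::SketchTokenizer;

sub new {
    my ($class) = @_;
    my $self = bless {}, $class;
    $self->initialize();
    return $self;
}

# Compile and cache the word pattern once.
# The pattern itself is an assumption made for this sketch.
sub initialize {
    my ($self) = @_;
    $self->{pattern} = qr/\w+/;
    return;
}

# Takes a reference to an array of strings and returns a reference
# to a new array holding every word found in those strings.
sub process {
    my ($self, $oldwords) = @_;
    my $pattern = $self->{pattern};
    my @words;
    for my $string (@$oldwords) {
        # In list context, a /g match returns all matches.
        push @words, $string =~ /$pattern/g;
    }
    return \@words;
}

package main;

my $tokenizer = My::SketchTokenizer->new();
my $words     = $tokenizer->process(["The quick, brown fox."]);
print join(" ", @$words), "\n";    # prints "The quick brown fox"
```

A subclass that overrides initialize() or process() could swap in a different pattern or splitting rule, which is one way to read the note that the filter can be overridden.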

AUTHOR

Stuart Watt <S.N.K.Watt@rgu.ac.uk>

Copyright (c) 2003 The Robert Gordon University. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.