Tony Bowden > Plucene > Plucene::Analysis::LetterTokenizer

NAME

Plucene::Analysis::LetterTokenizer - Letter tokenizer

SYNOPSIS

        # isa Plucene::Analysis::CharTokenizer

DESCRIPTION

This is the letter tokenizer class, which divides text at non-letters.

Note: this does a decent job for most European languages, but a poor job for some Asian languages, where words are not separated by spaces.
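
To make "divides text at non-letters" concrete, here is a minimal standalone sketch. It does not use Plucene at all: letter_tokens is a hypothetical helper written for this illustration, and the use of \p{L} as the definition of "letter" is an assumption, not necessarily the character class the module itself uses. The second call shows the limitation from the note above: text written without spaces comes back as a single token.

```perl
use strict;
use warnings;
use utf8;                               # this source contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper, not part of Plucene: return every maximal run of
# letters in $text, which is equivalent to dividing the text at non-letters.
# Assumption: "letter" means the Unicode property \p{L}.
sub letter_tokens {
    my ($text) = @_;
    return $text =~ /(\p{L}+)/g;
}

# Punctuation, whitespace and digits all act as separators.
my @english = letter_tokens("Hello, world! It's 2024.");

# No spaces between words, and every character is a letter,
# so the whole sentence becomes one token.
my @japanese = letter_tokens("東京都に住んでいます");

print join('|', @english), "\n";        # Hello|world|It|s
print scalar(@japanese), "\n";          # 1
```

Note that the apostrophe in "It's" also splits the word, and that "2024" is dropped entirely because digits are not letters.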
