NAME

KSx::Analysis::StripAccents - Remove accents and fold to lowercase

VERSION

0.05 (beta)

SYNOPSIS

    use KSx::Analysis::StripAccents;

    my $stripper = KSx::Analysis::StripAccents->new;

    # $tokenizer and $stemmer are other analyzers, created elsewhere
    my $polyanalyzer = KinoSearch::Analysis::PolyAnalyzer->new(
        analyzers => [ $tokenizer, $stripper, $stemmer ],
    );

DESCRIPTION

This analyser strips accents from its input and converts it to lowercase. Because this may change the length of a token, make sure that this analyser is not used before a tokenizer.
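To illustrate the kind of transformation described above, here is a minimal sketch that uses only core Perl modules. It is not this module's implementation (which relies on Text::Unaccent), and the helper name strip_and_fold is hypothetical; it approximates accent stripping by decomposing characters and discarding the combining marks, so results may differ from the module's for some characters.

```perl
use strict;
use warnings;
use utf8;                            # the source below contains literal UTF-8
use Unicode::Normalize qw(NFD);

# Hypothetical helper, NOT part of KSx::Analysis::StripAccents.
# Assumption: "accents" are approximated as Unicode combining marks.
sub strip_and_fold {
    my ($text) = @_;
    my $decomposed = NFD($text);     # e.g. "é" becomes "e" + combining acute
    $decomposed =~ s/\p{Mn}+//g;     # drop the nonspacing combining marks
    return lc $decomposed;           # fold to lowercase
}

binmode STDOUT, ':encoding(UTF-8)';
for my $token ('Café', 'NAÏVE', 'Ångström') {
    printf "%s -> %s\n", $token, strip_and_fold($token);
}
```

With this sketch the three sample tokens come out as "cafe", "naive" and "angstrom".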

CONSTRUCTOR

new

Construct a new accent-stripping analyser.

PREREQUISITES

This module requires perl and the following modules, which you can get from the CPAN:

Text::Unaccent

KinoSearch 0.2 or later

AUTHOR & COPYRIGHT

Copyright (C) Father Chrysostomos

This program is free software; you may redistribute or modify it (or both) under the same terms as perl.

SEE ALSO

KinoSearch::Analysis::Analyzer (the base class)

KinoSearch::Analysis::LCNormalizer (on which this module was based, and for which it is intended as a drop-in replacement)

KinoSearch::Analysis::CaseFolder (the name LCNormalizer has been given in the dev branch of KinoSearch)

KinoSearch
