KSx::Analysis::StripAccents - Remove accents and fold to lowercase
0.05 (beta)
    my $stripper = KSx::Analysis::StripAccents->new;

    my $polyanalyzer = KinoSearch::Analysis::PolyAnalyzer->new(
        analyzers => [ $tokenizer, $stripper, $stemmer ],
    );
This analyser removes accents from its input and converts it to lowercase. Doing so may change the length of a token, so make sure that this analyser is not used before a tokenizer.
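To illustrate why a token's length can change, here is a minimal sketch that uses only core Perl modules. It is not this module's implementation (the module relies on Text::Unaccent, listed in the prerequisites). The helper name strip_and_fold is hypothetical, and the approach of decomposing with Unicode::Normalize and then dropping combining marks is an assumption chosen for the demonstration.

```perl
use strict;
use warnings;
use Encode qw(encode);              # core module
use Unicode::Normalize qw(NFD);     # core module

# Hypothetical helper, not part of KSx::Analysis::StripAccents.
sub strip_and_fold {
    my ($text) = @_;
    my $decomposed = NFD($text);    # split base letters from combining marks
    $decomposed =~ s/\p{Mn}+//g;    # drop the combining (non-spacing) marks
    return lc $decomposed;          # fold to lowercase
}

my $token  = "Cr\x{e8}me";          # "Creme" with a grave accent on the first e
my $folded = strip_and_fold($token);

# Compare the UTF-8 encoded lengths before and after.
printf "%s: %d bytes -> %d bytes\n",
    $folded,
    length( encode( 'UTF-8', $token ) ),
    length( encode( 'UTF-8', $folded ) );
```

In this sketch the accented token takes more bytes in UTF-8 than the folded one, so any positions computed from the original text would no longer match the token's new length. That is one reason to run the tokenizer first.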
Construct a new accent-stripping analyser.
This module requires perl and the following modules, which you can get from the CPAN:
Text::Unaccent
KinoSearch 0.2 or later
Copyright (C) Father Chrysostomos
This program is free software; you may redistribute or modify it (or both) under the same terms as perl.
KinoSearch::Analysis::Analyzer (the base class)
KinoSearch::Analysis::LCNormalizer (on which this module is based, and for which it is intended as a drop-in replacement)
KinoSearch::Analysis::CaseFolder (the new name of LCNormalizer in the dev branch of KinoSearch)
KinoSearch