Lingua::Stem::UniNE::DE - German stemmer
This document describes Lingua::Stem::UniNE::DE v0.07.
    use Lingua::Stem::UniNE::DE qw( stem_de );

    my $stem = stem_de($word);

    # alternate syntax
    $stem = Lingua::Stem::UniNE::DE::stem($word);
Light and aggressive stemmers for the German language. The light stemmer removes plural endings and umlauts. The aggressive stemmer also removes inflectional suffixes and additional diacritics.
This module provides the stem and stem_de functions for the light stemmer, which are synonymous and can optionally be exported, plus the stem_aggressive and stem_de_aggressive functions for the aggressive stemmer. Each function accepts a single word and returns a single stem.
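As a minimal sketch of how the two stemmers might be compared, the following script runs the singular/plural pairs from the Savoy quotation below through both functions and prints the results side by side. The word list and output layout are illustrative only, and the aggressive function is called by its fully qualified name so the sketch makes no assumption about its export behavior.

```perl
use strict;
use warnings;
use utf8;                               # the source contains literal umlauts
use open qw( :std :encoding(UTF-8) );   # print UTF-8 to STDOUT

use Lingua::Stem::UniNE::DE qw( stem_de );

# Illustrative input: singular/plural pairs mentioned in the Savoy quotation.
my @words = qw( Frau Frauen Bild Bilder Sohn Söhne Apfel Äpfel );

for my $word (@words) {
    # Light stemmer, imported by name.
    my $light = stem_de($word);

    # Aggressive stemmer, called by its full package name.
    my $aggressive = Lingua::Stem::UniNE::DE::stem_de_aggressive($word);

    printf "%-8s light: %-8s aggressive: %s\n", $word, $light, $aggressive;
}
```

No expected output is shown here; running the script against the installed version is the reliable way to see which stems each variant produces.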
“In proposing stemmers for other languages than English, we think that a ‘light’ stemmer (removing inflections only for noun and adjectives) presents some advantages. […] In German, a few rules may be applied to obtain the plural form of words (e.g., ‘Frau’ into ‘Frauen’ (woman), ‘Bild’ into ‘Bilder’ (picture), ‘Sohn’ into ‘Söhne’ (son), ‘Apfel’ into ‘Äpfel’ (apple)), but the suggested algorithms do not account for person and tense variations, or for the morphological variations used by verbs (we think that indexing verbs for Italian, French or German is not of primary importance compared to nouns and adjectives).” —Jacques Savoy, IR Multilingual Resources at UniNE
“For the German corpus, Porter’s stemmer provided better retrieval performance than did the UniNE scheme (average difference of 3.7% over nine IR models). The difference between these two stemming schemes however was never statistically significant.” —Jacques Savoy, Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages
Lingua::Stem::UniNE provides a stemming object with access to all of the implemented University of Neuchâtel stemmers including this one. It has additional features like stemming lists of words.
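For stemming a list of words, a sketch along these lines may be useful. It assumes the object interface described in the Lingua::Stem::UniNE documentation (a constructor taking a language code and a stem method that accepts a list), so check that module's own documentation before relying on it. The word list is illustrative.

```perl
use strict;
use warnings;
use utf8;                               # the source contains literal umlauts
use open qw( :std :encoding(UTF-8) );   # print UTF-8 to STDOUT

use Lingua::Stem::UniNE;

# Assumption: 'de' selects the German stemmer described in this document.
my $stemmer = Lingua::Stem::UniNE->new( language => 'de' );

# Illustrative input words.
my @words = qw( frauen bilder söhne äpfel );

# In list context, stem is assumed to return one stem per input word.
my @stems = $stemmer->stem(@words);

print "$words[$_] -> $stems[$_]\n" for 0 .. $#words;
```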
Lingua::Stem::Any provides a unified interface to any stemmer on CPAN, including this one, as well as additional features like normalization, casefolding, and in-place stemming.
Nick Patch <email@example.com>
© 2014 Shutterstock, Inc.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.