NAME
Lingua::Stem::Patch - Patch stemmers for Esperanto and Ido
VERSION
This document describes Lingua::Stem::Patch v0.03.
SYNOPSIS
use Lingua::Stem::Patch;
# create Esperanto stemmer
$stemmer = Lingua::Stem::Patch->new(language => 'eo');
# get stem for word
$stem = $stemmer->stem($word);
# get list of stems for list of words
@stems = $stemmer->stem(@words);
DESCRIPTION
This module contains a collection of stemmers for multiple languages
using the Patch stemming algorithms. The languages currently implemented
are Esperanto and Ido.
This is a new project under active development and the current stemming
algorithms are likely to change.
Attributes
language
The following language codes are currently supported.
┌───────────┬────┐
│ Esperanto │ eo │
│ Ido │ io │
└───────────┴────┘
They are in the two-letter ISO 639-1 format and are case-insensitive
but are always returned in lowercase when requested.
# instantiate a stemmer object
$stemmer = Lingua::Stem::Patch->new(language => $language);
# get current language
$language = $stemmer->language;
# change language
$stemmer->language($language);
aggressive
By default a light stemmer will be used, but when "aggressive" is
set to true, an aggressive stemmer will be used instead.
$stemmer->aggressive(1);
Methods
stem
Accepts a list of words, stems each word, and returns a list of
stems. The list returned will always have the same number of
elements in the same order as the list provided. When no stemming
rules apply to a word, the original word is returned.
@stems = $stemmer->stem(@words);
# get the stem for a single word
$stem = $stemmer->stem($word);
The words should be provided as character strings and the stems are
returned as character strings. Byte strings in arbitrary character
encodings are intentionally not supported.
languages
Returns a list of supported two-letter language codes using
lowercase letters.
# object method
@languages = $stemmer->languages;
# class method
@languages = Lingua::Stem::Patch->languages;
SEE ALSO
Lingua::Stem::Any provides a unified interface to any stemmer on CPAN,
including this module, as well as additional features like
normalization, casefolding, and in-place stemming.
AUTHOR
Nick Patch <patch@cpan.org>
COPYRIGHT AND LICENSE
© 2014 Nick Patch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.