
NAME

Lingua::RU::Antimat - Perl module for removing Russian slang from chats, guestbooks, etc.

SYNOPSIS

use POSIX qw(locale_h);

use Lingua::RU::Antimat;

use locale;

setlocale(LC_CTYPE,"ru_RU.CP1251");

$dirty_text='text with slang';

$mat= Lingua::RU::Antimat->new;

#load dictionary with additional words

$mat->load_dict('/home/www/badwords');

$mat->set_bip('Sorry!');

$clean_text=$mat->remove_slang($dirty_text);

RUSSIAN DOCUMENTATION

Detailed documentation and a tutorial in Russian are available at http://www.tcen.ru/antimat

DESCRIPTION

This module removes Russian slang from a string. 'Mat' is the Russian name for such bad words, which is why this module is called Antimat.

$mat=Lingua::RU::Antimat->new($codepage);

This method creates a new object and returns it. If new() is called without any arguments, the module uses templates for text in the win-1251 encoding. If your text is in the KOI8-R encoding, set $codepage to 'koi8'.

Examples:

$mat=Lingua::RU::Antimat->new; #for text in win-1251

$mat=Lingua::RU::Antimat->new('koi8'); #for text in KOI8-R

$clean_text=$mat->remove_slang($dirty_text);

The remove_slang method takes a string and returns a string in which every bad word is replaced with the Russian equivalent of 'bip', or with the string you set with the set_bip method, which is described below.

$badwords=$mat->detect_slang($dirty_text);

The detect_slang method takes a string and returns a boolean value: 1 if the string contains a bad word, and 0 if it does not.

$mat->set_bip($bip);

Sets the string (usually a single word) that the remove_slang method substitutes for bad words.

Examples:

$mat->set_bip(''); #strip slang out entirely

$mat->set_bip('I am sorry!'); #long, but also valid

$mat->load_dict($file);

This method loads a dictionary of additional bad words. Each line in the dictionary should be a word or a regular expression. $file can be a relative or an absolute path to the dictionary.
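Because dictionary entries may be regular expressions, it can help to check a dictionary file before handing it to load_dict. The following is a minimal sketch, not part of Lingua::RU::Antimat: the helper invalid_dict_lines is hypothetical, and it assumes one entry per line and that empty lines can be ignored. How the module itself treats a malformed entry is not documented here. The sketch only reports the lines that Perl cannot compile as a regular expression.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (NOT part of Lingua::RU::Antimat).
# Returns [line_number, text] pairs for dictionary lines that do not
# compile as a Perl regular expression.
# Assumptions: one entry per line; empty lines are skipped.
sub invalid_dict_lines {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my @bad;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        # A plain word is also a valid regex, so only broken patterns fail.
        my $ok = eval { my $re = qr/$line/; 1 };
        push @bad, [$., $line] unless $ok;
    }
    close $fh;
    return @bad;
}

# Sample dictionary: a plain word, a valid regex, and a broken regex.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "badword\n", "b[a\@]dw+ord\n", "oops(\n";
close $out;

my @bad = invalid_dict_lines($path);
print "line $_->[0]: $_->[1]\n" for @bad;
```

If the check reports nothing, the same path can then be passed to $mat->load_dict($file) as described above.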

SEE ALSO

Detailed documentation in Russian: http://www.tcen.ru/antimat

perllocale manpage

CREDITS

Andrey Skorohod, marlenus@marlenus.com, for his bug reports.

Vladimir Zhdanov, vovka@lg.kamaz.net, for his bug report.

Andrey Sharapov, Sharapov@tut.by, for his suggestions.

Yury Voloshin, xtc@norilsk.net, for his bug report and suggestions.

Thanks!

AUTHOR

Ilya Soldatkin, arc@tcen.ru

Drop me a line if you deploy this module on your site. Think of it as a small contribution to my efforts in writing and supporting this module. I cannot improve this module if I believe that no one uses it.

COPYRIGHT

Copyright 2001-2003 Ilya Soldatkin. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.