
Module Version: 0.02

NAME

Lingua::RU::Charset - Perl extension for detecting and converting various Russian character sets: KOI8-R, Windows-1251, CP866, ISO-8859-5, X-Mac-Cyrillic, Russian text in English letters, and the Russian part of Unicode and UTF-8. This module can be especially useful on computers with broken Cyrillic locales (such as foreign web hosts).

SYNOPSIS

  use Lingua::RU::Charset qw (:CHARSET);
  use Lingua::RU::Charset qw (:CONVERT);
  use Lingua::RU::Charset qw (:CONVERT :CHARCASE);
  use Lingua::RU::Charset qw (any2koi koi2lc koi2uc);

DESCRIPTION

More documentation and examples coming soon...

NOTE

Unfortunately, I don't have time to implement the Unicode and UTF-8 subroutines. I am sure, though, that such functions would be useful for Perl scripts that exchange Russian data with Java servlets, so you are welcome to submit some code!
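
Until such subroutines exist, a script can fall back on Perl's core Encode module for the UTF-8 side. The following is only a minimal sketch and not part of Lingua::RU::Charset: the helper names koi_to_utf8, utf8_to_koi and koi_to_win are hypothetical, and the charset names are the ones Encode itself understands.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helpers (not provided by Lingua::RU::Charset).
# Each one decodes the input bytes into Perl's internal Unicode
# string and then encodes that string into the target charset.

sub koi_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('koi8-r', $bytes));
}

sub utf8_to_koi {
    my ($bytes) = @_;
    return encode('koi8-r', decode('UTF-8', $bytes));
}

sub koi_to_win {
    my ($bytes) = @_;
    return encode('cp1251', decode('koi8-r', $bytes));
}

# The Russian word for "hello", written as raw KOI8-R bytes.
my $koi  = "\xF0\xD2\xC9\xD7\xC5\xD4";
my $utf8 = koi_to_utf8($koi);

# Show the resulting UTF-8 bytes as hex, so the output stays ASCII.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
```

Going through the decoded Unicode string in the middle means any pair of charsets known to Encode can be converted with the same two calls.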

AUTHOR

Alex Farber, <alex@kawo2.rwth-aachen.de>

SEE ALSO

"The Cyrillic Charset Soup", an article by Roman Czyborra at http://czyborra.com/charsets/cyrillic.html , lists various Cyrillic charsets.

The Russian texts used for counting the frequencies of letter pairs were taken from "The Eugene Peskin's Electronic Library" at http://www.online.ru/sp/rel/russian/

Please also consider visiting my home page at http://simplex.ru/news/ where I collect links to articles and news about Perl, Python, JavaScript, databases etc.
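
To illustrate how letter-pair frequencies can be used for charset detection, here is a rough sketch of the general idea. It is an assumption about the technique, not the code of this module: the names guess_charset and score_text, the list of candidate charsets and the very short list of common pairs are all hypothetical, and a real table would be built from a large body of Russian text.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Candidate charsets, using the names known to the core Encode module.
my @CANDIDATES = ('koi8-r', 'cp1251', 'cp866', 'iso-8859-5');

# A tiny, hand-picked list of frequent lowercase Russian letter pairs,
# written as Unicode escapes so this source file stays plain ASCII.
# (Illustrative only; not the table used by Lingua::RU::Charset.)
my @COMMON_PAIRS = (
    "\x{0441}\x{0442}",    # s + t
    "\x{0442}\x{043E}",    # t + o
    "\x{043D}\x{043E}",    # n + o
    "\x{043D}\x{0430}",    # n + a
    "\x{0435}\x{043D}",    # e + n
    "\x{043E}\x{0432}",    # o + v
    "\x{043D}\x{0438}",    # n + i
    "\x{0440}\x{0430}",    # r + a
    "\x{043A}\x{043E}",    # k + o
    "\x{043F}\x{0440}",    # p + r
    "\x{0435}\x{0442}",    # e + t
    "\x{043F}\x{043E}",    # p + o
);

# Count how many times the common pairs occur in a decoded string.
# Only lowercase pairs are counted: a wrong charset often turns
# lowercase letters into uppercase ones, which then score nothing.
sub score_text {
    my ($text) = @_;
    my $score = 0;
    for my $pair (@COMMON_PAIRS) {
        my $hits = () = $text =~ /\Q$pair\E/g;
        $score += $hits;
    }
    return $score;
}

# Decode the bytes under every candidate charset and return the name
# of the one whose result scores highest, or undef if nothing matches.
sub guess_charset {
    my ($bytes) = @_;
    my ($best, $best_score) = (undef, 0);
    for my $charset (@CANDIDATES) {
        my $copy  = $bytes;    # leave the caller's data untouched
        my $score = score_text(decode($charset, $copy));
        ($best, $best_score) = ($charset, $score) if $score > $best_score;
    }
    return $best;
}
```

With so few pairs the guess is only reliable on longer inputs; short or mostly uppercase strings may return undef or the wrong name.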
