NAME
Char::KOI8U - Source code filter for KOI8-U script (Imitation JPerl)
SYNOPSIS
In your script:
use Char::KOI8U; # CPAN formal style
or
use KOI8U; # casual style
At command prompt:
perl yourscript.pl wild* *card and '*quote*' on MSWin32
DESCRIPTION
This software is "JPerl on the Modern Perl", written in Perl.
This software handles KOI8-U directly. Therefore, there is no UTF8 flag.
INSTALLATION BY MAKE (for UNIX like system)
To install this software by make, type the following:
perl Makefile.PL
make
make test
make install
Rename and install strict.pm_ of this distribution to strict.pm if your system
doesn't have strict.pm.
INSTALLATION WITHOUT MAKE (for DOS like system)
To install this software without make, type the following:
perl pMakefile.PL --- pMakefile.PL makes "pmake.bat" only, and ...
pmake.bat
pmake.bat test
pmake.bat install --- install into the Perl currently in use
Rename and install strict.pm_ of this distribution to strict.pm if your system
doesn't have strict.pm.
pmake.bat dist --- make distribution package
pmake.bat ptar.bat --- make perl script "ptar.bat"
DEPENDENCIES
This software requires perl5.00503 or later.
LICENSE AND COPYRIGHT
This software is free software; you can redistribute it and/or
modify it under the same terms as Perl itself. See perlartistic.
This software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
MAINTAINER
This project was originally created by INABA Hitoshi <ina@cpan.org>.
KOI8-U (2011.07.24 12:57:00 JST). In Wikipedia: The Free Encyclopedia.
Retrieved from
http://en.wikipedia.org/wiki/KOI8-U
KOI8-U is an 8-bit character encoding, designed to cover Ukrainian, which
uses the Cyrillic alphabet. It is based on KOI8-R, which covers Russian and
Bulgarian, but replaces eight graphic characters with four Ukrainian letters
GHE WITH UPTURN, UKRAINIAN IE, BYELORUSSIAN-UKRAINIAN I and YI(UKRAINIAN)
in both upper case and lower case.
In Microsoft Windows, KOI8-U is assigned the code page number 21866.
In IBM, KOI8-U is assigned code page 1168.
KOI8 remains much more commonly used than ISO 8859-5, which never really
caught on. Another common Cyrillic character encoding is Windows-1251.
In the future, both may eventually give way to Unicode.
In Russian, KOI8 stands for "Kod Obmena Informatsiey, 8 bit", which means
"Code for Information Exchange, 8 bit".
The KOI8 character sets have the property that the Russian Cyrillic letters
are in pseudo-Roman order rather than the natural Cyrillic alphabetical
order as in ISO 8859-5. Although this may seem unnatural, it has the useful
property that if the 8th bit is stripped, the text can still be read
(or at least deciphered) in case-reversed transliteration on an ordinary
ASCII terminal.
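The bit-stripping property can be illustrated with a short Perl sketch.
The helper name strip_high_bit is hypothetical and is not part of this
distribution; the sample octets are the Russian word "Privet" in KOI8 code
positions, an example chosen here only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Char::KOI8U): clear the 8th bit of
# every octet in a KOI8-U encoded string.
sub strip_high_bit {
    my ($octets) = @_;
    return join '', map { chr(ord($_) & 0x7F) } split //, $octets;
}

# "Privet" in KOI8: capital PE (0xF0), then small ER, I, VE, IE, TE.
my $koi8 = "\xF0\xD2\xC9\xD7\xC5\xD4";

print strip_high_bit($koi8), "\n";   # prints pRIWET
```

The result is readable as a transliteration with the letter case reversed,
because KOI8 places lower-case Cyrillic at 0xC0-0xDF and upper-case at
0xE0-0xFF.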
* ALGORITHM #1
When a character is extracted from an octet string, a single-octet
character must be distinguished from a multiple-octet character.
The distinction is made by the first octet alone.
Single octet code is:
0x00-0xFF
See also code table:
Single octet code
0 1 2 3 4 5 6 7 8 9 A B C D E F
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*| 0x00-0xFF
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
7|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
8|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
9|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
B|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
C|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
D|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
E|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
F|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
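Because every first octet in 0x00-0xFF is a complete character in KOI8-U,
extracting characters reduces to taking one octet at a time. The following
is a minimal sketch of that idea in plain Perl; the name split_chars is
hypothetical, and this is not necessarily how the filter implements it.

```perl
use strict;
use warnings;

# Hypothetical helper: split an octet string into KOI8-U characters.
# Every octet 0x00-0xFF is a single-octet character, so one octet is
# always exactly one character.
sub split_chars {
    my ($octets) = @_;
    my @chars = $octets =~ /([\x00-\xFF])/g;
    return @chars;
}

# 'A', then 0xA4 (small UKRAINIAN IE in KOI8-U), then a newline.
my @chars = split_chars("A\xA4\n");
print scalar(@chars), "\n";   # prints 3
```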
* ALGORITHM #2
In contrast to Algorithm #1, when a character range is specified in tr///,
only the following character codes are valid.
Single octet code is:
0x00-0xFF
See also code table:
Single octet code
0 1 2 3 4 5 6 7 8 9 A B C D E F
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*| 0x00-0xFF
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
7|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
8|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
9|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
B|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
C|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
D|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
E|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
F|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|*|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
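A tr/// range is expanded by code value, so any range with endpoints in
0x00-0xFF is usable. As a hedged illustration in plain Perl (without the
filter loaded), the sketch below maps the KOI8 lower-case block 0xC0-0xDF
onto the upper-case block 0xE0-0xFF. The name koi8_upper_basic is
hypothetical; letters outside that block, such as the four Ukrainian
letters, are deliberately left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case the basic Cyrillic block of a KOI8-U
# octet string using a tr/// range. Both ranges hold 32 codes, so each
# lower-case code maps to the code 0x20 above it.
sub koi8_upper_basic {
    my ($octets) = @_;
    (my $upper = $octets) =~ tr/\xC0-\xDF/\xE0-\xFF/;
    return $upper;
}

my $upper = koi8_upper_basic("\xD0\xD2\xC9\xD7\xC5\xD4");
print join(' ', map { sprintf '%02X', ord($_) } split //, $upper), "\n";
# prints F0 F2 E9 F7 E5 F4
```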
SEE ALSO
perl, KOI8U.pm, Ekoi8u.pm, jacode.pl