NAME

Lingua::ZH::MacChinese::Traditional - transcoding between Mac OS Chinese Traditional encoding and Unicode

SYNOPSIS

(1) using function names exported by default:

    use Lingua::ZH::MacChinese::Traditional;
    $wchar = decodeMacChineseTrad($octet);
    $octet = encodeMacChineseTrad($wchar);

(2) using function names exported on request:

    use Lingua::ZH::MacChinese::Traditional qw(decode encode);
    $wchar = decode($octet);
    $octet = encode($wchar);

(3) using function names fully qualified:

    use Lingua::ZH::MacChinese::Traditional ();
    $wchar = Lingua::ZH::MacChinese::Traditional::decode($octet);
    $octet = Lingua::ZH::MacChinese::Traditional::encode($wchar);

   # $wchar : a string in Perl's Unicode format
   # $octet : a string in Mac OS Chinese Traditional encoding

DESCRIPTION

This module provides transcoding from/to Mac OS Chinese Traditional encoding (denoted MacChineseTrad hereafter).

To ensure round-trip mapping, some MacChineseTrad characters map to a sequence of Unicode characters rather than to a single one, and vice versa. For example, 0x80 (MacChineseTrad) maps from/to 0x005C+0xF87F (Unicode) for "REVERSE SOLIDUS, alternate".

This module provides functions to transcode between MacChineseTrad and Unicode, without information loss for every MacChineseTrad character.
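The round trip described above can be checked with a short sketch. The helper hex_dump is hypothetical (it is not part of the module), and the call is wrapped in an eval only so that the sketch still runs where the module is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): render each character of a
# string as a four-digit hexadecimal code point, separated by spaces.
sub hex_dump {
    return join ' ', map { sprintf('%04X', ord($_)) } split(//, $_[0]);
}

# Assumption: the module is installed; otherwise this part is skipped.
if (eval { require Lingua::ZH::MacChinese::Traditional; 1 }) {
    my $octet = "\x80";    # "REVERSE SOLIDUS, alternate" in MacChineseTrad
    my $wchar = Lingua::ZH::MacChinese::Traditional::decode($octet);

    # Per the description above, this should be the sequence 005C F87F.
    print hex_dump($wchar), "\n";

    # Encoding the decoded string should give back the original octet.
    my $back = Lingua::ZH::MacChinese::Traditional::encode($wchar);
    print $back eq $octet ? "round trip ok\n" : "round trip failed\n";
}
```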

Functions

$wchar = decode($octet)
$wchar = decode($handler, $octet)
$wchar = decodeMacChineseTrad($octet)
$wchar = decodeMacChineseTrad($handler, $octet)

Converts MacChineseTrad to Unicode.

decodeMacChineseTrad() is an alias for decode() exported by default.

If the $handler is not specified, any MacChineseTrad character that is not mapped to Unicode is deleted. If the $handler is a code reference, the string returned from that coderef is inserted there. If the $handler is a scalar reference, the string (a PV) in that reference (the referent) is inserted there.

The 1st argument for the $handler coderef is a string of the unmapped MacChineseTrad character (e.g. "\xFC\xFE").
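As a rough illustration of the three handler forms for decoding, the sketch below passes no handler, a scalar reference, and a code reference in turn. The handler hex_escape is hypothetical (it is not part of the module), the input reuses "\xFC\xFE" from the example above as an unmapped character, and the eval is there only so that the sketch still runs where the module is not installed.

```perl
use strict;
use warnings;

# Hypothetical handler (not part of the module): show each byte of the
# unmapped MacChineseTrad character as a literal \xHH escape.
sub hex_escape {
    my ($unmapped) = @_;    # e.g. "\xFC\xFE"
    return join '', map { sprintf('\\x%02X', ord($_)) } split(//, $unmapped);
}

# Assumption: the module is installed; otherwise this part is skipped.
if (eval { require Lingua::ZH::MacChinese::Traditional; 1 }) {
    my $octet = "ABC\xFC\xFE";

    # No handler: the unmapped character is deleted.
    my $dropped = Lingua::ZH::MacChinese::Traditional::decode($octet);

    # Scalar reference: the referenced string is inserted instead.
    my $marked = Lingua::ZH::MacChinese::Traditional::decode(\"?", $octet);

    # Code reference: the string returned by the handler is inserted instead.
    my $escaped = Lingua::ZH::MacChinese::Traditional::decode(\&hex_escape, $octet);

    print "$_\n" for $dropped, $marked, $escaped;
}
```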

$octet = encode($wchar)
$octet = encode($handler, $wchar)
$octet = encodeMacChineseTrad($wchar)
$octet = encodeMacChineseTrad($handler, $wchar)

Converts Unicode to MacChineseTrad.

encodeMacChineseTrad() is an alias for encode() exported by default.

If the $handler is not specified, any Unicode character that is not mapped to MacChineseTrad is deleted. If the $handler is a code reference, the string returned from that coderef is inserted there. If the $handler is a scalar reference, the string (a PV) in that reference (the referent) is inserted there.

The 1st argument for the $handler coderef is the Unicode code point (unsigned integer) of the unmapped character.

E.g.

   sub hexNCR { sprintf("&#x%x;", shift) } # hexadecimal NCR
   sub decNCR { sprintf("&#%d;" , shift) } # decimal NCR

   print encodeMacChineseTrad("ABC\x{100}\x{10000}");
   # "ABC"

   print encodeMacChineseTrad(\"", "ABC\x{100}\x{10000}");
   # "ABC"

   print encodeMacChineseTrad(\"?", "ABC\x{100}\x{10000}");
   # "ABC??"

   print encodeMacChineseTrad(\&hexNCR, "ABC\x{100}\x{10000}");
   # "ABC&#x100;&#x10000;"

   print encodeMacChineseTrad(\&decNCR, "ABC\x{100}\x{10000}");
   # "ABC&#256;&#65536;"

CAVEAT

Sorry, the author does not work on Mac OS. Please let him know if you find anything wrong.

AUTHOR

SADAHIRO Tomoyuki <SADAHIRO@cpan.org>

Copyright(C) 2003-2007, SADAHIRO Tomoyuki. Japan. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Map (external version) from Mac OS Chinese Traditional encoding to Unicode 2.1 and later (version: c02 2005-Apr-04)

http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINTRAD.TXT

Registry (external version) of Apple use of Unicode corporate-zone characters (version: c03 2005-Apr-04)

http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CORPCHAR.TXT