NAME

Lingua::ZH::HanDetect - Guess Chinese text's variant and encoding

VERSION

This document describes version 0.04 of Lingua::ZH::HanDetect, released June 27, 2003.

SYNOPSIS

    use Lingua::ZH::HanDetect;

    # $encoding is 'big5-hkscs', 'big5', 'gbk', 'euc-cn', 'utf8' or ''
    # $variant  is 'traditional', 'simplified' or ''
    my ($encoding, $variant) = han_detect($some_chinese_text);

DESCRIPTION

Lingua::ZH::HanDetect uses statistical measures to guess whether a text string is written in Traditional or Simplified Chinese, and which encoding it uses.

If the string does not contain Chinese characters, both the encoding and variant values will be set to the empty string.
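A common next step is to decode the input once its encoding has been guessed. The following is a minimal sketch, not part of the module: decode_han() is a hypothetical helper, and it assumes that the core Encode module accepts each encoding name listed in the SYNOPSIS.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Lingua::ZH::HanDetect;    # provides han_detect(), as in the SYNOPSIS

# decode_han() is a hypothetical helper, not part of Lingua::ZH::HanDetect.
# It returns a Perl character string, or nothing (undef in scalar context)
# when no Chinese text is detected.
sub decode_han {
    my ($bytes) = @_;
    my ($encoding, $variant) = han_detect($bytes);

    # Both values are '' when the input contains no Chinese characters.
    return if $encoding eq '';

    # Assumption: Encode recognizes the name that han_detect() returned.
    return decode($encoding, $bytes);
}
```

Because the result is a statistical guess, callers may still want to validate the decoded text before relying on it.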

This module is needed because the various encodings for Chinese text tend to occupy similar byte ranges, which renders Encode::Guess ineffective.
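The overlap can be illustrated with the core Encode module alone. The sketch below does not use Lingua::ZH::HanDetect; it encodes two Chinese characters as Big5 and then checks whether the same bytes are also accepted as EUC-CN. The sample text and variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+4E2D U+6587, written as escapes so this source stays plain ASCII.
my $text  = "\x{4E2D}\x{6587}";
my $bytes = encode('big5', $text);

# FB_CROAK makes decode() die on malformed input. A copy is passed
# because decode() may modify its argument when a CHECK value is given.
my $as_gb = eval {
    my $copy = $bytes;
    decode('euc-cn', $copy, Encode::FB_CROAK);
};

printf "Big5 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $bytes;
print defined $as_gb
    ? "The same bytes also decode as EUC-CN without error\n"
    : "EUC-CN rejected these bytes\n";
```

When the second decode succeeds, validity checks alone cannot tell the two encodings apart, which is why a statistical approach is used instead.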

SEE ALSO

Encode::HanDetect

AUTHORS

Autrijus Tang <autrijus@autrijus.org>

COPYRIGHT

Copyright 2003 by Autrijus Tang <autrijus@autrijus.org>.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See http://www.perl.com/perl/misc/Artistic.html
