Data::HanConvert - Data for converting between Traditional and Simplified Chinese.
This distribution contains no code, only data to be used by other programs. The data is split into 4 modules that need to be required separately.
    use Data::HanConvert::cn2tw;
    use Data::HanConvert::cn2tw_characters;
    use Data::HanConvert::tw2cn;
    use Data::HanConvert::tw2cn_characters;
Once required, the corresponding hashrefs are available:
    $Data::HanConvert::cn2tw
    $Data::HanConvert::cn2tw_characters
    $Data::HanConvert::tw2cn
    $Data::HanConvert::tw2cn_characters
The ones named with the "_characters" suffix contain only character-to-character mappings, while the others contain only phrase-to-phrase mappings. The mappings are split into different files because the phrase mappings are significantly larger and may not be needed, depending on the use case.
Note that this data set is meant for conversion. The phrase data does not necessarily contain only valid dictionary phrases; it may also contain arbitrary long n-grams that exist solely for disambiguation. Users are encouraged to review the data set before using it for other purposes.
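As a rough illustration of how a "_characters" hashref might be used, here is a minimal sketch. It assumes the hashref maps one decoded character to its replacement string. The helper `convert_characters` and the tiny `$sample_map` are hypothetical and not part of this distribution; with the distribution installed, `$Data::HanConvert::cn2tw_characters` would be passed instead.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not provided by Data::HanConvert): convert a string
# one character at a time. Characters without an entry pass through unchanged.
sub convert_characters {
    my ($text, $map) = @_;
    return join '', map { exists $map->{$_} ? $map->{$_} : $_ } split //, $text;
}

# Stand-in table for illustration only. In real use, pass
# $Data::HanConvert::cn2tw_characters after requiring that module.
my $sample_map = { '汉' => '漢', '语' => '語' };

binmode STDOUT, ':encoding(UTF-8)';
print convert_characters('汉语 abc', $sample_map), "\n";
```

Character-only conversion ignores context, so it is best suited to cases where loading the larger phrase tables is not worthwhile.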
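One plausible way to combine the two kinds of tables is greedy longest-match: try the longest phrase at the current position first, then fall back to the character table. This is only a sketch of that general technique, not an algorithm prescribed by this distribution. The helper `convert_with_phrases` and the sample tables are hypothetical stand-ins for the real hashrefs.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not provided by Data::HanConvert): greedy
# longest-match conversion with a character-level fallback.
sub convert_with_phrases {
    my ($text, $phrases, $characters) = @_;

    # The longest phrase key bounds how far ahead we need to look.
    # For a large table this scan should be done once and cached.
    my $max_len = 0;
    for my $key (keys %$phrases) {
        $max_len = length $key if length $key > $max_len;
    }

    my ($out, $pos, $n) = ('', 0, length $text);
    POSITION: while ($pos < $n) {
        my $limit = $n - $pos < $max_len ? $n - $pos : $max_len;
        # Try the longest candidate first, then shorter ones.
        for (my $len = $limit; $len >= 1; $len--) {
            my $chunk = substr $text, $pos, $len;
            if (exists $phrases->{$chunk}) {
                $out .= $phrases->{$chunk};
                $pos += $len;
                next POSITION;
            }
        }
        # No phrase matched here: convert a single character instead.
        my $char = substr $text, $pos, 1;
        $out .= exists $characters->{$char} ? $characters->{$char} : $char;
        $pos++;
    }
    return $out;
}

# Stand-in tables for illustration only. In real use, pass
# $Data::HanConvert::cn2tw and $Data::HanConvert::cn2tw_characters.
my $sample_phrases    = { '头发' => '頭髮' };
my $sample_characters = { '发' => '發', '现' => '現', '头' => '頭' };

binmode STDOUT, ':encoding(UTF-8)';
print convert_with_phrases('发现头发', $sample_phrases, $sample_characters), "\n";
```

In the sample, the same source character is converted differently depending on whether it falls inside a matched phrase, which is the kind of disambiguation the phrase data exists for.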
The original data collection work from Encode::HanConvert
The PHP builder
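src/hanconvert.txt: each row represents one mapping and must have two columns. The first column is Traditional Chinese and the second column is Simplified Chinese. Columns are separated by at least one space (SPC, 0x20) or tab (TAB, 0x09).
Rows that begin with the # character are ignored and do not count as part of the mapping table. Editors can use this to add comments to the file.
hanconvert.txt should allow a single term to have multiple mappings. Anyone writing a program to process it should be aware of this and choose a handling strategy suited to the situation.
If you need edit access, please tell @gugod your GitHub account name.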
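The file format described above can be read with a few lines of Perl. The following is a minimal sketch, not the builder used by this project. The helper `parse_hanconvert` is hypothetical, and it makes two assumptions of its own: every mapping for a term is kept in a list, and rows that do not have exactly two columns are silently skipped.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: parse text in the two-column format and return a
# hashref from each first-column entry to an arrayref of all second-column
# entries seen for it, so multiple mappings are preserved.
sub parse_hanconvert {
    my ($content) = @_;
    my %table;
    for my $line (split /\n/, $content) {
        $line =~ s/\r$//;                    # tolerate CRLF line endings
        next if $line =~ /^#/;               # comment row
        my @fields = split /[ \t]+/, $line;  # one or more spaces or tabs
        next unless @fields == 2;            # assumption: skip malformed rows
        push @{ $table{ $fields[0] } }, $fields[1];
    }
    return \%table;
}

# Made-up sample content. A real program would read src/hanconvert.txt
# through a UTF-8 decoding layer and pass its content here.
my $sample = "# a comment row\n著\t着\n著 著\n";
my $table  = parse_hanconvert($sample);

binmode STDOUT, ':encoding(UTF-8)';
for my $term (sort keys %$table) {
    print "$term => ", join(', ', @{ $table->{$term} }), "\n";
}
```

Keeping a list per term leaves the choice between alternatives to the caller, for example taking the first entry or using surrounding context.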
This work is CC0.
To the extent possible under law, Kang-min Liu has waived all copyright and related or neighboring rights to Data::HanConvert. This work is published from: Taiwan.