
Lingua::ZH::Summarize - Summarizing bodies of Chinese text

use Lingua::ZH::Summarize;
print summarize( $text ); # Easy, no? :-)
print summarize( $text, maxlength => 500 ); # 500-byte summary
print summarize( $text, wrap => 75 ); # Wrap output to 75 col.

This is a simple module that makes an unscientific effort at summarizing Chinese text. It recognizes simple patterns that look like statements, abridges them, and concatenates them into something vaguely resembling a summary. It still needs work on large bodies of text, but it already gives decent results on small inputs.
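To make the general idea concrete, here is a minimal, self-contained sketch. It is not the module's actual implementation: it swaps in a much cruder technique, splitting on sentence-ending punctuation and concatenating whole sentences until a byte budget is reached, and it does no abridging. The helper name naive_summary, the punctuation set, and the assumption that input is a decoded string measured as UTF-8 bytes are all hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains UTF-8 literals
use Encode qw(encode_utf8);

# Hypothetical helper, NOT part of Lingua::ZH::Summarize.
# Splits decoded text into sentences at 。！？ and concatenates whole
# sentences for as long as the result fits in $max_bytes UTF-8 bytes.
sub naive_summary {
    my ($text, $max_bytes) = @_;

    # Each match is a run of non-terminators followed by one terminator.
    my @sentences = $text =~ /([^。！？]+[。！？])/g;

    my $summary = '';
    for my $sentence (@sentences) {
        $sentence =~ s/^\s+//;   # drop whitespace left between sentences
        # Stop at the first sentence that would exceed the byte budget.
        last if length(encode_utf8($summary . $sentence)) > $max_bytes;
        $summary .= $sentence;
    }
    return $summary;
}

binmode STDOUT, ':encoding(UTF-8)';
my $text = '今天天氣很好。我們去公園散步！你要一起來嗎？';
print naive_summary($text, 45), "\n";
```

With a 45-byte budget, the first two sentences (21 and 24 bytes in UTF-8) fit exactly and the third is dropped. Cutting only at sentence boundaries keeps the output from ending in the middle of a multi-byte character, which matters whenever a length limit is expressed in bytes.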
Lingua::ZH::Summarize exports one function, summarize(), which takes the text to summarize as its first argument, followed by any number of optional directives in name => value form. The options it accepts are:
maxlength: Specifies the maximum length, in bytes, of the generated summary.
wrap: Wraps the summary output to the number of columns you specify. This requires the Lingua::ZH::Wrap module.
Needless to say, this is a very simple scheme that will not work well on every text, but it's good enough for a first draft, and I'll bang on it more later. As noted above, it's not a scientific approach to the problem, but it's better than nothing.

Lingua::ZH::Toke, Lingua::ZH::Wrap, Lingua::EN::Summarize

Algorithm adapted from the Lingua::EN::Summarize module by Dennis Taylor, <dennis@funkplanet.com>.

Autrijus Tang <autrijus@autrijus.org>

Copyright 2003 by Autrijus Tang <autrijus@autrijus.org>.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.