Lingua::ZH::Summarize - Summarizing bodies of Chinese text
    use Lingua::ZH::Summarize;

    print summarize( $text );                    # Easy, no? :-)
    print summarize( $text, maxlength => 500 );  # 500-byte summary
    print summarize( $text, wrap => 75 );        # Wrap output to 75 col.
This is a simple module which makes an unscientific effort at summarizing Chinese text. It recognizes simple patterns which look like statements, abridges them, and concatenates them into something vaguely resembling a summary. It needs more work on large bodies of text, but it seems to have a decent effect on small inputs at the moment.
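To make the idea concrete, here is a minimal, self-contained sketch of the general approach (pick out statement-like chunks, then concatenate them within a length budget). It is not the module's actual code: the helper name naive_summarize is hypothetical, it assumes UTF-8 input, it treats any run of text ending in 。, ！ or ？ as a "statement", and it skips the abridging step entirely.

```perl
use strict;
use warnings;
use utf8;                          # string literals below are UTF-8
use Encode qw(encode_utf8);

# Hypothetical helper, NOT part of Lingua::ZH::Summarize.
# Splits the text at Chinese sentence-final punctuation and keeps
# whole sentences, in order, until the byte budget would be exceeded.
sub naive_summarize {
    my ($text, %opt) = @_;
    my $max = $opt{maxlength} || 500;    # budget in UTF-8 bytes (assumed default)

    # Assumption: a "statement" is a run of non-terminator characters
    # followed by one full stop, exclamation mark, or question mark.
    my @sentences = $text =~ /([^。！？]+[。！？])/g;

    my $summary = '';
    for my $s (@sentences) {
        $s =~ s/^\s+//;                  # drop leading whitespace
        # Stop at the first sentence that would overflow the budget.
        last if length(encode_utf8($summary . $s)) > $max;
        $summary .= $s;
    }
    return $summary;
}

my $text = '今天天氣很好。我們去公園散步！你想一起來嗎？';
binmode STDOUT, ':encoding(UTF-8)';
print naive_summarize($text, maxlength => 45), "\n";
```

With a 45-byte budget the first two sentences (21 and 24 bytes in UTF-8) fit exactly and the third is dropped. Cutting only at sentence boundaries keeps the output readable, at the cost of sometimes returning noticeably less than the budget allows.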
Lingua::ZH::Summarize exports one function, summarize(), which takes the text to summarize as its first argument, and any number of optional directives in name => value form. The options it'll take are:
maxlength
    Specifies the maximum length, in bytes, of the generated summary.

wrap
    Pretty-prints the summary output by wrapping it to the number of columns you specify. This requires the Lingua::ZH::Wrap module.
Needless to say, this is a very simple and not terribly universally effective scheme, but it's good enough for a first draft, and I'll bang on it more later. Like I said, it's not a scientific approach to the problem, but it's better than nothing.
Lingua::ZH::Toke, Lingua::ZH::Wrap, Lingua::EN::Summarize
Algorithm adapted from the Lingua::EN::Summarize module by Dennis Taylor, <dennis@funkplanet.com>.
Autrijus Tang <autrijus@autrijus.org>
Copyright 2003 by Autrijus Tang <autrijus@autrijus.org>.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html