Module Version: 0.01

NAME

Lingua::ZH::Summarize - Summarizing bodies of Chinese text

SYNOPSIS

    use Lingua::ZH::Summarize;

    print summarize( $text );                    # Easy, no? :-)
    print summarize( $text, maxlength => 500 );  # 500-byte summary
    print summarize( $text, wrap => 75 );        # Wrap output to 75 col.

DESCRIPTION

This is a simple module which makes an unscientific effort at summarizing Chinese text. It recognizes simple patterns which look like statements, abridges them, and concatenates them into something vaguely resembling a summary. It needs more work on large bodies of text, but it seems to have a decent effect on small inputs at the moment.
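To make the "recognize, abridge, concatenate" idea concrete, here is a toy sketch using only core Perl. It is not the code inside Lingua::ZH::Summarize, and the rules in it are assumptions made for illustration: the name toy_summarize is hypothetical, a "statement" is taken to be any sentence containing the copula 是, abridging just means dropping full-width parenthetical asides, and the byte budget is counted in UTF-8.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Encode qw(encode_utf8);

# Toy illustration only; NOT the algorithm used by Lingua::ZH::Summarize.
# Assumed rule: a "statement" is a sentence containing the copula 是.
sub toy_summarize {
    my ($text, %opt) = @_;
    my $max = $opt{maxlength};     # byte budget; undef means no limit

    # Split into sentences, keeping the closing 。 ！ or ？ with each one.
    my @sentences = $text =~ /([^。！？]+[。！？])/g;

    my $summary = '';
    for my $s (@sentences) {
        $s =~ s/^\s+//;            # trim leading whitespace
        next unless $s =~ /是/;    # keep only statement-like sentences
        $s =~ s/（[^）]*）//g;      # abridge: drop full-width parentheticals

        # Stop before the summary would exceed the byte budget.
        my $candidate = $summary . $s;
        last if defined $max && length(encode_utf8($candidate)) > $max;
        $summary = $candidate;
    }
    return $summary;
}

my $text = "北京是中國的首都。今天天氣很好！熊貓是一種哺乳動物（以竹子為食）。";

binmode STDOUT, ':encoding(UTF-8)';
print toy_summarize($text), "\n";
print toy_summarize($text, maxlength => 30), "\n";
```

With no limit, the sketch keeps the two 是 sentences and strips the parenthetical from the second. With a 30-byte budget only the first sentence fits, since each of these characters takes three bytes in UTF-8.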

Lingua::ZH::Summarize exports one function, summarize(), which takes the text to summarize as its first argument, and any number of optional directives in name => value form. The options it'll take are:

maxlength

Specifies the maximum length, in bytes, of the generated summary.

wrap

Prettyprints the summary output by wrapping it to the number of columns which you specify. This requires the Lingua::ZH::Wrap module.

Needless to say, this is a very simple scheme that won't work well on every kind of text, but it's good enough for a first draft, and I'll bang on it more later. Like I said, it's not a scientific approach to the problem, but it's better than nothing.

SEE ALSO

Lingua::ZH::Toke, Lingua::ZH::Wrap, Lingua::EN::Summarize

ACKNOWLEDGEMENTS

Algorithm adapted from the Lingua::EN::Summarize module by Dennis Taylor, <dennis@funkplanet.com>.

AUTHORS

Autrijus Tang <autrijus@autrijus.org>

COPYRIGHT

Copyright 2003 by Autrijus Tang <autrijus@autrijus.org>.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See http://www.perl.com/perl/misc/Artistic.html
