Module Version: 0.006

NAME

Text::Sentence - module for splitting text into sentences

SYNOPSIS

    use Text::Sentence qw( split_sentences );
    use locale;
    use POSIX qw( locale_h );

    setlocale( LC_CTYPE, 'iso_8859_1' );
    @sentences = split_sentences( $text );

DESCRIPTION

The Text::Sentence module contains the function split_sentences, which splits text into its constituent sentences using an approximate regex. If you set the locale before calling it, it uses locale-dependent capitalization to identify sentence boundaries. Certain well-known exceptions, such as abbreviations, may cause incorrect segmentation.
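
To illustrate why abbreviations trip up a regex-based approach, here is a minimal standalone sketch. It is not the regex that Text::Sentence actually uses; naive_split is a hypothetical helper that assumes a sentence ends at ".", "!" or "?" followed by whitespace and an uppercase letter.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::Sentence: split wherever
# sentence-final punctuation is followed by whitespace and an
# uppercase letter. The lookarounds keep the punctuation and the
# capital letter in the resulting pieces.
sub naive_split {
    my ($text) = @_;
    return grep { length } split /(?<=[.!?])\s+(?=[[:upper:]])/, $text;
}

# Two ordinary sentences are separated as expected.
my @ok = naive_split('It rained. We stayed in.');

# The abbreviation "Dr." looks like a sentence end, so this single
# sentence is wrongly cut into "I met Dr." and "Smith today."
my @bad = naive_split('I met Dr. Smith today.');

print "$_\n" for @ok, @bad;
```

Any heuristic of this kind has to trade off missed boundaries against false ones, which is why the results should be treated as approximate.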

FUNCTIONS

split_sentences( $text )

The split_sentences function takes a scalar containing ASCII text as its argument and returns a list of the sentences into which the text has been split.

    @sentences = split_sentences( $text );

SEE ALSO

    locale
    POSIX

AUTHOR

Ave Wrigley <wrigley@cre.canon.co.uk>

COPYRIGHT

Copyright (c) 1997 Canon Research Centre Europe (CRE). All rights reserved. This script and any associated documentation or files cannot be distributed outside of CRE without express prior permission from CRE.
