Text::Perfide::BookCleaner - A module for processing books in plain text formats.
    use Text::Perfide::BookCleaner;

    my $foo = Text::Perfide::BookCleaner->new();
    ...
Opens a text file and returns its contents.
The file encoding may optionally be specified; the default is UTF-8.
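As an illustration of reading a file with an optional encoding, here is a minimal sketch. The name slurp_text and its argument order are assumptions made for this example and are not taken from the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's API): read a whole file, decoding
# it with the given encoding, or UTF-8 when none is supplied.
sub slurp_text {
    my ($path, $encoding) = @_;
    $encoding //= 'UTF-8';

    open(my $fh, "<:encoding($encoding)", $path)
        or die "Cannot open $path: $!";
    local $/;            # slurp mode: read up to end of file
    my $text = <$fh>;
    close $fh;

    return $text;
}
```

A call such as slurp_text('book.txt', 'ISO-8859-1') would decode the file as Latin-1 instead of the default.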
Removes all ^M (carriage return) characters.
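The ^M character is the carriage return (\r) left behind by DOS-style line endings. A minimal sketch of stripping it, using the hypothetical name strip_carriage_returns, could look like this:

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's API): delete every carriage
# return, so "\r\n" line endings become plain "\n".
sub strip_carriage_returns {
    my ($text) = @_;
    $text =~ s/\r//g;
    return $text;
}
```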
Extracts and removes page breaks, headers and footers from the text.
Removes page numbers together with page breaks.
Removes page numbers that have no page break.
Removes single page breaks.
Counts and removes headers and footers.
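The first and third of these steps can be sketched as follows. The sketch assumes that a page break is a form feed character (\f) and that a page number is a line containing only digits directly before the break. Both are assumptions for illustration; the module's own heuristics may differ, and strip_page_breaks is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's API). Returns the cleaned text
# and the number of page breaks that were removed.
sub strip_page_breaks {
    my ($text) = @_;
    my $removed = 0;

    # A line holding only a page number, immediately followed by a form feed.
    $removed += ($text =~ s/^[ \t]*\d+[ \t]*\n\f//mg) || 0;

    # Any remaining form feeds that stand on their own.
    $removed += ($text =~ s/\f//g) || 0;

    return ($text, $removed);
}
```

Handling the numbered breaks first matters here: if the bare form feeds were deleted first, the page-number lines would lose the only context that distinguishes them from ordinary numeric lines.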
Detects section titles and breaks.
Detects and normalizes paragraph notation.
Detects and removes footnotes.
Performs several character-level operations, including replacing non-ISO characters.
Deals with translineations (words split across lines by line wrapping) and transpaginations (the same situation, but across pages).
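A minimal sketch of the translineation case, assuming that a split word is marked by a hyphen at the end of a line and continues with a lowercase letter on the next line. The name join_translineations and this heuristic are assumptions for illustration, not the module's actual method.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's API): rejoin words that were
# hyphenated at a line end. The continuation is pulled up to the first
# line and the line break is placed after the rejoined word.
sub join_translineations {
    my ($text) = @_;
    $text =~ s/(\p{L})-[ \t]*\n[ \t]*(\p{Ll}\S*)[ \t]*\n?/$1$2\n/g;
    return $text;
}
```

With this rule, "the exam-" followed by "ple is here" becomes "the example" followed by "is here". The heuristic cannot tell a wrapped word from a real compound that happened to break at its hyphen, so "well-" followed by "known" would be merged into "wellknown"; a dictionary lookup would be needed to avoid that.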
Returns the text with all changes committed (removes the marks left by other functions).
Writes the text to the file referred to by the given file descriptor (default encoding: UTF-8).
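Writing to an already opened handle with a default encoding can be sketched as below. The name write_text and its argument order are assumptions made for this example, not the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's API): encode the text and print
# it to an open filehandle, using UTF-8 unless another encoding is given.
sub write_text {
    my ($fh, $text, $encoding) = @_;
    $encoding //= 'UTF-8';

    binmode($fh, ":encoding($encoding)")
        or die "Cannot set encoding $encoding: $!";
    print {$fh} $text
        or die "Write failed: $!";

    return;
}
```

Passing \*STDOUT as the handle would send the cleaned text to standard output.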
<jj at di.uminho.pt>
<andrefs at cpan.org>
Please report any bugs or feature requests to
bug-text-bookcleaner at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Text-BookCleaner. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command:

    perldoc Text::Perfide::BookCleaner
You can also look for information at:
Copyright 2010 Jose Joao.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.