HTML::FormatText::Lynx - format HTML as plain text using lynx
    use HTML::FormatText::Lynx;
    $text = HTML::FormatText::Lynx->format_file ($filename);
    $text = HTML::FormatText::Lynx->format_string ($html_string);

    use HTML::TreeBuilder;
    $formatter = HTML::FormatText::Lynx->new (rightmargin => 60);
    $tree = HTML::TreeBuilder->new_from_file ($filename);
    $text = $formatter->format ($tree);
HTML::FormatText::Lynx turns HTML into plain text using the lynx program.
The module interface is compatible with formatters like
HTML::FormatText, but all parsing etc. is done by lynx.
See HTML::FormatExternal for the formatting functions and options, all of which are supported by
HTML::FormatText::Lynx, with the following caveats:
With lynx older than 2.8.6dev.12 (June 2005), which introduced the -nomargins option, an additional 3 space margin is always applied within the requested leftmargin and rightmargin positions.
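As a rough illustration of the margin caveat above, the following sketch works out how many columns are left for text. It is not part of the module. The function name text_width and the $has_nomargins flag are hypothetical, and it assumes the extra 3 spaces are taken from both the left and the right side.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::FormatText::Lynx.
# $has_nomargins is true when the installed lynx supports -nomargins.
# Assumption: an old lynx indents 3 extra columns on each side.
sub text_width {
    my ($leftmargin, $rightmargin, $has_nomargins) = @_;
    my $extra = $has_nomargins ? 0 : 3;    # extra margin per side
    return ($rightmargin - $extra) - ($leftmargin + $extra);
}

print text_width(0, 60, 1), "\n";   # newer lynx: 60
print text_width(0, 60, 0), "\n";   # older lynx: 54
```

Under that assumption, a rightmargin of 60 on an old lynx leaves 54 columns of text rather than 60.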
Note that charset names such as "latin-1" are not accepted; they must be spelled out as "iso-8859-1" etc.
output_charset becomes the -display_charset option and can't be used on a very old lynx which doesn't have that option (e.g. lynx circa 2.8.1). Perhaps in the future output_charset could be dropped if it's already what will be output, or a Perl error could be thrown when it's unsupported.
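Since short names like "latin-1" are not accepted, a caller might translate them before passing a charset option. The following is a minimal sketch of that idea. The helper lynx_charset_name and its alias table are hypothetical, are not part of the module, and cover only a few spellings.

```perl
use strict;
use warnings;

# Hypothetical alias table, not part of HTML::FormatText::Lynx.
# Only a few common spellings are listed; extend as needed.
my %CHARSET_ALIAS = (
    'latin-1' => 'iso-8859-1',
    'latin1'  => 'iso-8859-1',
    'latin-2' => 'iso-8859-2',
    'latin2'  => 'iso-8859-2',
);

# Return the spelled-out name for a known alias (matched
# case-insensitively), or the input unchanged otherwise.
sub lynx_charset_name {
    my ($name) = @_;
    my $key = lc $name;
    return exists $CHARSET_ALIAS{$key} ? $CHARSET_ALIAS{$key} : $name;
}

print lynx_charset_name('Latin-1'), "\n";       # iso-8859-1
print lynx_charset_name('iso-8859-15'), "\n";   # iso-8859-15
```

The result could then be given as, for example, output_charset => lynx_charset_name($name) when calling format_string or new.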
If the justify option is true then -justify is passed to lynx, so that all lines in a paragraph are padded out with extra spaces to the given
rightmargin (or the default right margin).
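To make "padded out with extra spaces" concrete, here is a small plain-Perl sketch of justifying one line to a given width. It is an illustration only. The function justify_line is hypothetical, and lynx's own choice of where to put the extra spaces may well differ.

```perl
use strict;
use warnings;

# Pad one line to $width columns by widening the gaps between words.
# Illustration only; this is not lynx's algorithm.
sub justify_line {
    my ($line, $width) = @_;
    my @words = split ' ', $line;
    return $line if @words < 2;                 # nothing to stretch

    my $chars = 0;
    $chars += length $_ for @words;
    my $gaps   = @words - 1;
    my $spaces = $width - $chars;               # spaces to distribute
    return join(' ', @words) if $spaces < $gaps;    # line too long already

    my $base  = int($spaces / $gaps);
    my $extra = $spaces % $gaps;                # leftmost gaps get one more
    my $out   = '';
    for my $i (0 .. $#words) {
        $out .= $words[$i];
        next if $i == $#words;
        $out .= ' ' x ($base + ($i < $extra ? 1 : 0));
    }
    return $out;
}

print justify_line('plain text from lynx', 24), "|\n";
```

Here the 20 character input is stretched to 24 columns, with the one leftover space going to the first gap.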
Copyright 2008, 2009, 2010 Kevin Ryde
HTML-FormatExternal is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.
HTML-FormatExternal is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with HTML-FormatExternal. If not, see <http://www.gnu.org/licenses/>.