
Download:
html2text-0.003.tar.gz


NAME

html2text.pl - script for generating formatted text from HTML

SYNOPSIS

    html2text.pl <filename>
    cat <filename> | html2text.pl

DESCRIPTION

html2text.pl generates simple formatted text from HTML. It uses HTML::Element to traverse an HTML tree built by HTML::TreeBuilder, and formats the output text using Text::Format. It is very simple at the moment. It currently handles the following elements:

Headings

All headings are underlined; <H1>s are double underlined. Headings are numbered using the current heading level together with the numbers of the preceding heading levels.
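The numbering and underlining described above could be sketched as follows. This is only an illustration, not the script's actual code: the helper names next_heading_number and format_heading, the dotted "1.2" number format, and the choice of '=' for the double underline and '-' for the single one are all assumptions.

```perl
use strict;
use warnings;

# One counter per heading level; index 0 holds the <H1> count.
my @counters;

# Hypothetical helper: return the dotted number for the next heading
# at the given level (1 for <H1>, 2 for <H2>, ...).
sub next_heading_number {
    my ($level) = @_;
    $counters[$_] //= 0 for 0 .. $level - 1;   # fill skipped levels with 0
    $counters[$level - 1]++;                   # bump the counter at this level
    $#counters = $level - 1;                   # discard counters of deeper levels
    return join '.', @counters;
}

# Hypothetical helper: number a heading and underline it.
# Assumption: '=' stands in for the double underline of <H1>,
# '-' is used for every other level.
sub format_heading {
    my ($level, $text) = @_;
    my $line = next_heading_number($level) . ' ' . $text;
    my $rule = ($level == 1 ? '=' : '-') x length $line;
    return "$line\n$rule\n";
}

print format_heading(1, 'Introduction');
print format_heading(2, 'Background');
print format_heading(2, 'Scope');
print format_heading(1, 'Usage');
```

Truncating the counter array whenever a shallower heading appears is what makes the next <H2> under a new <H1> start again at 1.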

Paragraphs

Paragraph text is formatted with the paragraph method of Text::Format.
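A minimal sketch of this step is shown below. The wrap_paragraph helper, the 40-column width, and the zero first-line indent are assumptions for illustration; in the Text::Format versions I am aware of the method is spelled paragraphs. The fallback to the core Text::Wrap module is my own addition so that the sketch also runs where Text::Format is not installed, and is not something the script is known to do.

```perl
use strict;
use warnings;
use Text::Wrap ();   # core module, used only as a fallback

# Hypothetical helper: wrap one paragraph to at most $columns characters
# per line.
sub wrap_paragraph {
    my ($text, $columns) = @_;

    # Collapse the whitespace and line breaks inherited from the HTML source.
    $text =~ s/\s+/ /g;
    $text =~ s/^ //;
    $text =~ s/ $//;

    # Preferred path: Text::Format, the module named in the description.
    if (eval { require Text::Format; 1 }) {
        my $formatter = Text::Format->new({ columns => $columns, firstIndent => 0 });
        return scalar $formatter->paragraphs($text);
    }

    # Fallback (assumption): Text::Wrap keeps lines to $columns - 1
    # characters, so ask for one more column than we want.
    local $Text::Wrap::columns = $columns + 1;
    return Text::Wrap::wrap('', '', $text) . "\n";
}

my $html_text = "html2text.pl generates simple formatted text\n"
              . "   from HTML, wrapping long paragraphs so that they fit\n"
              . "within a fixed number of columns.";

print wrap_paragraph($html_text, 40);
```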

Lists

List items are indented by 4 spaces, and preceded with an asterisk.

Definition Lists

<DT>s are indented by 4 spaces; <DD>s are indented by 8 spaces.
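The indentation rules for lists and definition lists can be illustrated with a few small helpers. The function names are hypothetical, and placing the asterisk after the 4-space indent (rather than before it) is an assumption about how the output looks.

```perl
use strict;
use warnings;

# Indent widths taken from the description above.
my $LIST_INDENT = 4;
my $DT_INDENT   = 4;
my $DD_INDENT   = 8;

# Hypothetical helper: a list item, indented and marked with an asterisk.
sub format_list_item {
    my ($text) = @_;
    return (' ' x $LIST_INDENT) . "* $text\n";
}

# Hypothetical helper: a definition term (<DT>).
sub format_term {
    my ($text) = @_;
    return (' ' x $DT_INDENT) . "$text\n";
}

# Hypothetical helper: a definition description (<DD>).
sub format_definition {
    my ($text) = @_;
    return (' ' x $DD_INDENT) . "$text\n";
}

print format_list_item('First item');
print format_list_item('Second item');
print format_term('HTML');
print format_definition('HyperText Markup Language');
```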

PREREQUISITES

Text::Format, HTML::TreeBuilder

OSNAMES

any

AUTHOR

Ave Wrigley <Ave.Wrigley@itn.co.uk>

COPYRIGHT

This script is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SCRIPT CATEGORIES

Web
