HTML-SummaryBasic-0.2.1 (module version 0.2)

NAME

HTML::SummaryBasic - Basic summary info from HTML.

SYNOPSIS

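        use strict;
        use warnings;
        use HTML::SummaryBasic;

        # Direct method call instead of the indirect "new Class {...}" form.
        my $p = HTML::SummaryBasic->new({
                PATH => "input.html",
                # or HTML => '<html>...</html>',
                NOT_AVAILABLE => undef,
        });
        # NOT_AVAILABLE is undef here, so empty fields may be undefined.
        foreach my $key (sort keys %{ $p->{SUMMARY} }) {
                my $value = $p->{SUMMARY}->{$key};
                warn "$key ... ", (defined $value ? $value : '(undef)'), "\n";
        }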

DEPENDENCIES

        use HTML::TokeParser;
        use HTML::HeadParser;

DESCRIPTION

Creates a hash of useful summary information from the meta and body elements of an HTML document, supplied either as a file or as a string of HTML.

GLOBAL VARIABLE

$NOT_AVAILABLE

Value used for empty fields. The default is [Not Available]. It may be overridden by supplying the constructor with a field of the same name (NOT_AVAILABLE). See "THE SUMMARY STRUCTURE".
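The general pattern described above, a package-level default that a constructor field can replace, might look like the following minimal sketch. The package Sketch::Summary is hypothetical and is not the code of HTML::SummaryBasic; in particular, the use of exists (so that an explicit undef, as in the SYNOPSIS, still counts as an override) is an assumption.

```perl
use strict;
use warnings;

{
    # Hypothetical package, for illustration only.
    package Sketch::Summary;

    # Package-level default for empty fields.
    our $NOT_AVAILABLE = '[Not Available]';

    sub new {
        my ($class, $args) = @_;
        my $self = bless { %{ $args || {} } }, $class;

        # Assumption: test with exists, not defined, so that passing
        # NOT_AVAILABLE => undef is honoured as an override.
        $self->{NOT_AVAILABLE} = $NOT_AVAILABLE
            unless exists $self->{NOT_AVAILABLE};

        return $self;
    }
}

# One object keeps the package default, the other overrides it with undef.
my $default  = Sketch::Summary->new({});
my $override = Sketch::Summary->new({ NOT_AVAILABLE => undef });
```

Here $default carries the package-level value, while $override keeps the undef it was given.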

CONSTRUCTOR (new)

Accepts a hash-like structure (the SYNOPSIS passes a hash reference) with the following fields:

HTML or PATH

HTML is a reference to a scalar containing HTML; PATH is a plain string giving the path to the HTML file to process.

SUMMARY

Filled after get_summary is called (see "METHOD get_summary" and "THE SUMMARY STRUCTURE").

FIELDS

An array of meta tag names whose content value should be placed into the respective slots of the SUMMARY field after get_summary has been called.

THE SUMMARY STRUCTURE

A field of the object that holds a hash reference, with the following keys and values:

AUTHOR

Content of the meta tag named X-META-AUTHOR.

TITLE

Text of the element of the same name.

DESCRIPTION

Content of the meta tag named X-META-DESCRIPTION.

LAST_MODIFIED_META, LAST_MODIFIED_FILE

The modification time of the file: respectively, as given by any meta tag of the same name with an X-META- prefix and, failing that, as reported by the file system.

CREATED_META, CREATED_FILE

As above, but relating to the creation date of the file.

FIRST_PARA

The first HTML p element of the document.

HEADLINE

The first h1 tag; failing that, the first h2; failing that, the value of $NOT_AVAILABLE.

PLUS...

Any meta-fields specified in the FIELDS field.
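To make the structure above concrete, here is a minimal sketch of how some of these fields could be gathered with the two modules listed under DEPENDENCIES. It is not the code of HTML::SummaryBasic: the helper name sketch_summary and the sample document are hypothetical, and the file-based fields are left out. It relies on HTML::HeadParser exposing a meta element named foo as the header X-Meta-Foo, which is where the X-META- prefix comes from.

```perl
use strict;
use warnings;
use HTML::HeadParser;
use HTML::TokeParser;

# Hypothetical helper, not part of HTML::SummaryBasic.
sub sketch_summary {
    my ($html, $not_available) = @_;
    my %summary;

    # Head: <title> becomes the Title header, and
    # <meta name="author"> becomes the X-Meta-Author header.
    my $head = HTML::HeadParser->new;
    $head->parse($html);
    $summary{TITLE}       = $head->header('Title');
    $summary{AUTHOR}      = $head->header('X-Meta-Author');
    $summary{DESCRIPTION} = $head->header('X-Meta-Description');

    # Body: the first h1; failing that, the first h2.
    for my $tag (qw(h1 h2)) {
        my $body = HTML::TokeParser->new(\$html);
        if ($body->get_tag($tag)) {
            $summary{HEADLINE} = $body->get_trimmed_text("/$tag");
            last;
        }
    }

    # Body: text of the first p element.
    my $body = HTML::TokeParser->new(\$html);
    $summary{FIRST_PARA} = $body->get_trimmed_text('/p')
        if $body->get_tag('p');

    # Any field still empty receives the "not available" value.
    for my $key (qw(TITLE AUTHOR DESCRIPTION HEADLINE FIRST_PARA)) {
        $summary{$key} = $not_available
            unless defined $summary{$key} && length $summary{$key};
    }
    return \%summary;
}

# Sample document (hypothetical): no description meta tag and no h1.
my $html = <<'HTML';
<html><head>
<title>Example page</title>
<meta name="author" content="A. N. Other">
</head><body>
<h2>Second-level heading</h2>
<p>First paragraph.</p>
<p>Second paragraph.</p>
</body></html>
HTML

my $summary = sketch_summary($html, '[Not Available]');
print "$_ ... $summary->{$_}\n" for sort keys %$summary;
```

Because the sample has no description meta tag and no h1, DESCRIPTION falls back to the "not available" value and HEADLINE is taken from the h2.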

TODO

Possibly support URIs as well as file paths.

SEE ALSO

HTML::TokeParser, HTML::HeadParser.

AUTHOR

Lee Goddard (LGoddard@CPAN.org)

COPYRIGHT

Copyright 2000-2001 Lee Goddard.

This library is free software; you may use, redistribute, or modify it under the same terms as Perl itself.
