Module Version: 0.03

NAME

WWW::Blog::Metadata - Extract common metadata from weblogs

SYNOPSIS

    use WWW::Blog::Metadata;
    use Data::Dumper;
    my $uri = 'http://example.com/blog/';   # placeholder: URI of the weblog to inspect
    my $meta = WWW::Blog::Metadata->extract_from_uri($uri)
        or die WWW::Blog::Metadata->errstr;
    print Dumper $meta;

DESCRIPTION

WWW::Blog::Metadata extracts common metadata from weblogs: syndication feed URIs, FOAF URIs, locative information, etc. Some benefits of using WWW::Blog::Metadata:

USAGE

WWW::Blog::Metadata->extract_from_uri($uri)

Given a URI $uri pointing to a weblog, fetches the page contents and attempts to extract common metadata from that weblog.

On error, returns undef, and the error message can be obtained by calling WWW::Blog::Metadata->errstr.

On success, returns a WWW::Blog::Metadata object.

WWW::Blog::Metadata->extract_from_html($html [, $base_uri ])

Uses the same extraction mechanism as extract_from_uri, but assumes that you've already fetched the HTML document and will provide it in $html, which should be a reference to a scalar containing the HTML.

If you know the base URI of the document, you should provide it in $base_uri. WWW::Blog::Metadata will attempt to find the base URI of the document if it's specified in the HTML itself, but you can give it a head start by passing in $base_uri.

This method has the same return value as extract_from_uri.
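As a rough illustration of the calling convention described above, the sketch below passes an in-memory document by reference along with a base URI. The HTML content, the generator name, and the example.com URI are placeholders invented for this example, not values taken from the module.

```perl
use strict;
use warnings;
use WWW::Blog::Metadata;

# Hypothetical page contents; in practice this would come from a cache
# or from a fetch you have already performed yourself.
my $html = <<'HTML';
<html>
<head>
<title>Example Weblog</title>
<meta name="generator" content="ExampleBlogTool 1.0" />
</head>
<body><p>Hello</p></body>
</html>
HTML

# Placeholder base URI for the document.
my $base_uri = 'http://example.com/blog/';

# Note the backslash: the method expects a reference to the scalar,
# not the HTML string itself.
my $meta = WWW::Blog::Metadata->extract_from_html(\$html, $base_uri)
    or die WWW::Blog::Metadata->errstr;

print "Extraction succeeded\n";
```

Passing the document by reference avoids copying a potentially large HTML string on every call.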

$meta->feeds

A reference to a list of syndication feed URIs.

(Note: these are currently extracted using Feed::Find, which requires a separate parsing step and so makes the above benefit #1 somewhat of a lie. This is done for maximum correctness, but it could change at some point.)

$meta->foaf_uri

The URI for a FOAF file, specified in the standard manner used for FOAF auto-discovery.

$meta->lat

$meta->lon

The latitude and longitude specified for the weblog, from either icbm or geo.position <meta /> tags.

$meta->generator

The tool that generated the weblog, from a generator <meta /> tag.
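The scalar accessors above might be used along these lines. This is only a sketch: describe_meta is a hypothetical helper, the sample HTML and its values are invented, and the code assumes that an accessor returns undef when the corresponding tag is absent from the page.

```perl
use strict;
use warnings;
use WWW::Blog::Metadata;

# Hypothetical helper: report each scalar field, noting which were missing.
sub describe_meta {
    my ($meta) = @_;
    my @lines;
    for my $field (qw( generator foaf_uri lat lon )) {
        my $value = $meta->$field;
        # Assumption: fields that were not found come back undefined.
        push @lines, sprintf '%-10s %s', "$field:",
            defined $value ? $value : '(not found)';
    }
    return join "\n", @lines;
}

# Invented sample document carrying a generator tag and an icbm tag.
my $html = <<'HTML';
<html>
<head>
<title>Example Weblog</title>
<meta name="generator" content="ExampleBlogTool 1.0" />
<meta name="ICBM" content="37.77, -122.42" />
</head>
<body><p>Hello</p></body>
</html>
HTML

my $meta = WWW::Blog::Metadata->extract_from_html(\$html, 'http://example.com/blog/')
    or die WWW::Blog::Metadata->errstr;

print describe_meta($meta), "\n";
```

Checking each value with defined keeps the report usable for weblogs that publish only some of this metadata.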

PLUGINS

There is an endless amount of metadata that you might want to extract from a weblog, and the methods above are only what is provided by default. If you'd like to extract more information, you can use WWW::Blog::Metadata's plugin architecture to build access to the metadata that you want, while making only one parsing pass over the HTML document.

The plugin architecture uses Module::Pluggable::Ordered, and it provides 2 pluggable events:

LICENSE

WWW::Blog::Metadata is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR & COPYRIGHT

Except where otherwise noted, WWW::Blog::Metadata is Copyright 2005 Benjamin Trott, ben+cpan@stupidfool.org. All rights reserved.
