Module Version: 0.03


WWW::Blog::Metadata - Extract common metadata from weblogs


    use WWW::Blog::Metadata;
    use Data::Dumper;
    my $uri = 'http://www.example.com/';   # placeholder: URI of the weblog to inspect
    my $meta = WWW::Blog::Metadata->extract_from_uri($uri)
        or die WWW::Blog::Metadata->errstr;
    print Dumper $meta;


WWW::Blog::Metadata extracts common metadata from weblogs: syndication feed URIs, FOAF URIs, locative information, etc. Some benefits of using WWW::Blog::Metadata:



WWW::Blog::Metadata->extract_from_uri($uri)

Given a URI $uri pointing to a weblog, fetches the page contents and attempts to extract common metadata from that weblog.

On error, returns undef, and the error message can be obtained by calling WWW::Blog::Metadata->errstr.

On success, returns a WWW::Blog::Metadata object.

WWW::Blog::Metadata->extract_from_html($html [, $base_uri ])

Uses the same extraction mechanism as extract_from_uri, but assumes that you've already fetched the HTML document and will provide it in $html, which should be a reference to a scalar containing the HTML.

If you know the base URI of the document, you should provide it in $base_uri. WWW::Blog::Metadata will attempt to find the base URI of the document if it's specified in the HTML itself, but you can give it a head start by passing in $base_uri.

This method has the same return value as extract_from_uri.
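As a minimal sketch of the call described above, the following passes an already-fetched document as a scalar reference together with a base URI. The HTML content and the example.com base URI are made up for illustration; only extract_from_html and errstr come from this documentation.

```perl
use strict;
use warnings;
use WWW::Blog::Metadata;
use Data::Dumper;

# Hypothetical page content; in practice this is HTML you fetched yourself.
my $html = <<'HTML';
<html>
<head>
<title>Example weblog</title>
<meta name="generator" content="Example Tool 1.0" />
</head>
<body></body>
</html>
HTML

# Assumed base URI, used to resolve relative links found in the page.
my $base_uri = 'http://www.example.com/';

# Note the backslash: $html is passed as a reference to a scalar.
my $meta = WWW::Blog::Metadata->extract_from_html(\$html, $base_uri)
    or die WWW::Blog::Metadata->errstr;

print Dumper $meta;
```

Passing a reference avoids copying a potentially large document into the method call.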


A reference to a hash of syndication feed URIs.

(Note: these are currently extracted using Feed::Find, which requires a separate parsing step and so makes benefit #1 above somewhat of a lie. This is done for maximum correctness, but it could change at some point.)


The URI for a FOAF file, specified in the standard manner used for FOAF auto-discovery.



The latitude and longitude specified for the weblog, from either icbm or geo.position <meta /> tags.


The tool that generated the weblog, from a generator <meta /> tag.
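The fields described above could be read back roughly as follows. This is a sketch under assumptions: the accessor names feeds, foaf_uri, lat, lon and generator do not appear in this text and are guesses, so each one is checked with can() before it is called. The sample HTML is likewise invented.

```perl
use strict;
use warnings;
use WWW::Blog::Metadata;
use Data::Dumper;

# Hypothetical weblog page carrying the kinds of metadata described above.
my $html = <<'HTML';
<html>
<head>
<title>Example weblog</title>
<link rel="alternate" type="application/atom+xml" title="Atom" href="atom.xml" />
<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
<meta name="ICBM" content="37.7749, -122.4194" />
<meta name="generator" content="Example Tool 1.0" />
</head>
<body></body>
</html>
HTML

my $meta = WWW::Blog::Metadata->extract_from_html(\$html, 'http://www.example.com/')
    or die WWW::Blog::Metadata->errstr;

# Assumed accessor names; skip any that the installed version does not provide.
for my $field (qw( foaf_uri lat lon generator )) {
    next unless $meta->can($field);
    my $value = $meta->$field;
    printf "%-10s %s\n", $field, defined $value ? $value : '(not found)';
}

# The feed URIs are returned as a reference, so dump the structure as-is.
print Dumper $meta->feeds if $meta->can('feeds');
```

Dumping the feeds value rather than dereferencing it keeps the sketch independent of the exact shape of that structure.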


There is an endless amount of metadata that you might want to extract from a weblog, and the methods above are only those provided by default. If you'd like to extract more information, you can use WWW::Blog::Metadata's plugin architecture to build access to the metadata that you want, while making only one parsing pass over the HTML document.

The plugin architecture uses Module::Pluggable::Ordered, and it provides 2 pluggable events:


WWW::Blog::Metadata is free software; you may redistribute it and/or modify it under the same terms as Perl itself.


Except where otherwise noted, WWW::Blog::Metadata is Copyright 2005 Benjamin Trott, All rights reserved.
