Module Version: 0.105

NAME

HTML::HTML5::Sanity - make HTML5 DOM trees less insane

SYNOPSIS

  use HTML::HTML5::Parser;
  use HTML::HTML5::Sanity;
  
  my $parser    = HTML::HTML5::Parser->new;
  my $html5_dom = $parser->parse_file('http://example.com/');
  my $sane_dom  = fix_document($html5_dom);

DESCRIPTION

The Document Object Model (DOM) generated by HTML::HTML5::Parser meets the requirements of the HTML5 spec, but will probably catch a lot of people by surprise.

The main oddity is that elements and attributes which appear to be namespaced are not really namespaced. For example, the following element:

  <div xml:lang="fr">...</div>

looks like it should be parsed so that it has an attribute "lang" in the XML namespace. Not so: it is actually parsed as having an attribute "xml:lang" in the null namespace.

fix_document($document)
  $sane_dom = fix_document($html5_dom);

Returns a modified copy of the DOM, leaving the original DOM unmodified.
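
As a rough illustration of the difference, the following sketch parses a small document and lists the attributes of its div element in both trees, together with the namespace each attribute is in. The sample markup and the describe_attributes helper are assumptions made up for this example and are not part of the module; the exact output depends on the installed versions of HTML::HTML5::Parser and XML::LibXML.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;

# Sample markup invented for this example.
my $html = '<!DOCTYPE html><title>t</title><div xml:lang="fr">Bonjour</div>';

my $parser    = HTML::HTML5::Parser->new;
my $html5_dom = $parser->parse_string($html);
my $sane_dom  = fix_document($html5_dom);

# Hypothetical helper (not provided by the module): returns one line per
# attribute on the first <div>, giving its node name and namespace URI.
sub describe_attributes {
    my ($doc) = @_;
    my ($div) = $doc->getElementsByLocalName('div');
    return map  { sprintf '%s => namespace %s',
                      $_->nodeName, $_->namespaceURI // '(null)' }
           grep { $_->isa('XML::LibXML::Attr') }   # skip namespace declarations
           $div->attributes;
}

print "HTML5 DOM: $_\n" for describe_attributes($html5_dom);
print "Sane DOM:  $_\n" for describe_attributes($sane_dom);
```

Because fix_document returns a copy, both trees remain available for comparison after the call.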

fix_element($element_node, $new_document_node, \%namespaces)

Don't use this. Not exported.

fix_attribute($attribute_node, $new_element_node, \%namespaces)

Don't use this. Not exported.

$HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES
  $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = 2;
  $sane_dom = fix_document($html5_dom);

If set to 1 (the default), the package detects invalid values in @lang and @xml:lang and removes any attribute whose value is invalid. If set to 2, it also attempts to canonicalise the value (e.g. 'EN_GB' is converted to 'en-GB'). If set to 0, the values of language attributes are not checked.
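
Since this is a package variable, one way to avoid changing it for the rest of the program is to set it with local inside a small wrapper. The sketch below assumes a hypothetical helper named fix_with_lang_level and invented sample markup; it prints whatever @lang value survives at each level rather than assuming a particular result.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;

# Hypothetical helper (not part of the module): runs fix_document with
# FIX_LANG_ATTRIBUTES temporarily set to $level. "local" restores the
# previous value when the sub returns.
sub fix_with_lang_level {
    my ($dom, $level) = @_;
    local $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = $level;
    return fix_document($dom);
}

# Sample markup invented for this example.
my $html5_dom = HTML::HTML5::Parser->new->parse_string(
    '<!DOCTYPE html><title>t</title><p lang="EN_GB">Hello</p>'
);

for my $level (0, 1, 2) {
    my $sane_dom = fix_with_lang_level($html5_dom, $level);
    my ($p)      = $sane_dom->getElementsByLocalName('p');
    my $lang     = $p->getAttribute('lang');
    printf "level %d: lang=%s\n", $level, defined $lang ? $lang : '(absent)';
}
```

The same original DOM can be reused for every call because fix_document does not modify its argument.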

BUGS

Please report any bugs to http://rt.cpan.org/.

SEE ALSO

HTML::HTML5::Parser, XML::LibXML, Task::HTML5.

AUTHOR

Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENSE

Copyright (C) 2009-2014 by Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTIES

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
