NAME
    HTML::HTML5::Sanity - make HTML5 DOM trees less insane

SYNOPSIS
      use HTML::HTML5::Parser;
      use HTML::HTML5::Sanity;
  
      my $parser    = HTML::HTML5::Parser->new;
      my $html5_dom = $parser->parse_file('http://example.com/');
      my $sane_dom  = fix_document($html5_dom);

DESCRIPTION
    The Document Object Model (DOM) generated by HTML::HTML5::Parser meets
    the requirements of the HTML5 spec, but will probably catch a lot of
    people by surprise.

    The main oddity is that elements and attributes which appear to be
    namespaced are not really. For example, the following element:

      <div xml:lang="fr">...</div>

    looks like it should be parsed so that it has an attribute "lang" in the
    XML namespace. Not so. It is actually parsed as having an attribute
    named "xml:lang" in the null namespace.
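
    The difference can be inspected by listing each attribute's name and
    namespace before and after the fix. The following is a minimal sketch,
    not taken from the module's own documentation: the helper
    describe_div_attributes and the sample markup are hypothetical, and it
    assumes the parser's parse_string method and the standard XML::LibXML
    node accessors (attributes, nodeName, namespaceURI).

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;   # exports fix_document, as in the SYNOPSIS
use XML::LibXML;

# Hypothetical helper: describe each attribute on the first <div> as
# "name => namespace URI".
sub describe_div_attributes {
    my ($doc) = @_;
    my ($div) = $doc->getElementsByLocalName('div');
    return unless $div;
    return map  { sprintf '%s => %s', $_->nodeName,
                                      $_->namespaceURI // '(null namespace)' }
           grep { $_->isa('XML::LibXML::Attr') }   # skip namespace declarations
           $div->attributes;
}

# Sample markup (an assumption made for illustration only).
my $html      = '<!DOCTYPE html><title>t</title><div xml:lang="fr">Bonjour</div>';
my $html5_dom = HTML::HTML5::Parser->new->parse_string($html);
my $sane_dom  = fix_document($html5_dom);

print "Before: $_\n" for describe_div_attributes($html5_dom);
print "After:  $_\n" for describe_div_attributes($sane_dom);
```

    Comparing the two listings should show whether the attribute has been
    moved out of the null namespace in the fixed copy.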

    "fix_document($document)"
          $sane_dom = fix_document($html5_dom);

        Returns a modified copy of the DOM, leaving the original DOM
        unmodified.

    "fix_element($element_node, $new_document_node, \%namespaces)"
        Don't use this. Not exported.

    "fix_attribute($attribute_node, $new_element_node, \%namespaces)"
        Don't use this. Not exported.

    $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES
          $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = 2;
          $sane_dom = fix_document($html5_dom);

        If set to 1 (the default), the package checks the values of @lang
        and @xml:lang, and removes the attribute if its value is invalid.
        If set to 2, it also attempts to canonicalise the value (e.g.
        'EN_GB' will be converted to 'en-GB'). If set to 0, the values of
        language attributes are not checked.
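
        Because this is a package variable, one way to avoid changing it
        for the whole program is to set it with "local" inside a small
        wrapper. The following is a sketch under stated assumptions: the
        wrapper fix_with_level and the sample markup are hypothetical, and
        parse_string is assumed as the parser entry point. No particular
        output is claimed here; the loop simply prints what each setting
        does to the sample attribute.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;   # exports fix_document, as in the SYNOPSIS

# Sample markup (an assumption made for illustration only).
my $html5_dom = HTML::HTML5::Parser->new->parse_string(
    '<!DOCTYPE html><title>t</title><p lang="EN_GB">Hello</p>'
);

# Hypothetical wrapper: apply one FIX_LANG_ATTRIBUTES level to one call.
# "local" restores the previous value when the sub returns.
sub fix_with_level {
    my ($dom, $level) = @_;
    local $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = $level;
    return fix_document($dom);
}

# fix_document leaves its input unmodified, so the same DOM can be reused.
for my $level (0, 1, 2) {
    my $sane_dom = fix_with_level($html5_dom, $level);
    my ($p)      = $sane_dom->getElementsByLocalName('p');
    my $lang     = $p ? $p->getAttribute('lang') : undef;
    printf "level %d: lang=%s\n", $level, $lang // '(attribute absent)';
}
```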

BUGS
    Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO
    HTML::HTML5::Parser, XML::LibXML, Task::HTML5.

AUTHOR
    Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENSE
    Copyright (C) 2009-2011 by Toby Inkster

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.