NAME
    HTML::HTML5::Sanity - make HTML5 DOM trees less insane

SYNOPSIS
      use HTML::HTML5::Parser;
      use HTML::HTML5::Sanity;
  
      my $parser    = HTML::HTML5::Parser->new;
      my $html5_dom = $parser->parse_file('http://example.com/');
      my $sane_dom  = fix_document($html5_dom);

DESCRIPTION
    The Document Object Model (DOM) generated by HTML::HTML5::Parser meets
    the requirements of the HTML5 spec, but will probably catch a lot of
    people by surprise.

    The main oddity is that elements and attributes which appear to be
    namespaced are not really. For example, the following element:

      <div xml:lang="fr">...</div>

    looks like it should be parsed so that it has an attribute "lang" in the
    XML namespace. Not so. It is actually parsed as having the attribute
    "xml:lang" in the null namespace.

    "fix_document($document)"
          $sane_dom = fix_document($html5_dom);

        Returns a modified copy of the DOM, leaving the original DOM
        unmodified.
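
        As a rough illustration, the sketch below parses the "xml:lang"
        example from a string and asks both trees whether the "div" has a
        "lang" attribute in the XML namespace. It assumes the returned DOMs
        are XML::LibXML documents (see SEE ALSO); the sample markup and
        variable names are made up for the example, and no particular
        output is promised here.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;    # provides fix_document, as in the SYNOPSIS

# Namespace URI that the reserved "xml" prefix is bound to.
my $XML_NS = 'http://www.w3.org/XML/1998/namespace';

my $parser    = HTML::HTML5::Parser->new;
my $html5_dom = $parser->parse_string(
    '<!DOCTYPE html><title>t</title><div xml:lang="fr">Bonjour</div>'
);
my $sane_dom  = fix_document($html5_dom);

# Ask each tree the same question about the same element.
for my $pair ([original => $html5_dom], [fixed => $sane_dom]) {
    my ($label, $dom) = @$pair;
    my ($div) = $dom->getElementsByLocalName('div');
    printf "%-8s lang in XML namespace: %s\n", $label,
        $div->hasAttributeNS($XML_NS, 'lang') ? 'yes' : 'no';
}
```

        Querying with hasAttributeNS rather than a plain attribute name
        keeps the check about the namespace itself, which is the point of
        the comparison.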

    "fix_element($element_node, $new_document_node, \%namespaces)"
        Don't use this. Not exported.

    "fix_attribute($attribute_node, $new_element_node, \%namespaces)"
        Don't use this. Not exported.

    $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES
          $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = 2;
          $sane_dom = fix_document($html5_dom);

        If set to 1 (the default), the package will detect invalid values in
        @lang and @xml:lang, and remove the attribute if it is invalid. If
        set to 2, it will also attempt to canonicalise the value (e.g.
        'EN_GB' will be converted to 'en-GB'). If set to 0, then the
        value of language attributes is not checked.
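
        To give a feel for what canonicalising a language tag can involve,
        here is a minimal pure-Perl sketch. The helper "canonicalise_lang"
        is hypothetical and is not part of this package; the module's own
        checks may well differ. It only normalises separators and letter
        case for simple language-script-region tags, and makes no attempt
        to handle extension or private-use subtags.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT this module's API): normalise separators and
# letter case of a simple language tag. Returns undef for anything that
# does not look like a tag at all.
sub canonicalise_lang {
    my ($tag) = @_;
    return unless defined $tag;

    # Accept "_" as well as "-" between subtags; keep empty fields so
    # that malformed input such as "en--GB" is rejected below.
    my @parts = split /[-_]/, $tag, -1;
    return unless @parts;
    for my $part (@parts) {
        return unless $part =~ /\A[A-Za-z0-9]{1,8}\z/;
    }

    # Primary language subtag: two or three letters, lower case.
    my $lang = lc shift @parts;
    return unless $lang =~ /\A[a-z]{2,3}\z/;

    my @out = ($lang);
    for my $part (@parts) {
        if    ($part =~ /\A[A-Za-z]{4}\z/) { push @out, ucfirst lc $part }  # script, e.g. Hant
        elsif ($part =~ /\A[A-Za-z]{2}\z/) { push @out, uc $part }          # region, e.g. GB
        else                               { push @out, lc $part }          # anything else
    }
    return join '-', @out;
}

print canonicalise_lang('EN_GB'), "\n";    # prints "en-GB"
```

        Returning undef for unusable input mirrors the idea of dropping an
        invalid attribute rather than guessing at a repair.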

BUGS
    Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO
    HTML::HTML5::Parser, XML::LibXML, Task::HTML5.

AUTHOR
    Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENSE
    Copyright (C) 2009-2013 by Toby Inkster

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

DISCLAIMER OF WARRANTIES
    THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
    WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.