NAME

MediaWiki::CleanupHTML - clean up MediaWiki-generated HTML by removing MediaWiki embellishments.

VERSION

Version 0.0.2

SYNOPSIS

    use MediaWiki::CleanupHTML;

    open my $fh, '<:encoding(UTF-8)', $filename
        or die "Cannot open '$filename' - $!";

    my $cleaner = MediaWiki::CleanupHTML->new({ fh => $fh });

    open my $out_fh, '>:encoding(UTF-8)', $processed_filename
        or die "Cannot open '$processed_filename' for output - $!";

    $cleaner->print_into_fh($out_fh);

    $cleaner->destroy_resources();

DESCRIPTION

The HTML rendered on MediaWiki pages is full of MediaWiki-specific embellishments such as edit sections. This module attempts to clean it up and return more straightforward HTML. Note that the HTML returned by the MediaWiki APIs may not always be available (for instance, if the wiki is down), so this module should be considered a fallback.

SUBROUTINES/METHODS

MediaWiki::CleanupHTML->new({fh => $fh})

The constructor - accepts the filehandle from which to read the XHTML.

$cleaner->print_into_fh($fh)

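Print the cleaned-up HTML to the given filehandle. The filehandle should be able to process UTF-8 output (for example, one opened with an ':encoding(UTF-8)' layer, as in the SYNOPSIS).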

$cleaner->destroy_resources()

Destroy the allocated resources (of the HTML::TreeBuilder tree, etc.). Must be called before the object is destroyed.
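
Because destroy_resources() must be called explicitly, it can be convenient to make sure it still runs when printing fails. The following is only a minimal sketch of one way to do that: print_and_release() is a hypothetical helper, not part of MediaWiki::CleanupHTML, and it assumes nothing about the object beyond the print_into_fh() and destroy_resources() methods documented above.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of MediaWiki::CleanupHTML): print the
# cleaned-up HTML and release the cleaner's resources, even when the
# printing step throws an exception.
sub print_and_release
{
    my ($cleaner, $out_fh) = @_;

    # Trap any exception so that the cleanup below is always reached.
    my $ok  = eval { $cleaner->print_into_fh($out_fh); 1; };
    my $err = $@;

    # Documented requirement: call this before the object is destroyed.
    $cleaner->destroy_resources();

    # Re-throw the original error only after the resources are released.
    die( $err || "print_into_fh failed for an unknown reason\n" ) if !$ok;

    return;
}
```

With such a helper, the last two calls in the SYNOPSIS could be replaced by a single print_and_release($cleaner, $out_fh) call. The error is saved in a lexical before cleanup because code run during destroy_resources() could otherwise overwrite $@.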

AUTHOR

Shlomi Fish, http://www.shlomifish.org/ .

BUGS

Please report any bugs or feature requests to bug-mediawiki-cleanuphtml at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=MediaWiki-CleanupHTML. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc MediaWiki::CleanupHTML

ACKNOWLEDGEMENTS

The developers of HTML::TreeBuilder::XPath, HTML::TreeBuilder and related modules for their helpful code.

LICENSE AND COPYRIGHT

Copyright 2012 Shlomi Fish.

This program is distributed under the MIT (X11) License: http://www.opensource.org/licenses/mit-license.php

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.