HTML2XHTML - Wrapper to command-line program that converts from HTML 3.x/4.x to XHTML 1.0 Strict|Transitional
use HTML2XHTML;
OR
require HTML2XHTML;
To convert an HTML file and no external CSS file to XHTML:
my $foo = new HTML2XHTML(encoding => 'UTF-8', file_name => '..//foo//foo.html', doctype => 'Strict', direction => 'ltr', lang => 'en'); my $foo = new HTML2XHTML(encoding => 'ISO 8859-1', file => '..\foo\foo.html', doctype => 'Strict', direction => 'ltr', lang => 'en'); my $foo = new HTML2XHTML(encoding => 'ISO 8859-1', file => 'C://My Documents//foo//foo.html', doctype => 'Transitional', direction => 'RTL', lang => 'es');
my $foo = new HTML2XHTML(encoding => 'ISO 8859-1', file_name => '../foo/foo.html', doctype => 'Strict', direction => 'ltr', lang => 'en');
To convert HTML files and external CSS files to XHTML:
my $foo = new HTML2XHTML(encoding => 'UTF-16', file_name => 'foo.html,foo.css,foo2.html,foo3.html,bar.css', doctype => 'Strict', direction => 'ltr', lang => 'en');
To convert a directory to XHTML:
my $foo = new HTML2XHTML(encoding => 'UTF-8', dir => '../foo', doctype => 'Strict', direction => 'ltr', lang => 'en');
To convert the current directory of HTML files and external CSS files to XHTML:
my $foo = new HTML2XHTML(encoding => 'UTF-8', dir => 'current', doctype => 'Strict', direction => 'ltr', lang => 'en');
HTML2XHTML was my first attempt at writing a module and distribution...after more than a year it finally has been updated with bug fixes and improvements. It is a pure Perl implementation written to convert an HTML 4.01 Transitional page to XHTML 1.0 Strict or Transitional. There is now a report of deprecated tags with resources given to validate your Web page(s).
There are 2 scripts included in the distribution, the command-line program and its OO wrapper.
For those interested in the command-line options, see below:
perl convert_xhtml.pl encoding doctype lang direction dir [relative|absolute (directory path)|'current']
You can specify a whole directory to be converted either by putting 'dir' or 'directory' after the name of the command-line script followed by the name of the directory.
perl convert_xhtml.pl ISO 8859-1 Strict en ltr dir current perl convert_xhtml.pl UTF-8 Strict en ltr directory ../foo/
You can also specify an HTML document, CSS document, or multiple documents (including the directory path if not in the same directory as the program) to convert followed by commas.
perl convert_xhtml.pl encoding doctype lang direction [relative|absolute directory path] file perl convert_xhtml.pl UTF-8 Strict en rtl ../bar/foo.html, bar.css
my $foo = new HTML2XHTML(encoding => 'ISO 8859-1', file_name => '..//foo//foo.html', doctype => 'Strict', direction => 'ltr', lang => 'en');
This method invokes the command-line program with the options passed. The options can either be a list of file names (either absolute or relative path) separated by commas, or the name of a directory.
For files, there is an abbreviated form of the key, which is 'file', if you do not wish to write 'file_name'.
This module requires Cwd, File::Flock, and Perl 5.001 or later.
None by default.
http://www.w3.org/TR/xhtml1/ XHTML 1.0 The Extensible HyperText Markup Language (Second Edition)
http://search.cpan.org/~kwilliams/PathTools-3.23/Cwd.pm Cwd
http://search.cpan.org/~muir/File-Flock-104.111901/lib/File/Flock.pm File::Flock
http://en.wikipedia.org/wiki/Character_encoding (Character) Encoding
Obiora Embry
Copyright (C) 2006-2008 by Obiora Embry.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or at your option, any later version of Perl 5 you may have available.
2 POD Errors
The following errors were encountered while parsing the POD:
Unterminated B<...> sequence
To install HTML2XHTML, copy and paste the appropriate command in to your terminal.
cpanm
cpanm HTML2XHTML
CPAN shell
perl -MCPAN -e shell install HTML2XHTML
For more information on module installation, please visit the detailed CPAN module installation guide.