==== NAME ====
html2dbk - convert XHTML to DocBook.
==== VERSION ====
This describes version 0.03 of html2dbk.
==== DESCRIPTION ====
This script (and module) converts an XHTML file into DocBook, using both
XSLT and heuristics (as XSLT alone can't do everything).
This script will convert "*filename*.html" into "*filename*.xml".
By default, the input file is expected to be well-formed XML. Other
programs, such as HTML Tidy (http://tidy.sourceforge.net/), can correct
files for you; html2dbk does not do that itself. If you give the --html
option, it will instead attempt to parse the file as HTML.
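As a rough illustration of what "correct XML" means for the input, the following Python sketch checks whether a string is well-formed before conversion. It is not part of html2dbk; the function name is_well_formed is hypothetical, and the standard-library parser used here does not load DTDs, so it may also reject named entities such as &nbsp; that a DTD-aware parser would accept.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML (hypothetical helper, not html2dbk code)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Unclosed tags, bad nesting, or unknown entities all end up here.
        return False
    return True


if __name__ == "__main__":
    good = "<html><body><p>one<br/>two</p></body></html>"
    bad = "<html><body><p>one<br>two</p></body></html>"  # <br> never closed
    print(is_well_formed(good))
    print(is_well_formed(bad))
```

A file that fails a check like this is a candidate for cleaning with HTML Tidy first, or for the --html option.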
Note also that the conversion is very simple; it does not deal with
elements such as <div> or <span>, whose meaning it has no way of guessing.
It does not merge multiple XHTML files into a single document; instead, each
XHTML file becomes a <chapter>, with each header becoming a section (sect1
to sect5). The <title> tag is used for the chapter title.
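The mapping from headers to nested sections can be sketched as follows. This Python fragment is an illustration only, not the XSLT or heuristics that html2dbk actually uses. It assumes that header level N maps to sectN, with deeper levels clamped to sect5, and the function name headings_to_docbook is hypothetical.

```python
from xml.sax.saxutils import escape


def headings_to_docbook(chapter_title, headings):
    """Nest (level, title) pairs into <sectN> elements inside one <chapter>.

    Hypothetical sketch: assumes header level N becomes sectN.
    """
    out = ["<chapter>", "<title>%s</title>" % escape(chapter_title)]
    open_levels = []  # stack of section depths that are currently open
    for level, title in headings:
        depth = min(max(level, 1), 5)  # assumption: clamp to sect1..sect5
        # Close every open section at the same depth or deeper.
        while open_levels and open_levels[-1] >= depth:
            out.append("</sect%d>" % open_levels.pop())
        out.append("<sect%d>" % depth)
        out.append("<title>%s</title>" % escape(title))
        open_levels.append(depth)
    # Close whatever is still open at the end of the document.
    while open_levels:
        out.append("</sect%d>" % open_levels.pop())
    out.append("</chapter>")
    return "\n".join(out)


if __name__ == "__main__":
    print(headings_to_docbook("Guide", [(1, "Intro"), (2, "Setup"), (1, "Usage")]))
```

In this sketch a document that skips a level (for example, a level 1 header followed directly by level 3) produces a sect3 directly inside a sect1, which DocBook does not allow. Irregular header levels in the source are one plausible way for invalid output to arise.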
There are likely to be validity errors, depending on how good the original
HTML was. There may be broken links, <xref> elements that should be <link>
elements, and overuse of <emphasis> and <emphasis role="bold">.
==== REQUIRES ====
Getopt::Long
Pod::Usage
Getopt::ArgvFile
HTML::ToDocBook
Cwd
File::Basename
File::Spec
XML::LibXML
XML::LibXSLT
HTML::SimpleParse
==== AUTHOR ====
Kathryn Andersen (RUBYKAT)
perlkat AT katspace dot com
http://www.katspace.org/tools
==== COPYRIGHT AND LICENCE ====
Copyright (c) 2006 by Kathryn Andersen
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.