The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

==== NAME ====

html2dbk - convert XHTML to DocBook.


==== VERSION ====

This describes version ``0.03'' of html2dbk.


==== DESCRIPTION ====

This script (and module) converts an XHTML file into DocBook, using both
XSLT and heuristics (as XSLT alone can't do everything).

This script will convert "*filename*.html" into "*filename*.xml"

By default, the input file is expected to be correct XML (there are other
programs such as html tidy (http://tidy.sourceforge.net/) which can correct
files for you; this does not do that). If you give the --html option then
this will attempt to parse the file as HTML.

Note also this is very simple; it doesn't deal with things like <div> or
<span> which it has no way of guessing the meaning of. This does not merge
multiple XHTML files into a single document, so this converts each XHTML
file into a <chapter>, with each header being a section (sect1 to sect5).
The <title> tag is used for the chapter title.

There will likely to be validity errors, depending on how good the original
HTML was. There may be broken links, <xref> elements that should be <link>s,
and overuse of <emphasis> and <emphasis role="bold">.


==== REQUIRES ====

    Getopt::Long
    Pod::Usage
    Getopt::ArgvFile
    HTML::ToDocBook
        Cwd
        File::Basename
        File::Spec
        XML::LibXML
        XML::LibXSLT
        HTML::SimpleParse


==== AUTHOR ====

    Kathryn Andersen (RUBYKAT)
    perlkat AT katspace dot com
    http://www.katspace.org/tools


==== COPYRIGHT AND LICENCE ====

Copyright (c) 2006 by Kathryn Andersen

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.