NAME

Pod::HTML2Pod -- translate HTML into POD

SYNOPSIS

  # Use the program 'html2pod' that comes in this dist, or:
  use Pod::HTML2Pod;
  print Pod::HTML2Pod::convert(
    'file' => 'my_stuff.html',  # input file
    'a_href' => 1,  # try converting links
  );

DESCRIPTION

Larry Wall once said (1999-08-27, on the pod-people list, I do believe): "The whole point of pod is to get people to document stuff they wouldn't document in any other form."

To that end, I wrote this module so that people who are unpracticed with POD, but in a hurry to simply document their programs or modules, could write their documentation in simple HTML and convert that to POD. That's what this module does.

Specifically, this module bends over backwards to try to turn even vaguely plausible HTML into POD -- and when in doubt, it simply ignores things that it doesn't know about, or can't render.

FUNCTIONS

This module provides one documented function, which it does not export:

Pod::HTML2Pod::convert( ...options... )

This returns a single scalar value containing the converted POD text, with some comments after the end.

This function takes options:

'file' => FILENAME,

Specifies that the HTML code is to be read from the filename given.

'handle' => *HANDLE,

Specifies that the HTML code is to be read from the open filehandle given (e.g., $fh_obj, *HANDLE, *HANDLE{IO}, etc.). If you specify this, but fail to specify an actual handle object, inscrutable errors may result.

'content' => STRING,

Specifies that the HTML code is in the string given. (Alternatively, pass a reference to the scalar: 'content' => \$stuff.)

'tree' => OBJ,

Specifies that the HTML document is contained in the given HTML::TreeBuilder object (or HTML::Element object, at least).

'a_name' => BOOLEAN,

Specifies whether you want to try converting <a name="..."> elements. By default this is off -- i.e., such elements are ignored.

'a_href' => BOOLEAN,

Specifies whether you want to try converting <a href="..."> elements. By default this is off -- i.e., such elements are ignored. If on, bear in mind that relative URLs cannot be properly converted to POD -- any relative URLs will be complained about in comments after the end of the document. Normal absolute URLs will be treated as best they can be. Note that URLs beginning "pod:..." will be turned into POD links to whatever follows; that is, "pod:Getopt::Std" is turned into L<Getopt::Std>.

'debug' => INTEGER,

Puts Pod::HTML2Pod into verbose debug mode for the duration of processing this HTML document. INTEGER can be 0 for no debug output, 1 for a moderate amount that will cause the HTML syntax tree to be dumped at the start of the conversion, and 2 for that plus a dump of the intermediate POD doctree, plus a few more inscrutable diagnostic messages. Looking at the trees dumped might be helpful in making sense of error messages that refer to a particular node in the parse tree.
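As a rough illustration of how the options above fit together, here is a minimal sketch that converts HTML held in a string, with link conversion switched on. The HTML snippet and the variable names are made up for the example; only convert() and the 'content', 'a_href' and 'debug' options come from this documentation.

```perl
use strict;
use warnings;
use Pod::HTML2Pod;

# A small, made-up HTML document (illustrative only).
my $html = <<'END_HTML';
<html><body>
<h1>NAME</h1>
<p>Things::Stuff -- does some things with stuff</p>
<h1>SEE ALSO</h1>
<p><a href="pod:Getopt::Std">Getopt::Std</a></p>
</body></html>
END_HTML

my $pod = Pod::HTML2Pod::convert(
    'content' => \$html,  # a reference to the scalar; a plain string works too
    'a_href'  => 1,       # try converting links, including the "pod:" ones
    'debug'   => 0,       # 1 or 2 would dump the parse trees as well
);

# $pod holds the POD text, followed by the converter's comments.
print $pod;
```

The "pod:" link should come out as an L<...> cross-reference. Any relative URLs in your own document would instead be complained about in the comments after the end of the POD.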

GUIDELINES

Don't write crappy HTML and expect this module to understand it.

Don't take the output of pod2html and feed it to this, just because you think it'd be neat to try it. You'll just learn really unpleasant things about Pod::Html -- and that's fine if that means you'll use it to improve Pod::Html, but it's rather the long way around.

However, do use this module to convert simple HTML into POD, bearing in mind these simple truths:

* POD can't do tables, images, forms, imagemaps, layers, CSS, embedded Java applets or any other kind of object, FONT, or BLINK. So don't try to do any of these things.

* Use <h1> and <h2> for headings.

* If you want to have a block of literal example code, put it in a <pre>.

* Keep things simple.

* Remember: Just because it comes out of Pod::HTML2Pod doesn't mean it's happy normal pod. You can do lots of things in HTML that will produce POD that is strange but technically legal (like having huge and complex content in a <h1>/=head1) but that will make perldoc scream bloody murder about nroff macros stretched past their limit.

* Try to avoid using a WYSIWYG HTML editor, as they often produce scary source. Ditto for selecting "Save as... HTML" in your word processor. You can always try it, but look at the HTML to survey the damage before you try converting it to POD.

* Always look at the POD that's been output by HTML2Pod -- never just blindly include it.

* Consider starting from this template:

  <html>
  <head>
   <title>Things::Stuff</title>
   <!-- html2pod ignores everything outside the body anyway -->
  </head>
  <body>
  <h1>NAME</h1>
  
  Things::Stuff -- does some things with stuff
  
  <h1>SYNOPSIS</h1>
  <!-- example code -->
  <pre>
    use Things::Stuff;
    do some more stuff;
    la la la la la;
    oogah;
  </pre>
  
  <h1>DESCRIPTION</h1>
  
  This module does things with stuff.  It exports these functions:
  
  <dl>
  <dt><code>thingify( ... )</code>
  <dd>This function takes stuff, and returns their value as things.
  
  <dt><code>destuffulate( ... )</code>
  <dd>This function returns the things, from stuff.
   <p>It will throw a fatal exception if applied to things.
   <br>So don't do that.
  
  <dt><code>enthinction( ... )</code>
  <dd>This is where I run out of ways to make up silly sentences
   involving "thing" and "stuff".  Mostly.
  
  </dl>
  
  <h2>Caveats and WYA's</h2>
  
  Things to be wary of:
  
  <ul>
  <li>The things.
  <li>And the stuff
   <p>Don't forget about that stuff.  Gotta keep an eye on that.
  </ul>
  
  <h1>BUGS</h1>
  
  Stuff is hard.
  
  <h1>SEE ALSO</h1>
  
  <a href="pod:Class::Classless">Class::Classless</a>,
  <a href="pod:strict">strict</a>,
  <a href="pod:Lingua::EN::Numbers::Ordinate"
   >Lingua::EN::Numbers::Ordinate</a>,
  <a href="pod:perlvar">perlvar</a>,
  
  <!-- I use the secret-sauce 'pod:' scheme as a back door for making
   simple cross-references to POD man pages -->
  
  <h1>COPYRIGHT</h1>
  
  Copyright 2000, Joey Jo-Jo Jr. Shabadoo.
  
  <!-- just one suggested phrasing for the license... -->
  <p>This library is free software; you can redistribute it and/or modify
  it under the same terms as Perl itself.
  
  <h1>AUTHOR</h1>
  Joey Jo-Jo Jr. Shabadoo, <code>jojojo@shabadoo.int</code>
  </body>
  </html>
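
Since the guidelines above say never to include the output blindly, one way to get a second opinion on it is to run the generated POD through the core Pod::Checker module. This is only a sketch: the short HTML document is a stand-in for your own (such as the template above), and the in-memory filehandles are a convenience, not something Pod::HTML2Pod requires. It is no substitute for actually reading the POD.

```perl
use strict;
use warnings;
use Pod::HTML2Pod;
use Pod::Checker;    # ships with Perl

# Stand-in for your own HTML document (illustrative only).
my $html = <<'END_HTML';
<html><body>
<h1>NAME</h1>
<p>Things::Stuff -- does some things with stuff</p>
<h1>DESCRIPTION</h1>
<p>This module does things with stuff.</p>
</body></html>
END_HTML

my $pod = Pod::HTML2Pod::convert( 'content' => $html );

# Read the generated POD from an in-memory filehandle, and collect the
# checker's diagnostics in $report instead of letting them go to STDERR.
open my $in,  '<', \$pod       or die "Can't read generated POD: $!";
open my $out, '>', \my $report or die "Can't open report buffer: $!";

my $checker = Pod::Checker->new( -warnings => 1 );
$checker->parse_from_file( $in, $out );
close $out;

printf "%d error(s), %d warning(s)\n",
    $checker->num_errors, $checker->num_warnings;
print $report if defined $report && length $report;
```

A clean report only means the POD is syntactically acceptable; it says nothing about whether perldoc will render it pleasantly.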

BUG REPORTS

If you do find a case where this converter misinterprets what you consider straightforward HTML (which you should really, really have run through an HTML syntax checker, by the way!), report it to me as a bug, at sburke@cpan.org.

Be sure to include the entire document that causes the error -- then specify exactly what you consider the error to be.

BUGS AND CAVEATS

* Doesn't try to turn "smart quotes" characters into simple " and '. Maybe should?

* Fails to turn

  foo thing&nbsp;bar&nbsp;baz quux

into

  foo S<thing bar baz> quux

I.e., currently just turns &nbsp;'s into normal spaces.

* Numeric entities (E<num>) are used when necessary -- but these are not understood by some older POD converters.

* No HTML that you provide will turn into F<...>

* Currently maps

  <A HREF="foo">bar</A>

to

  X<foo>bar

but is this correct?
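
For the "smart quotes" caveat at the top of this list, one possible workaround is to straighten the quotes in the HTML yourself before converting it. The helper below is hypothetical and not part of Pod::HTML2Pod; it assumes the HTML is a decoded character string (not raw bytes) and covers only the four common curly-quote characters and their named and decimal entities.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Pod::HTML2Pod.
# Assumes $html holds decoded characters, not raw bytes.
sub straighten_quotes {
    my ($html) = @_;

    # Literal curly quotes become ASCII quote entities, which stay
    # safe even if they happen to sit inside an attribute value.
    $html =~ s/[\x{2018}\x{2019}]/&#39;/g;
    $html =~ s/[\x{201C}\x{201D}]/&quot;/g;

    # The named and decimal numeric entities for the same characters.
    $html =~ s/&(?:lsquo|rsquo|#8216|#8217);/&#39;/g;
    $html =~ s/&(?:ldquo|rdquo|#8220|#8221);/&quot;/g;

    return $html;
}

my $clean = straighten_quotes(
    "<p>\x{201C}Stuff\x{201D} isn&rsquo;t things.</p>"
);

# $clean could then be handed on, e.g.:
#   Pod::HTML2Pod::convert( 'content' => $clean );
```

Whether you want this is a matter of taste; curly quotes that survive into the POD are not wrong as such, just awkward for some older formatters.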

SEE ALSO

perlpod, Pod::Html, HTML::TreeBuilder

And HTML Tidy, at http://www.w3.org/People/Raggett/tidy/

COPYRIGHT

Copyright (c) 2000 Sean M. Burke. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

Sean M. Burke sburke@cpan.org
