The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.
=pod

=head1 NAME

PPI::HTML::CodeFolder - PPI::HTML Subclass providing code folding and compression

=head1 SYNOPSIS

    use strict;
    use File::Slurp ();
    use PPI;
    use PPI::HTML::CodeFolder;
    use CSS::Tiny;
    #
    # Get the file
    #
    my $file = shift @ARGV;
    $file    or die "File '$file' not provided";
    -f $file or die "File '$file' does not exist";
    #
    # Determine the output file
    #
    my $output = shift(@ARGV) || $file . '.html';
    my $foldjs = $output;
    $foldjs=~s/\.html/\.js/;
    my $foldcss = $output;
    $foldcss=~s/\.html/\.css/;
    #
    #    PPI the file
    #
    my $Document = PPI::Document->new( $file )
        or die "File '$file' could not be loaded as a document";
    #
    #    define our classname abbreviations
    #
    my %classabvs = qw(
    arrayindex ai
    backtick bt
    cast cs
    comment ct
    core co
    data dt
    double db
    end en
    heredoc hd
    heredoc_content hc
    heredoc_terminator ht
    interpolate ip
    keyword kw
    label lb
    line_number ln
    literal ll
    magic mg
    match mt
    number nm
    operator op
    pod pd
    pragma pg
    prototype pt
    readline rl
    regex re
    regexp re
    separator sp
    single sg
    structure st
    substitute su
    symbol sy
    transliterate tl
    word wd
    words wd
    );
    #
    #    define colors for the full classnames
    #
    my %tagcolors = (
    cast => '#339999',
    comment => '#008080',
    core => '#FF0000',
    double => '#999999',
    heredoc_content => '#FF0000',
    interpolate => '#999999',
    keyword => '#0000FF',
    line_number => '#666666',
    literal => '#999999',
    magic => '#0099FF',
    match => '#9900FF',
    number => '#990000',
    operator => '#DD7700',
    pod => '#008080',
    pragma => '#990000',
    regex => '#9900FF',
    single => '#999999',
    substitute => '#9900FF',
    transliterate => '#9900FF',
    word => '#999999',
    );
    #
    # Create the PPI::HTML::CodeFolder object
    #
    my $html = PPI::HTML::CodeFolder->new(
        line_numbers => 1,
        page         => 1,
        colors       => \%tagcolors,
        fold          => {
            POD           => 1,
            Comments      => 1,
            Heredocs      => 1,
            Imports       => 1,
            Abbreviate    => \%tagabvs,
            Javascript    => $jspath,
            Stylesheet    => $csspath,
            Expandable    => 1,
            },
        )
        or die "Failed to create HTML syntax highlighter";
    #
    #    collect stylesheet and javascript; we'll serve those
    #    separately
    #
    File::Slurp::write_file( $jspath, $html->foldJavascript());
    File::Slurp::write_file( $csspath, $html->foldCSS());
    #
    #	better still, just write them out
    #
    $html->writeJavascript($jspath) or die $@;
    $html->writeCSS($csspath) or die $@;
    #
    # Process the file
    #
    my $content = $html->html( $Document, $output )
        or die "Failed to generate HTML";

    File::Slurp::write_file( $output, $content );
    #
    #	write out a TOC
    #
    $html->writeTOC('toc.html') or die $@;

=head1 DESCRIPTION

A subclass of L<PPI::HTML> that compresses the generated
output by

=over 4

=item *

codefolding whitespace, POD, comments, heredocs, and imports sections, with an option to
include hyperlinks to unfold/refold the folded sections in place.

=item *

abbreviating generated C<E<lt>spanE<gt>> classnames

=item *

converting linenumber C<E<lt>spanE<gt>>'s to simple
Javascript to generate linenumbers within the browser when the document
is loaded.

=back

The amount of compression that can be achieved varies signficantly,
depending on the size and content of the source code. Gregarious modules
with lots of commentary and POD can be significantly reduced.
E.g., some simple benchmarks using the L<perl5b.pl> source (on WinXP, Perl 5.8.6):

                 Original Source:    323,204
                PPI::HTML Output:  1,008,471
    PPI::HTML::CodeFolder Output:    608,118
           (w/ expandable folds)    

As always, YMMV.

=begin html

<h3>Samples</h3>

<a href='http://www.presicient.com/ppicf/CodeFolder.html'>Folded version of CodeFolder.pm</a><br>

=end html

=head1 METHODS

=head2 $ppicf = PPI::HTML::CodeFolder->new( %args )

Same as the L<PPI::HTML> constructor, with the addition of a C<fold>
parameter, which specifies a hashref of folding properties.
If not specified, a default folding is applied (see individual
fold properties below for default behavior). In addition,
the L<PPI::HTML> C<page> property is always enabled.

B<NOTE> that using the C<css> parameter is strongly discouraged, as the
folding alignment and tooltips are very sensitive to stylesheet
changes. Instead, use the C<fold> Stylesheet option I<(see below)>
to export the CSS to an external file, and directly edit it.

=head3 Folding Properties

=head4 Abbreviate => \%abbreviations | $boolean

Specifies a mapping of standard PPI::HTML class names to an alternate
(presumably abbreviated) version. If a 'true' scalar is specified,
enables abbreviation using the default mapping; if a 'false' scalar is
specified, disables abbreviation. Default is to abbreviate.

The default abbreviation map is

    arrayindex         => ai
    backtick           => bt
    cast               => cs
    comment            => ct
    core               => co
    data               => dt
    double             => db
    end                => en
    heredoc            => hd
    heredoc_content    => hc
    heredoc_terminator => ht
    interpolate        => ip
    keyword            => kw
    label              => lb
    line_number        => ln
    literal            => ll
    magic              => mg
    match              => mt
    number             => nm
    operator           => op
    pod                => pd
    pragma             => pg
    prototype          => pt
    readline           => rl
    regex              => re
    regexp             => re
    separator          => sp
    single             => sg
    structure          => st
    substitute         => su
    symbol             => sy
    transliterate      => tl
    word               => wd
    words              => wd

Abbreviation helps compression due to the large number of
C<E<lt>spanE<gt>> tags with class specifications in the output.

B<NOTE>: any colormap provided to the constructor I<(via the color|colour parameter)>
must use the full classname, B<not> the abbreviated name.

=head4 Comments => $boolean

If a true value, comment lines are folded.
Default is to include comment lines.

=head4 Expandable => $boolean

If a true value, a hyperlink is provided in the margin for each folded source section
which unfolds the source in place when clicked; once unfolded,
the section can be refolded by clicking the hyperlinks next to the foldable
source. Default is false (i.e., folded text is simply discarded).

=head4 Heredocs => $boolean

If a true value, embedded heredoc content is folded. Default is false.

=head4 Imports => $boolean

If a true value, 'use' and 'require' statements are folded.
All statements which begin with 'use', including various pragmas,
will be folded. Default is false.

=head4 Javascript => $filename

Causes the foldtip javascript to be linked by referencing the specified C<$filename>,
rather than embedded in the output HTML.
Default is embedded; ignored if Expandable is false.
Note that this does B<not> write the specified file, but only uses
the filename in the generated E<lt>scriptE<gt> tag.

=head4 MinFoldLines => $number

Specifies the minimum number of consecutive foldable lines (of B<any> type)
required to actually apply folding. Default is 4. E.g., a value of 4 means that
3 consecutive comment lines, followed by valid statement, will B<not> be folded.

=head4 POD => $boolean

If a true value, POD lines are folded.
Default is to include POD lines.

=head4 Stylesheet => $filename

Causes color and foldtip CSS to be linked by referencing the specified C<$filename>,
rather than embedded in the output HTML. Default is embedded.
Note that this does B<not> write the specified file, but only uses
the filename in the generated E<lt>linkE<gt> tag.

=head2 $html = $ppicf->html( $src [, $output [, $script ] ] )

Generate the code folded HTML output from C<$src>, using the properties
previously specified for C<$ppicf>.

Anchors are added to both package and method declarations
which may be used via hyperlinks (e.g., from a table of contents document)
to scroll the declaration into view within a browser.

C<html()> may be repeatedly called for different sources in order to 
accumulate a cross reference containing information for all
the documents (e.g., to accumulate a single table of contents
for a multi-module project).

Parameters are

=over

=item $src I<(required)>

Either a L<PPI::Document> object, a scalar reference to the source as a string,
or the filename of the source.

=item $output I<(optional)>

The full pathname to the resulting HTML output, which is used for maintaining 
a cross reference to package and method declarations within the file. If not 
specified, and C<$src> is a filename, defaults to C<"$src.html">. Note
the C<html()> does B<not> write out the document to C<$output>, but only 
uses it for the cross reference (and any table of contents generated
from it).

=item $script I<(optional)>

A name used if C<$src> is a script file (rather than a module definition).
Script files might not include any explicit package or method declaration
which would be mapped into the cross reference. By specifying this parameter, 
an entry for the script is forced into the cross reference, so that any "main" 
package methods within the script will be assigned to this script name. If not 
specified, and C<$src> is not a filename, any methods outside of package
declarations are assigned to the "main" package.

=back

=head2 $javascript = $ppicf->foldJavascript()

Returns the Javascript required for foldtips as a string.

=head2 $css = $ppicf->foldCSS()

Returns the CSS required for the generated HTML as a string.

=head2 $rc = $ppicf->writeJavascript($path)

Writes out the Javascript required for foldtips to the file specified
by C<$path>. Returns 1 on success, or undef on failure, with an error
message in C<$@>.

=head2 $rc = $ppicf->writeCSS($path)

Writes out the CSS required for document to the file specified
by C<$path>. Returns 1 on success, or undef on failure, with an error
message in C<$@>.

=head2 $xref = $ppicf->getCrossReference()

For the prior processed source, returns the hashref mapping
the package names within the source to a hashref of 

	'URL' => <URL to anchored package definition statement>,
	'Methods' => {
		<method-name> => <URL to anchored method definition statement>,
	}

Note that the package mapping is cumulative, i.e., it is updated
each time C<html()> is called on the same PPI::HTML::CodeFolder instance.
Also note that script files (i.e.,
"main" application files) use the script filename as the package name
for any "main" package declarations or methods.

=head2 $toc = $ppicf->getTOC( $path [, Order => \@packages ] )

Return a table of contents HTML document derived from the current
cross reference. Provides hyperlinks to declared packages and their methods
in a form suitable for either embedding in a frame container, or for
processing into a tree widget via L<HTML::ListToTree>. C<$path>
specifies the pathname to which the TOC will be written, which is only
used for properly rendering the hyperlinks. C<Order> specifies
an arrayref of package (or script file) names; the position of the 
names in the list determines the position of the package/script hyperlinks
in the TOC document. Any packages/scripts not specified in C<Order> list
will be sorted alphabetically and appended after the Ordered items
(if any).

Note that method hyperlinks in the TOC are ordered alphabetically under
their parent package/script link.

=head2 $ppicf->writeTOC( $path [, Order => \@packages ] )

Calls C<getTOC()> and writes the output to the specified C<$path>.

=head2 $container = $ppicf->getFrameContainer( $title [, $home ] )

Renders an HTML frame container document to contain both a TOC
and the rendered source code. C<$title> is the title
string for the document. C<$home> specified the URL of a document 
initially loaded into the main frame (default none).

=head2 $ppicf->writeFrameContainer( $path, $title [, $home ] )

Calls C<getFrameContainer( $title [, $home ] )> and writes the 
resulting document to C<$path>/index.html.

=head1 NOTES

=over

=item *

The resulting HTML and supporting CSS and Javascripts have been
tested as follows I<(* indicates known issues, see below)>:

        Windows XP        Linux (Fedora 6 & 7)      Mac OS X
        ----------        -------------------       --------
        Firefox 1.5       Firefox 1.5               Firefox 2.0*
        Firefox 2.0       Firefox 2.0               Safari 2.4
        IE 6*             Opera 9.22 
        Opera 9.22

=item *

Support for expandable folds requires the use of non-persistent cookies,
in order to maintain the current open/close state of each fold section
when switching between source files in a framed browser display.

=item *

Internet Explorer issue: all versions from IE4 do not
properly preserve whitespace in <pre> sections
(I<see> quirksmode.org: L<http://www.quirksmode.org/bugreports/archives/2004/11/innerhtml_and_t.html>),
which requires major Javascript hacks, and causes unfolding, followed
by refolding to leave blank lines in the output. B<DON'T CLICK THE BLUE E!!!>


=item *

Firefox issue: the emitted CSS color classes B<must>
be preceded by a "dummy" class, otherwise the initial class
is ignored by Firefox.


=item *

Firefox issue: empty lines within a span
do not properly adhere to the defined line-height, which causes
misalignment with the line number margin. Therefore,
this module adds a single space to all blank lines in the output.
I<Alas, this does not fix the issue on OS X...so use Safari instead>


=item *

Developer note:
Firefox issue on OS X: a known bug causes the scrollbars of
hidden DIVs to be displayed; the fix requires the CSS for
hidden DIVs to include "overflow: hidden;". (This isn't a big deal,
since PPI::HTML::CF only uses hidden divs as text containers)


=item *

Developer note:
Firefox and Opera on Linux: Opacity settings cause performance on Linux 
to nosedive, with 100% CPU for extended periods. This is only
of concern if a popup solution is added.


=item *

Firefox + Firebug issue: Leaving Firebug enabled can severely
slow rendering of large documents with numerous folds.
Consider disabling Firebug when viewing CodeFolder
output.


=item *

The alignment of the linenumber and foldbutton margins with the
text is very sensitive to the font used, and depends heavily on the
browser and O/S used. The best combination found thus far is
"fixed, Courier", which only leaves Firefox on OS X with
alignment issues (which don't appear to be font related).
If you prefer to use another font, be aware that browser and
OS compatibility may be impacted. Also, neither of those fonts
is likely to properly support Unicode, and there don't appear
many fixed-width Unicode fonts lying about <sigh/>.


=back

=head1 SEE ALSO

L<PPI>

L<PPI::HTML>

=head1 TO DO

=over

=item *

Provide interface for app-defined margins (e.g., for annotations,
breakpoints, etc.)

=back

=head1 AUTHOR, COPYRIGHT, & LICENSE

Copyright(C) 2007, Dean Arnold, Presicient Corp., USA. All rights reserved.

Permission is granted to use this software under the terms of the
L<Perl Artistic License|perlartistic>.

=cut