######################################################################
    File::Comments 0.08
######################################################################

NAME
    File::Comments - Recognizes file formats and extracts format-specific
    comments

SYNOPSIS
        use File::Comments;

        my $snoop = File::Comments->new();

            # *----------------
            # | program.c:
            # | /* comment */
            # | main () {}
            # *----------------
        my $comments = $snoop->comments("program.c");
            # => [" comment "]

            # *----------------
            # | script.pl:
            # | # comment
            # | print "howdy!\n"; # another comment
            # *----------------
        $comments = $snoop->comments("script.pl");
            # => [" comment", " another comment"]

            # or strip comments from a file:
        my $stripped = $snoop->stripped("script.pl");
            # => "print "howdy!\n";"

            # or just guess a file's type:
        my $type = $snoop->guess_type("program.c");    
            # => "c"

DESCRIPTION
    File::Comments guesses the type of a given file, determines the format
    used for comments, extracts all comments, and returns them as a
    reference to an array of chunks. Alternatively, it strips all comments
    from a file.

    Currently supported are Perl scripts, C/C++ programs, Java, makefiles,
    JavaScript, HTML, Python and PHP.

    The plugin architecture used by File::Comments makes it easy to add new
    formats. To support a new format, a new plugin module has to be
    installed. No modifications to the File::Comments codebase are
    necessary; new plugins will be picked up automatically.

    File::Comments can also be used to simply guess a file's type. It is
    somewhat more flexible than File::MMagic and File::Type. File types in
    File::Comments are typically based on file name suffixes (*.c, *.pl,
    etc.). If no suffix is available, or a given suffix is ambiguous (e.g.
    if several plugins have registered a handler for the same suffix), then
    the file's content is used to narrow down the possibilities and arrive
    at a decision.
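
    The suffix-first, content-second strategy described above can be
    sketched in plain Perl. This is only an illustration, not the module's
    actual implementation: the suffix table, the content checks, and the
    function name guess_type_sketch() are all hypothetical, and the
    fallback for an unknown suffix is an assumption.

```perl
use strict;
use warnings;

# Hypothetical suffix table. Two types claim ".h" to show an ambiguity.
my %by_suffix = (
    '.c'  => ['c'],
    '.pl' => ['perl'],
    '.h'  => ['c', 'cpp'],
);

# Hypothetical content checks, consulted only as a fallback.
my %looks_like = (
    perl => sub { $_[0] =~ /^#!.*perl\b/ },
    c    => sub { $_[0] =~ /^\s*#\s*include\b/m },
    cpp  => sub { $_[0] =~ /\bclass\s+\w+/ },
);

sub guess_type_sketch {
    my ($name, $content) = @_;

    # Pick up a trailing ".xyz" suffix, if there is one.
    my ($suffix) = $name =~ /(\.[^.\/]+)$/;
    my @candidates =
        $suffix && $by_suffix{$suffix} ? @{ $by_suffix{$suffix} } : ();

    # Exactly one type registered for the suffix: no need to look inside.
    return $candidates[0] if @candidates == 1;

    # No usable suffix: every type is a candidate (assumption).
    @candidates = sort keys %looks_like unless @candidates;

    # Let the content narrow down the remaining possibilities.
    for my $type (@candidates) {
        return $type if $looks_like{$type}->($content);
    }
    return;
}
```

    With these tables, "program.c" is decided by its suffix alone, while a
    file named "script" is only recognized as Perl if its content starts
    with a matching shebang line.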

    WARNING: THIS MODULE IS UNDER DEVELOPMENT, QUALITY IS ALPHA. IF YOU FIND
    BUGS, OR WANT TO CONTRIBUTE PLUGINS, PLEASE SEND THEM MY WAY.

  FILE TYPES
    Currently, the following plugins are included in the File::Comments
    distribution:

        ########################################################
        # plugin                              type      status #
        ########################################################
          File::Comments::Plugin::C           c         (o)
          File::Comments::Plugin::Makefile    makefile  (X)
          File::Comments::Plugin::Perl        perl      (X)
          File::Comments::Plugin::JavaScript  js        (o)
          File::Comments::Plugin::Java        java      (o)
          File::Comments::Plugin::HTML        html      (X)
          File::Comments::Plugin::Python      python    (o)
          File::Comments::Plugin::PHP         php       (o)

              (X) Fully implemented
              (o) Implemented with regular expressions, only works for
                  easy cases until real parsers are employed.

    The constants listed in the *type* column are the strings returned by
    the "guess_type()" method.

Methods
    $snoop = File::Comments->new()
        Create a new comment extractor engine. This will automatically
        initialize all plugins.

        To avoid cold calls ("Cold Calls"), set "cold_calls" to a false
        value (defaults to 1):

            $snoop = File::Comments->new( cold_calls => 0 );

        By default, if no plugin can be found for a given file,
        "File::Comments" will throw a fatal error and "die()". If this is
        undesirable and a default plugin should be used instead, it can be
        specified in the constructor using the "default_plugin" parameter:

            $snoop = File::Comments->new( 
              default_plugin => "File::Comments::Plugin::Makefile"
            );
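
        Alternatively, a caller can trap the fatal error and carry on. The
        following is a minimal sketch that assumes only that the object
        passed in has a "comments()" method which may "die()". The helper
        name comments_or_empty() is hypothetical and not part of
        File::Comments.

```perl
use strict;
use warnings;

# Return the comment chunks for $file, or a reference to an empty array
# if the extractor dies (for example because no plugin claims the file).
sub comments_or_empty {
    my ($snoop, $file) = @_;

    # eval traps the die() and leaves the error message in $@.
    my $comments = eval { $snoop->comments($file) };

    if (!defined $comments) {
        warn "Skipping $file: $@" if $@;
        return [];
    }
    return $comments;
}
```

        This keeps a batch run over many files going even if some of them
        are not recognized, at the price of one warning per skipped file.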

    $comments = $snoop->comments("program.c");
        Extract all comments from a file. After determining the file type by
        either suffix or content ("Cold Calls"), comments are extracted as
        chunks and returned as a reference to an array.

        To get a single string containing all comments, just join the
        chunks:

            my $comments_string = join '', @$comments;

    $stripped_text = $snoop->stripped("program.c");
        Strip all comments from a file. After determining the file type by
        either suffix or content ("Cold Calls"), all comments are removed
        and the stripped text is returned in a scalar.

    $type = $snoop->guess_type("script.pl")
        Guess the type of a file, based on its suffix or, in the absence of
        a suffix, via "Cold Calls". Return the result as a string: "c",
        "makefile", "perl", etc. ("FILE TYPES").

    $snoop->suffix_registered("c")
        Returns true if one of the plugins has registered the given suffix.

  Writing new plugins
    Writing a new plugin to add functionality to the File::Comments
    framework is as simple as defining a new module, derived from the
    base class of all plugins, "File::Comments::Plugin". Three additional
    methods are needed: "init()", "type()", and "comments()".

    "init()" gets called when the mothership finds the plugin and
    initializes it. This is the time to register extensions that the plugin
    wants to handle.

    The second mandatory method for a plugin is "type()", which returns a
    string, indicating the type of the file examined. Usually this can be
    done without further ado, since a basic plugin will be called only on files
    which it registered for by suffix. Exceptions to this are explained
    later.

    The third method is "comments()", which returns a reference to an array
    of comment lines. The content of the source file to be examined will be
    available in

        $self->{target}->{content}

    by the time "comments()" gets called.

    And that's it. Here's a functional basic plugin, registering a new
    suffix ".odd" with the mothership and expecting files with comment lines
    that start with "ODDCOMMENT":

        ###########################################
        package File::Comments::Plugin::Oddball;
        ###########################################

        use strict;
        use warnings;
        use File::Comments::Plugin;

        our $VERSION = "0.01";
        our @ISA     = qw(File::Comments::Plugin);

        ###########################################
        sub init {
        ###########################################
            my($self) = @_;
    
            $self->register_suffix(".odd");
        }

        ###########################################
        sub type {
        ###########################################
            my($self) = @_;
    
            return "odd";
        }

        ###########################################
        sub comments {
        ###########################################
            my($self) = @_;
    
            # Extract all comments from $self->{target}->{content}.
            # /m anchors ^ at every line start, /g returns every match.
            my @comments = ($self->{target}->{content} =~ /^ODDCOMMENT:(.*)/mg);
            return \@comments;
        }

        1;
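
    When extracting comments with a regular expression over the whole file
    content, keep in mind that a match in list context only returns every
    capture if the "/g" modifier is set, and "^" only matches at the start
    of each line if "/m" is set. A small standalone sketch, run on the
    hypothetical content of an ".odd" file:

```perl
use strict;
use warnings;

# Hypothetical content of a ".odd" file.
my $content = <<'EOT';
ODDCOMMENT: first comment
some payload
ODDCOMMENT: second comment
more payload
EOT

# /m lets ^ match at the start of every line, /g collects every match.
my @comments = $content =~ /^ODDCOMMENT:(.*)/mg;

print "[$_]\n" for @comments;
# prints:
# [ first comment]
# [ second comment]
```

    Since "." does not match a newline by default, each captured chunk
    ends at the end of its line.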

  Cold Calls
    If a file doesn't have an extension, or has an extension that's served
    by multiple plugins, File::Comments will shop around and ask all plugins
    if they want to handle the file. The mothership calls each plugin's
    "applicable()" method, passing it an object of type
    "File::Comments::Target", whose fields describe the file (its "content"
    field, for example, holds the file's text).

    When the plugin gets such a *cold call* (indicated by the third
    parameter to "applicable()"), it can either accept or deny the request.
    To arrive at a decision, it can peek into the target object. The Perl
    plugin illustrates this:

        ###########################################
        sub applicable {
        ###########################################
            my($self, $target, $cold_call) = @_;
    
            return 1 unless $cold_call;
    
            return 1 if $target->{content} =~ /^#!.*perl\b/;

            return 0;
        }

    If a plugin does not define an "applicable()" method, a default method is
    inherited from the base class "File::Comments::Plugin", which looks like
    this:

        ###########################################
        sub applicable {
        ###########################################
            my($self, $target, $cold_call) = @_;

            return 0 if $cold_call;
            return 1;
        }

    This will deny all *cold calls* and only accept requests for files with
    suffixes or base names the plugin has already signed up for.
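
    To illustrate how the two "applicable()" variants interact, here is a
    self-contained sketch of a cold-call round. The "Sketch::" packages and
    the cold_call_takers() function are hypothetical stand-ins, and the
    target is a plain hash instead of a "File::Comments::Target" object.
    The real mothership may well work differently.

```perl
use strict;
use warnings;

package Sketch::Plugin;          # stand-in for the plugin base class
sub new { my ($class) = @_; return bless {}, $class }
sub applicable {                 # default: deny all cold calls
    my ($self, $target, $cold_call) = @_;
    return 0 if $cold_call;
    return 1;
}

package Sketch::Plugin::Perl;    # peeks into the content on cold calls
our @ISA = ('Sketch::Plugin');
sub type { return 'perl' }
sub applicable {
    my ($self, $target, $cold_call) = @_;
    return 1 unless $cold_call;
    return 1 if $target->{content} =~ /^#!.*perl\b/;
    return 0;
}

package Sketch::Plugin::C;       # inherits the default applicable()
our @ISA = ('Sketch::Plugin');
sub type { return 'c' }

package main;

# Ask every plugin in turn and return those accepting the cold call.
sub cold_call_takers {
    my ($target, @plugins) = @_;
    return grep { $_->applicable($target, 1) } @plugins;
}

my @plugins = (Sketch::Plugin::C->new, Sketch::Plugin::Perl->new);
my $target  = { content => "#!/usr/bin/perl\nprint 1;\n" };
my @takers  = cold_call_takers($target, @plugins);

print join(',', map { $_->type } @takers), "\n";    # prints "perl"
```

    Only the plugin that overrides "applicable()" volunteers for the file;
    the other one falls back on the inherited default and declines.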

  Plugin Inheritance
    Plugins can reuse existing plugins by inheritance. For example, if you
    want to write a *catch-all* plugin that takes over all cold calls and
    handles comments like the "Makefile" plugin, you can simply use

        ###########################################
        package File::Comments::Plugin::Catchall;
        ###########################################

        use strict;
        use warnings;
        use File::Comments::Plugin;
        use File::Comments::Plugin::Makefile;

        our $VERSION = "0.01";
        our @ISA     = qw(File::Comments::Plugin::Makefile);

        ###########################################
        sub applicable {
        ###########################################
            my($self) = @_;
    
            return 1;
        }

        1;

    "File::Comments::Plugin::Catchall" just implements "applicable()" and
    inherits everything else from "File::Comments::Plugin::Makefile".

LEGALESE
    Copyright 2005 by Mike Schilli, all rights reserved. This program is
    free software, you can redistribute it and/or modify it under the same
    terms as Perl itself.

AUTHOR
    2005, Mike Schilli <cpan@perlmeister.com>