#!perl -w

# Copyright 2010, 2011, 2012, 2013, 2016, 2017 Kevin Ryde

# This file is part of PodLinkCheck.

# PodLinkCheck is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 3, or (at your option) any later
# version.
#
# PodLinkCheck is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License along
# with PodLinkCheck.  If not, see <http://www.gnu.org/licenses/>.

use 5.006;
use strict;
use warnings;
use App::PodLinkCheck;

use vars '$VERSION';
$VERSION = 15;

my $plc = App::PodLinkCheck->new;
exit $plc->command_line;

__END__

=for stopwords podlinkcheck Ryde subdirs cpan Manpage manpage whitespace eg mis-interpreted SQLite bsearch lookups recognises

=head1 NAME

podlinkcheck -- check Perl pod LE<lt>E<gt> link references

=head1 SYNOPSIS

 podlinkcheck [--options] file-or-dir...

=head1 OPTIONS

The command line options are

=over 4

=item --help

Print a command line summary.

=item -I dir

Add an extra directory in which to look for target modules.

=item --verbose

Print more about program operation (including CPAN loading).

=item --version

Print the program version number and exit.

=back

=head1 DESCRIPTION

PodLinkCheck parses Perl POD from a script, module or documentation and
checks that C<LE<lt>E<gt>> links within it refer to a known program, module,
or man page.

=for ProhibitVerbatimMarkup allow next

    L<foo>          check module, pod or program "foo"
    L<foo/section>    and check section within the pod
    L<bar(1)>       check man page "bar(1)"

The command line is either individual files or whole directories.  For a
directory all the F<.pl>, F<.pm> and F<.pod> files under it are checked.  So
for example to churn through all installed add-on modules,

    podlinkcheck /usr/share/perl5

Bad links are usually typos in the module name or section name, or sometimes
C<LE<lt>display|targetE<gt>> parts the wrong way around.  Occasionally there
may be an C<LE<lt>fooE<gt>> used where just markup C<CE<lt>E<gt>> or
C<IE<lt>E<gt>> was intended.

=head2 Checks

External links are checked by seeking the target F<.pm> module or F<.pod>
documentation in the C<@INC> path (per L<Pod::Find>), or seeking a script
(no file extension) in the usual executable C<PATH>.  A section name in a
link is checked by parsing the POD in the target file.
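
As an illustration only, a search of that kind might look like the following
sketch.  The C<find_in_inc()> name is made up for the example and is not part
of C<App::PodLinkCheck>, and the real search covers more cases (scripts in
C<PATH>, for instance).

    use strict;
    use warnings;
    use File::Spec;

    # Hypothetical helper: return the first .pod or .pm file for
    # $module found under the given directories, or undef if none.
    sub find_in_inc {
      my ($module, @dirs) = @_;
      my @parts = split /::/, $module;
      foreach my $dir (@dirs) {
        next if ref $dir;    # skip code hooks in @INC
        foreach my $ext ('.pod', '.pm') {
          my $filename = File::Spec->catfile($dir, @parts) . $ext;
          return $filename if -f $filename;
        }
      }
      return undef;
    }

    my $filename = find_in_inc('File::Spec', @INC);
    print defined $filename ? "found $filename\n" : "not found\n";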

If a module is not installed in C<@INC> or extra C<-I> directories then its
existence is also checked in the CPAN indexes with C<App::cpanminus>,
C<CPAN::SQLite>, C<CPAN> or C<CPANPLUS>.  Nothing is downloaded, just
current data consulted.  A warning is given if a section name in a link goes
unchecked because it's on CPAN but not available locally.

If checking your own work then most likely you will have copies of
cross-referenced modules installed (having compared or tried them).  In that
sense the CPAN index lookups are a fallback.

Manpage links are checked by asking the C<man> program if it recognises the
name, including any number part like C<chmod(2)>.  A manpage can also
satisfy what otherwise appears to be a POD link with no sub-section, since
there's often some confusion between the two.
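
A rough sketch of that manpage check follows.  Both function names are
hypothetical, and it assumes a C<man> which accepts C<-w> to print the
location of a page and exits non-zero when there is none (as man-db does).
How C<podlinkcheck> itself runs C<man> may differ.

    use strict;
    use warnings;

    # Hypothetical helper: split "chmod(2)" into ("chmod", "2").
    # Returns an empty list if $target isn't in name(section) form.
    sub parse_manpage {
      my ($target) = @_;
      if ($target =~ /^([^\s()]+)\(([0-9][A-Za-z0-9]*)\)$/) {
        return ($1, $2);
      }
      return;
    }

    # Hypothetical helper: true if "man -w SECTION NAME" finds a page.
    # Assumes a man program which accepts -w.
    sub man_recognises {
      my ($name, $section) = @_;
      open my $fh, '-|', 'man', '-w', $section, $name
        or return 0;
      my @locations = <$fh>;
      return close($fh) ? 1 : 0;
    }

    my ($name, $section) = parse_manpage('chmod(2)');
    print "name=$name section=$section\n";
    print man_recognises($name, $section) ? "found\n" : "not found\n";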

=head2 Internal Links

Internal links are sometimes written

    L<SYNOPSIS>                     # may be ambiguous

but the Perl 5.10 C<perlpodspec> advice is to avoid ambiguity between an
external module and a one-word internal section by writing a section with /
or quotes,

=for ProhibitVerbatimMarkup allow next 2

    See L</SYNOPSIS> above.         # good

    See L<"SYNOPSIS"> above.        # good

C<podlinkcheck> warns about C<LE<lt>SYNOPSISE<gt>> section links.  But not
if it's both a valid external module and internal section -- because it's
not uncommon to have a module name as a heading or item and an
C<LE<lt>E<gt>> link still meaning the external one.

=head2 Section Name Matching

An C<LE<lt>E<gt>> section name can use just the first word of an item or
heading.  This is how C<Pod::Checker> behaves and it's good for C<perlfunc>
cross references where just the function name can be given without the full
argument list of the C<=item>.  Eg.

=for ProhibitVerbatimMarkup allow next

    L<perlfunc/split>

The first word is everything up to the first whitespace.  This doesn't come
out very well on a target like C<=item somefun( ARG )>, but it's how
C<Pod::Checker> 1.45 behaves.  If the targets are your own then you might
make the first word or full item something sensible to appear in an
C<LE<lt>E<gt>>.
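
The first-word rule can be sketched in a line of Perl.  C<first_word()> here
is a hypothetical name for illustration, not a function from C<Pod::Checker>
or C<App::PodLinkCheck>.

    use strict;
    use warnings;

    # Hypothetical helper: the first word of an =item or =head text,
    # meaning everything up to the first whitespace.
    sub first_word {
      my ($text) = @_;
      my ($word) = ($text =~ /^\s*(\S+)/);
      return $word;
    }

    foreach my $item ('split /PATTERN/,EXPR,LIMIT', 'somefun( ARG )') {
      printf "%-28s => %s\n", $item, first_word($item);
    }

The second item gives C<somefun(> as its first word, which is the awkward
case described above.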

If a target section is not found then C<podlinkcheck> will try to suggest
something close, eg. differing only in punctuation or upper/lower case.
Some of the POD translators may ignore upper/lower case anyway, but it's
good to write an C<LE<lt>E<gt>> the same as the actual target.

    foo.pl:130:31: no section "constructor" in "CHI"
      (file /usr/share/perl5/CHI.pm)
      perhaps it should be "CONSTRUCTOR"
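
A suggestion of that sort can be had by comparing names with case and
punctuation stripped, as in the following sketch.  The function names are
hypothetical and the actual matching in C<podlinkcheck> may be more
elaborate.

    use strict;
    use warnings;

    # Hypothetical helper: reduce a section name to lower case
    # letters and digits only, for loose comparison.
    sub loose_key {
      my ($name) = @_;
      my $key = lc $name;
      $key =~ s/[^a-z0-9]+//g;
      return $key;
    }

    # Hypothetical helper: return the known sections whose loose key
    # is the same as that of the wanted $section.
    sub suggest_sections {
      my ($section, @known) = @_;
      my $want = loose_key($section);
      return grep { loose_key($_) eq $want } @known;
    }

    my @known = ('NAME', 'CONSTRUCTOR', 'Get/Set Methods');
    print join(', ', suggest_sections('constructor', @known)), "\n";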

For reference, numbered C<=item> section names go in an C<LE<lt>E<gt>>
without the number.  This is good since the numbering might change.  If
C<podlinkcheck> suggests a number in a target then it may be a mistake in
the target document.  A numbered item should have the number alone on the
C<=item> and the section name as the next paragraph.

    =item 1.                        # good

    The First Thing                 # the section name

    Paragraph about this thing.

    =item 2. The Second Thing       # bad

    Paragraph about this next thing.

The second item "2. The Second Thing" is not a numbered item for POD
purposes, but rather text that happens to start with a number.  Of course
sometimes that's what you want, eg.

    =item 64 Bit Support

C<podlinkcheck> uses C<Pod::Simple> for parsing and so follows its
interpretation of the various hairy C<LE<lt>E<gt>> link forms.  If an
C<LE<lt>E<gt>> appears to be mis-interpreted you might rewrite or add some
escapes (like EE<lt>solE<gt>) for the benefit of all translators which use
C<Pod::Simple>.  In Perl 5.10 that includes the basic C<pod2man>.
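
To see how C<Pod::Simple> interprets a given link, a small subclass can
print the parts it reports.  The following is a sketch only: the class name
C<My::LinkLister> and the C<my_links> field are invented for the example,
and it relies on the C<_handle_element_start()> interface and the C<type>,
C<to> and C<section> attributes described in L<Pod::Simple::Subclassing>.

=for ProhibitVerbatimMarkup allow next 4

    package My::LinkLister;         # hypothetical class name
    use strict;
    use warnings;
    use base 'Pod::Simple';

    # Record the type, target and section of each link element.
    sub _handle_element_start {
      my ($self, $element, $attrs) = @_;
      return unless $element eq 'L';
      my %link = (type => $attrs->{'type'});
      foreach my $key ('to', 'section') {
        # stringify, as these can be objects rather than plain strings
        $link{$key} = (defined $attrs->{$key} ? "$attrs->{$key}" : '');
      }
      push @{$self->{'my_links'}}, \%link;
      return;
    }

    # Nothing to do for element ends or text in this sketch.
    sub _handle_element_end { return }
    sub _handle_text { return }

    package main;
    my $parser = My::LinkLister->new;
    $parser->parse_string_document
      ("=pod\n\nSee L<Foo::Bar/section>, L</SYNOPSIS> and L<chmod(2)>.\n");
    foreach my $link (@{$parser->{'my_links'} || []}) {
      print "type=$link->{'type'} to=$link->{'to'}",
        " section=$link->{'section'}\n";
    }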

=head2 Other Ways to Do It

C<podchecker> (the C<Pod::Checker> module) checks internal links, but it
doesn't check external links.

C<Test::Pod::LinkCheck> is similar in a F<.t> test framework.  It uses some
of PodLinkCheck but has different reporting and a stricter approach to
dubious POD.

=head1 EXIT STATUS

Exit is 0 for no problems found, or non-zero for problems.

=head1 ENVIRONMENT VARIABLES

=over 4

=item C<PATH>

The search path for installed scripts.

=item C<HOME>

Used by the various C<CPAN> modules for C<~/.cpan> etc directories.

=item C<PERL5LIB>

The usual extra Perl module directories (see L<perlrun/ENVIRONMENT>), which
become C<@INC> where link targets are sought.

=back

=head1 BUGS

C<App::cpanminus> is checked first since it's a bsearch of
F<02packages.details.txt>, and C<CPAN::SQLite> second since it's a database
lookup.  But if a target is not found there then the full C<CPAN> and
C<CPANPLUS> caches are loaded and checked.  This might use a fair bit of
memory for a non-existent target, but it's also possible they're more
up-to-date.

No attempt is made to tell which of the indexes is the most up-to-date.  If
a module has been renamed (bad) then it may still exist in an old index.
The suggestion is to avoid having old stuff lying around (including old
mirror files in C<App::cpanminus>).

The code consulting C<CPAN.pm> may need a tolerably new version of that
module, maybe 1.61 circa Perl 5.8.0.  On earlier versions its index is not
used.

The line:column number reported for an offending C<LE<lt>E<gt>> is found by
some gambits extending what C<Pod::Simple> normally records.  There's a
chance it could be a little off within the paragraph.

C<Pod::Simple> prior to version 3.24 didn't allow dots "." in man-page
names, resulting in for example L<login.conf(5)> being treated as a Perl
module name not a man page name.  If you have such links then use
C<Pod::Simple> 3.24 or later.

Directories are currently traversed using L<File::Find::Iterator>.  It
follows symlinks but neither its version 0.4 nor PodLinkCheck guard against
infinite descent into symlink cycles.  The intention perhaps would be to
follow all symlinks to files, but follow to a directory just once as
protection against cycles.
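
The once-per-directory idea could be sketched as below.  This is not the
current PodLinkCheck behaviour; C<files_under()> is a hypothetical name, and
it assumes a filesystem where C<stat()> gives distinct device and inode
numbers for distinct directories.

    use strict;
    use warnings;
    use File::Spec;

    # Hypothetical traversal: return the plain files under $top,
    # following symlinks but entering each directory only once.
    sub files_under {
      my ($top) = @_;
      my %seen;
      my @pending = ($top);
      my @files;
      while (defined (my $dir = shift @pending)) {
        my ($dev, $ino) = stat($dir) or next;
        next if $seen{"$dev,$ino"}++;     # directory already visited
        opendir(my $dh, $dir) or next;
        foreach my $name (sort readdir $dh) {
          next if $name eq '.' || $name eq '..';
          my $path = File::Spec->catfile($dir, $name);
          if (-d $path) {                 # -d follows symlinks
            push @pending, $path;
          } elsif (-f _) {
            push @files, $path;
          }
        }
        closedir $dh;
      }
      return @files;
    }

    print "$_\n" foreach files_under('lib');

A file reachable through more than one symlink can still be listed more
than once under different names, which a fuller version might want to
weed out.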

=head1 FILES

F<~/.cpanm/sources/*/02packages.details.txt> files from C<App::cpanminus>

F<~/.cpan/cpandb.sql> used by C<CPAN::SQLite>

F<~/.cpan/Metadata> used by C<CPAN>

F<~/.cpanplus/*> variously used by C<CPANPLUS>

=head1 SEE ALSO

L<podchecker>, L<podlint>

L<Pod::Simple>, L<Pod::Find>, L<CPAN>, L<CPAN::SQLite>,
L<CPANPLUS>, L<cpanm>

L<Test::Pod::LinkCheck>, L<Pod::Checker>, L<Test::Pod>

=head1 HOME PAGE

http://user42.tuxfamily.org/podlinkcheck/index.html

=head1 LICENSE

Copyright 2010, 2011, 2012, 2013, 2016, 2017 Kevin Ryde

PodLinkCheck is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

PodLinkCheck is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
more details.

You should have received a copy of the GNU General Public License along with
PodLinkCheck.  If not, see <http://www.gnu.org/licenses/>.

=cut