NAME

chklinks - A non-threaded Perl link checker

SYNOPSIS

   chklinks [options] URL1 [URL2 [URL3 ...]]

DESCRIPTION

chklinks is a non-threaded Perl link checker. It helps you find broken links on your website.

chklinks differs from linkchecker in that chklinks is non-threaded. It does not open many simultaneous connections to do its job, so it will not exhaust your system's resources and crash it. This is more desirable for most webmasters and users.

chklinks follows robots.txt rules. If you disallow robots from your website and experience problems, you need to allow chklinks. Add the following lines to your robots.txt file to allow chklinks:

  User-agent: chklinks
  Disallow:
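
To confirm that a robots.txt file really admits chklinks, you can feed it to WWW::RobotRules(3), the rule parser listed under SEE ALSO. The following is a minimal sketch, not chklinks' own code: the helper name robots_allow, the agent version strings and the example.com URLs are assumptions made for illustration.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical helper: does this robots.txt text let $agent fetch $url?
sub robots_allow {
    my ($agent, $robots_txt, $url) = @_;
    my $rules = WWW::RobotRules->new($agent);
    # The first argument is the URL the robots.txt would have been fetched
    # from; it tells the parser which host the rules belong to.
    $rules->parse('http://www.example.com/robots.txt', $robots_txt);
    return $rules->allowed($url) > 0 ? 1 : 0;
}

# A site that blocks robots in general but admits chklinks.
my $robots_txt = <<'EOT';
User-agent: *
Disallow: /

User-agent: chklinks
Disallow:
EOT

for my $agent ('chklinks/1.0', 'otherbot/1.0') {
    my $ok = robots_allow($agent, $robots_txt, 'http://www.example.com/docs/');
    printf "%-14s %s\n", $agent, $ok ? 'allowed' : 'blocked';
}
```

The empty Disallow: line in the chklinks record is what grants access; the catch-all record still blocks every other robot.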

chklinks uses LWP::RobotUA(3) and supports the following schemes: http, https, ftp, gopher and file. You can also specify a local file. (To use https, you need to install Crypt::SSLeay(3). This is a requirement of LWP::RobotUA(3).)

chklinks supports cookies.
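
For readers unfamiliar with LWP::RobotUA(3), here is a rough sketch of checking a single link with it, with cookies enabled. It only illustrates the library that chklinks builds on and is not how chklinks itself is written; the agent name, the contact address and the helper names make_agent and link_status are placeholders.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Build a robot user agent. LWP::RobotUA requires an agent name and a
# contact address; both values here are placeholders.
sub make_agent {
    my $ua = LWP::RobotUA->new('chklinks-sketch/0.1', 'webmaster@example.com');
    $ua->delay(1 / 60);     # pause one second between requests to a host
                            # (the unit of delay() is minutes)
    $ua->cookie_jar({});    # keep cookies in memory for this run
    return $ua;
}

# Return the numeric status code for one link. A HEAD request avoids
# downloading the body.
sub link_status {
    my ($ua, $url) = @_;
    return $ua->head($url)->code;
}

my $ua = make_agent();
for my $url (@ARGV) {
    printf "%s %s\n", link_status($ua, $url), $url;
}
```

Saved as a script, this prints one status line per URL given on the command line. A URL that robots.txt forbids is not fetched at all; LWP::RobotUA answers with a 403 response of its own.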

OPTIONS

-1,--onelevel

Check the links on this page and stop.

-r,--recursive

Recursively check through this website. This is the default.

-b,--below

Only check the links below this directory. This is the default.

-p,--parent

Trace back to the parent directories.

-l,--local

Only check the links on this same host.

-s,--span

Check the links to other hosts (without recursion). This is the default.

-e,--exclude path

Exclude this path. Pages under it are checked for existence, but the links on them are not checked, as if they were on a foreign site. Multiple --exclude options are allowed.

-i,--include path

Include this path. The opposite of --exclude; it cancels its effect. When both match, the one specified later takes priority.

-d,--debug

Display debug messages. Specify --debug multiple times for more detail.

-q,--quiet

Disable debug messages. The opposite of --debug; it cancels its effect.

-h,--help

Display the help message and exit.

-v,--version

Output version information and exit.

URL1, URL2, URL3

The URLs of the websites to check.
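
The way repeated and opposing options combine can be sketched with the core Getopt::Long module. This is an illustration under stated assumptions rather than chklinks' actual option handling: the helpers parse_args and should_recurse are hypothetical, --include and --exclude are assumed to match by path prefix, and --quiet is assumed to reset the debug level to zero.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a chklinks-style argument list (hypothetical helper). Returns the
# debug level, the include/exclude rules in command-line order, and the
# remaining arguments, which are the URLs.
sub parse_args {
    my @args  = @_;
    my $debug = 0;
    my @rules;
    my $remember = sub {
        my ($name, $path) = @_;
        push @rules, [ "$name", $path ];    # "$name" is 'exclude' or 'include'
    };
    GetOptionsFromArray(
        \@args,
        'debug|d+'    => \$debug,              # each --debug adds one level
        'quiet|q'     => sub { $debug = 0 },   # --quiet cancels earlier --debug
        'exclude|e=s' => $remember,
        'include|i=s' => $remember,
    ) or die "Cannot parse options\n";
    return ($debug, \@rules, \@args);
}

# Decide whether to follow the links found on $path. Assumes rules match
# by path prefix; the last matching rule wins.
sub should_recurse {
    my ($path, @rules) = @_;
    my $recurse = 1;    # no matching rule: recurse as usual
    for my $rule (@rules) {
        my ($kind, $prefix) = @$rule;
        next unless index($path, $prefix) == 0;
        $recurse = $kind eq 'include' ? 1 : 0;
    }
    return $recurse;
}

my ($debug, $rules, $urls) = parse_args(
    qw(--debug --debug --exclude /archive/ --include /archive/2007/),
    'http://www.example.com/',
);
print "debug level: $debug\n";
for my $path (qw(/index.html /archive/2005/a.html /archive/2007/b.html)) {
    printf "%-22s %s\n", $path,
        should_recurse($path, @$rules) ? 'recurse' : 'check only';
}
```

Keeping the include and exclude rules in one ordered list is what lets a later option override an earlier one without any special casing.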

NOTES

chklinks does not obey Crawl-delay: in robots.txt yet. This is a problem in WWW::RobotRules(3), but not chklinks itself.

You may encounter warnings like this:

    Parsing of undecoded UTF-8 will give garbage when decoding
    entities at /usr/share/perl5/LWP/Protocol.pm line 114.

This is an LWP::Protocol(3) issue that appears when working with HTML::Parser(3) version 3.40 or later. See CPAN RT Bug#20274 http://rt.cpan.org/Public/Bug/Display.html?id=20274 for an LWP::Protocol(3) patch.

BUGS

chklinks does not support authentication yet. W3C-LinkChecker supports it. As a workaround, you can use the syntax http://user:pass@some.where.com/some/path for Basic Authentication, but this does not work with Digest Authentication. This practice is not encouraged: your password would be visible to anyone on the system using ps, including hidden intruders, and what you type in your shell is saved to your shell history file.

mailto: URLs should be supported by checking the validity of their DNS/MX records. Bastian Kleineidam's linkchecker(1) supports this.
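
One possible shape for such a check is sketched below. chklinks does not implement this; the helper names mailto_domains and has_mx are hypothetical, the URL parsing is deliberately simplified, and the MX lookup assumes that the non-core Net::DNS module is installed and that the network is reachable.

```perl
use strict;
use warnings;

# Extract the distinct mail domains from a mailto: URL. Simplified: it
# ignores percent-encoding and any addresses given in a ?to= header field.
sub mailto_domains {
    my ($url) = @_;
    my ($addresses) = $url =~ /^mailto:([^?]*)/i;
    return () unless defined $addresses;
    my %seen;
    return grep { !$seen{$_}++ }
           map  { /\@([^\@\s]+)$/ ? lc $1 : () }
           split /\s*,\s*/, $addresses;
}

# True if the domain publishes at least one MX record. Net::DNS is loaded
# lazily because it is not a core module; the lookup needs network access.
sub has_mx {
    my ($domain) = @_;
    require Net::DNS;
    my @mx = Net::DNS::mx($domain);
    return scalar @mx;
}

for my $url (@ARGV) {
    for my $domain (mailto_domains($url)) {
        printf "%s %s\n", $domain, has_mx($domain) ? 'has MX' : 'no MX found';
    }
}
```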

Local file checking has only been tested on Unix and MSWin32. More platforms should be tested, especially VMS and Mac.

SUPPORT

chklinks is hosted on SourceForge, CPAN and Tavern IMACAT's. For the latest information, see http://chklinks.sourceforge.net/ , http://sourceforge.net/projects/chklinks/ , http://search.cpan.org/dist/chklinks/ or http://www.imacat.idv.tw/tech/chklinks.html .

chklinks has a user discussion mailing list hosted at SourceForge: chklinks-users@lists.sourceforge.net . It is a mailman mailing list. For information on how to join or leave, see: http://lists.sourceforge.net/lists/listinfo/chklinks-users . Alternatively, you can send a message to chklinks-users-request@lists.sourceforge.net with the subject help for a list of available commands.

SEE ALSO

LWP::UserAgent(3), LWP::RobotUA(3), WWW::RobotRules(3), URI(3), HTML::LinkExtor(3), Bastian Kleineidam's linkchecker(1), W3C-LinkChecker checklink(1).

AUTHOR

imacat <imacat@mail.imacat.idv.tw>.

COPYRIGHT

Copyright (c) 2003-2007 imacat.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.