linktractor - extract links from HTML
% linktractor fileA.html fileB.html
% linktractor -f=http://www.perl.com
% lwp-request http://www.example.com | linktractor
% lwp-request http://www.example.com | linktractor -b=http://www.example.com
This is a small script that uses HTML::SimpleLinkExtor to pull all of the links out of the input HTML. It reads input from the files you name on the command line, from standard input if you name none, or from a URL that it fetches itself.
The -b switch sets the base URL used to resolve relative URLs in the input.
The -f switch names a URL to fetch and use as input, instead of reading from files specified on the command line or from standard input.
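As a rough illustration of what the script does with its input, here is a minimal sketch that calls HTML::SimpleLinkExtor directly. It is not the linktractor source. The HTML snippet and the base URL are made up for the example, and it assumes the module is installed from CPAN. Passing a base URL to new() stands in for the -b switch, so relative links should come back resolved against it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::SimpleLinkExtor;   # CPAN module, not part of the Perl core

# Hypothetical input, invented for this example
my $html = <<'HTML';
<html><body>
<a href="/about.html">About</a>
<a href="http://www.perl.com/">Perl</a>
<img src="images/logo.png">
</body></html>
HTML

# The base URL plays the role of linktractor's -b switch (assumed value)
my $base  = 'http://www.example.com/';
my $extor = HTML::SimpleLinkExtor->new($base);

# parse() takes an HTML string; parse_file() would take a filename instead
$extor->parse($html);

# links() returns every link found, whatever tag it came from
my @links = $extor->links;
print "$_\n" for @links;
```

To read from a file rather than a string, replace the parse() call with $extor->parse_file($filename).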
This source is part of a SourceForge project which always has the latest sources in CVS, as well as all of the previous releases.
http://sourceforge.net/projects/brian-d-foy/
If, for some reason, I disappear from the world, one of the other members of the project can shepherd this module appropriately.
brian d foy, <bdfoy@cpan.org>
Copyright (c) 2007 brian d foy. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
You may use HTML::SimpleLinkExtor under the same terms as Perl itself.