NAME

Dezi::Bot - web crawler

SYNOPSIS

 use Dezi::Bot;

 my $bot = Dezi::Bot->new(
 
    # give your bot a name
    name => 'dezibot',  
    
    # explicit spider object; when given, it is used
    # instead of spider_class + spider_config
    spider => $spider_object,
     
    # every crawled URI
    # passed to the $handler->handle() method
    handler_class => 'Dezi::Bot::Handler',
    
    # default
    spider_class => 'Dezi::Bot::Spider',
    
    # passed to spider_class->new()
    spider_config   => {
        agent      => 'dezibot ' . $Dezi::Bot::VERSION,
        email      => 'bot@dezi.org',
        max_depth  => 4,
    },
    
    # default
    cache_class => 'Dezi::Bot::Cache',
    
    # passed to cache_class->new()
    cache_config => {
        driver      => 'File',
        root_dir    => '/tmp/dezibot',
    },
    
    # default
    queue_class => 'Dezi::Bot::Queue',
    
    # passed to queue_class->new()
    queue_config => {
        type     => 'DBI',
        dsn      => "DBI:mysql:database=dezibot;host=localhost;port=3306",
        username => 'myuser',
        password => 'mysecret',
    },
 );
 
 $bot->crawl('http://dezi.org');

DESCRIPTION

The Dezi::Bot module is a web crawler optimized for parallel use across multiple hosts.

METHODS

init( args )

Overrides the base method to set default options based on args. See the SYNOPSIS.

Options:

name
spider
handler_class
handler_config
spider_class
spider_config
cache_class
cache_config
queue_class
queue_config
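
The handler_class option names the class whose handle() method receives every crawled URI (see the SYNOPSIS). The following is a minimal sketch of what such a class might look like, not the actual Dezi::Bot::Handler implementation. The package name My::CountingHandler, the count() method, the constructor arguments, and the handle( $bot, $uri ) argument order are all assumptions made for illustration; check the Dezi::Bot::Handler documentation for the real interface before relying on them.

```perl
package My::CountingHandler;    # hypothetical class name
use strict;
use warnings;

# Hypothetical constructor: accepts handler_config-style key/value pairs.
sub new {
    my ( $class, %config ) = @_;
    return bless { %config, seen => {} }, $class;
}

# Assumed signature: handle( $bot, $uri ).
# Records each distinct URI and returns it.
sub handle {
    my ( $self, $bot, $uri ) = @_;
    $self->{seen}{$uri}++;
    return $uri;
}

# Number of distinct URIs handled so far.
sub count {
    my ($self) = @_;
    return scalar keys %{ $self->{seen} };
}

package main;

# Exercise the handler without a real bot (undef stands in for $bot).
my $handler = My::CountingHandler->new;
$handler->handle( undef, $_ )
    for qw( http://dezi.org http://dezi.org/docs http://dezi.org );

print $handler->count, "\n";    # prints 2
```

If the interface matches these assumptions, such a class could be plugged in with handler_class => 'My::CountingHandler' in the constructor call.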

crawl( urls )

Calls $bot->spider->crawl() for the array of URLs passed in urls.

Returns the total number of URIs crawled.

AUTHOR

Peter Karman, <karman at cpan.org>

BUGS

Please report any bugs or feature requests to bug-dezi-bot at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Dezi-Bot. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Dezi::Bot

You can also look for information in the RT queue listed under BUGS.

COPYRIGHT & LICENSE

Copyright 2013 Peter Karman.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.