
NAME

dezibot - parallel web crawler


SYNOPSIS

 # crawl 2 sites
 % dezibot url1 url2
 # crawl a list of sites
 % dezibot --urls file_with_urls
 # pass in stored config
 % dezibot --config file
 # crawl in parallel
 % dezibot --workers 5 --urls file_with_urls


DESCRIPTION

dezibot is a command line tool wrapping the Dezi::Bot module.

dezibot can:

 * crawl one or more sites
 * read a list of URLs from a file
 * load its configuration from a file
 * crawl in parallel with multiple workers


OPTIONS

The following options are supported.


Print this message.


Spew lots of information to stderr. Overrides any setting in --config.


Print some status information to stderr. Overrides any setting in --config.

--config file

Read config from file using Config::Any. The parsed config is passed directly to Dezi::Bot->new().
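Config::Any chooses a parser based on the file extension, so several formats work; a Perl-style config file (one that returns a hash reference) is among them. The keys shown below are purely illustrative assumptions, not documented options — consult the Dezi::Bot documentation for the parameters its new() actually accepts:

```perl
# dezibot.config.pl -- example config loaded via Config::Any's Perl backend.
# NOTE: every key here is a hypothetical placeholder, not a verified
# Dezi::Bot->new() parameter.
{
    debug   => 0,
    verbose => 1,
}
```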

--urls file

Read URLs to crawl from file. Lines starting with whitespace or # are ignored.
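The skip rule above can be sketched in a few lines of Perl. This is a minimal illustration of the documented behavior (lines beginning with whitespace or # are ignored), not dezibot's actual parsing code; skipping empty lines as well is an added assumption:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Filter lines the way the --urls option is described:
# ignore any line that starts with whitespace or '#'.
sub read_urls {
    my @urls;
    for my $line (@_) {
        chomp $line;
        next if $line =~ /^(\s|#)/;  # indented or commented: ignored
        next unless length $line;    # skip empty lines (an assumption)
        push @urls, $line;
    }
    return @urls;
}

my @urls = read_urls(
    "http://example.com/\n",
    "# a comment\n",
    "  indented, ignored\n",
    "http://example.org/\n",
);
print join( "\n", @urls ), "\n";
```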

--workers n

Spawn n workers to crawl in parallel. The default is to crawl serially. If n is less than the number of URLs, the list of URLs will be sliced and apportioned among the n workers according to --pool_size.

--pool_size n

The maximum number of URLs per worker. The default is the number of URLs divided by the number of workers, but you might want to set n lower in order to minimize wait time between crawls.
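The slicing described by --workers and --pool_size can be sketched as plain list chunking. This is a hedged illustration of the idea, not dezibot's internals: with a pool size p, each worker receives at most p URLs, and p defaults to the URL count divided by the worker count (rounded up):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX qw(ceil);

# Slice a URL list into per-worker chunks of at most $pool_size URLs.
# When $pool_size is not given, default to ceil(@urls / $workers),
# mirroring the documented default of dividing URLs among workers.
sub apportion {
    my ( $urls, $workers, $pool_size ) = @_;
    $pool_size ||= ceil( scalar(@$urls) / $workers );
    my @queue = @$urls;
    my @chunks;
    while (@queue) {
        push @chunks, [ splice @queue, 0, $pool_size ];
    }
    return @chunks;
}

# 5 URLs across 2 workers: pool size ceil(5/2) = 3, so chunks of 3 and 2.
my @chunks = apportion( [ map { "url$_" } 1 .. 5 ], 2 );
print scalar(@chunks), "\n";    # prints 2
```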


AUTHOR

Peter Karman, <karman at>


BUGS

Please report any bugs or feature requests to bug-dezi-bot at, or through the web interface. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.


SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Dezi::Bot

You can also look for information at:


COPYRIGHT AND LICENSE

Copyright 2013 Peter Karman.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See for more information.
