WWW::Crawler::Mojo::ScraperUtil - Scraper utilities
This class inherits from Mojo::UserAgent and overrides the start method to store user info.
WWW::Crawler::Mojo::ScraperUtil implements the following attributes.
WWW::Crawler::Mojo::ScraperUtil implements the following methods.
Collects URLs out of CSS.
@urls = collect_urls_css($dom);
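As a rough illustration of what collecting URLs out of CSS involves, the minimal sketch below pulls url(...) targets out of a CSS string with a regular expression. The helper name extract_css_urls is hypothetical and not part of this module, and the module's own implementation may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Crawler::Mojo::ScraperUtil):
# returns every url(...) target found in a CSS string.
sub extract_css_urls {
    my ($css) = @_;
    my @urls;

    # Match url(...), allowing optional single or double quotes
    # and optional whitespace around the target.
    while ($css =~ /url\(\s*(["']?)([^"')]+?)\1\s*\)/g) {
        push @urls, $2;
    }
    return @urls;
}

my $css = <<'CSS';
body { background: url("/img/bg.png"); }
.logo { background-image: url(logo.gif) }
@font-face { src: url('fonts/a.woff'); }
CSS

print "$_\n" for extract_css_urls($css);
```

The extracted values are still relative to the stylesheet's own URL, so they would normally be resolved against a base URL before being queued.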
Returns the decoded response body for a given Mojo::Message::Response, using guess_encoding and encoder.
Generates an Encode instance for a given encoding name. Defaults to Encode::utf8.
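The lookup-with-fallback idea can be sketched with the core Encode module. The helper name encoder_for is hypothetical, and falling back through Encode::find_encoding('utf8') is an assumption about how the default could be produced, not a statement of how this module does it.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of WWW::Crawler::Mojo::ScraperUtil):
# look up an Encode object by name, falling back to UTF-8 when the
# name is missing or unknown to Encode.
sub encoder_for {
    my ($name) = @_;
    my $enc = defined $name ? Encode::find_encoding($name) : undef;
    return $enc || Encode::find_encoding('utf8');
}

# Decode Latin-1 bytes with an explicitly named encoding.
my $latin1_bytes = "caf\xe9";
my $text = encoder_for('iso-8859-1')->decode($latin1_bytes);
printf "decoded %d characters\n", length $text;

# An unknown name falls back to UTF-8 instead of returning undef.
my $utf8_bytes = "\xe3\x81\x82";
my $fallback = encoder_for('no-such-encoding')->decode($utf8_bytes);
printf "fallback decoded U+%04X\n", ord $fallback;
```

Returning an object rather than a name lets the caller call decode directly without checking for undef first.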
Returns the common HTML handlers as a hash reference.
my $handlers = html_handlers();
Narrows HTML handler selectors by prefixing them with container CSS selectors.
my $handlers = html_handlers($handlers, ['#header', '#footer li']);
$handlers->{img} = sub {
    my $dom = shift;
    return $dom->{src};
};
my @urls;
for my $selector (sort keys %{$handlers}) {
    $dom->find($selector)->each(sub {
        push(@urls, $handlers->{$selector}->(shift));
    })->to_array;
}
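To make the prefixing step concrete, here is a minimal sketch of how selector keys could be narrowed to a set of containers. The helper name prefix_selectors is hypothetical, and the comma split is deliberately naive, so this is an illustration of the idea rather than the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Crawler::Mojo::ScraperUtil):
# prefix every selector key with each container selector, so that the
# handlers only match elements inside those containers.
sub prefix_selectors {
    my ($handlers, $containers) = @_;
    my %narrowed;

    for my $selector (keys %$handlers) {
        # A key may be a comma-separated selector group; prefix each part.
        # Naive split: assumes no commas inside attribute values or :not().
        my @parts = split /\s*,\s*/, $selector;

        my @prefixed;
        for my $container (@$containers) {
            push @prefixed, "$container $_" for @parts;
        }
        $narrowed{ join ',', @prefixed } = $handlers->{$selector};
    }
    return \%narrowed;
}

my $handlers = {
    'a[href]'  => sub { $_[0]->{href} },
    'img[src]' => sub { $_[0]->{src} },
};
my $narrowed = prefix_selectors($handlers, ['#header', '#footer li']);
print "$_\n" for sort keys %$narrowed;
```

Each original handler is kept unchanged; only the selector keys are rewritten, so the callbacks do not need to know about the containers.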
Resolves URLs with a base URL.
WWW::Crawler::Mojo::resolve_href($base, $uri);
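For readers unfamiliar with URL resolution, the following sketch shows the general technique using Mojo::URL's to_abs method from Mojolicious. The wrapper name to_absolute is hypothetical, and resolve_href itself may do additional work that is not shown here.

```perl
use strict;
use warnings;
use Mojo::URL;

# Hypothetical wrapper (not part of WWW::Crawler::Mojo::ScraperUtil):
# resolve a possibly relative href against a base URL.
sub to_absolute {
    my ($base, $href) = @_;
    return Mojo::URL->new($href)->to_abs(Mojo::URL->new($base));
}

my $base = 'http://example.com/dir/index.html';
for my $href ('../img/a.png', 'page2.html', '/top.html', 'http://other.example/') {
    print to_absolute($base, $href), "\n";
}
```

Already absolute URLs pass through unchanged, so the same call can be applied to every collected link without first checking its form.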
Guesses the encoding of HTML or CSS from a given Mojo::Message::Response instance.
$encode = WWW::Crawler::Mojo::guess_encoding($res) || 'utf-8'
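One plausible way to guess an encoding is to check the declared charset in three places, in order: the Content-Type header, an HTML meta tag, and a CSS @charset rule. The sketch below works on plain strings rather than a response object; the helper name guess_charset and the lookup order are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Crawler::Mojo::ScraperUtil):
# guess a charset name from a Content-Type header value and a raw body.
# Returns undef when no charset is declared anywhere.
sub guess_charset {
    my ($content_type, $body) = @_;
    $content_type //= '';
    $body         //= '';

    # 1. charset parameter of the HTTP Content-Type header
    return $1 if $content_type =~ /charset=\s*["']?([\w.:-]+)/i;

    # 2. <meta charset="..."> or <meta http-equiv ... content="...; charset=...">
    return $1 if $body =~ /<meta\b[^>]*?charset=\s*["']?([\w.:-]+)/i;

    # 3. @charset rule at the very start of a stylesheet
    return $1 if $body =~ /\A\s*\@charset\s+["']([\w.:-]+)["']/i;

    return;
}

my $html    = '<html><head><meta charset="euc-jp"></head></html>';
my $charset = guess_charset('text/html', $html) || 'utf-8';
print "$charset\n";
```

As in the usage line above, the caller supplies the final default with ||, which keeps the helper honest about whether a charset was actually declared.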
Keita Sugama, <sugama@jamadam.com>
Copyright (C) Keita Sugama.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.