Name

WWW::Scraper::Wikipedia::ISO3166::Database::Download - Download various pages from Wikipedia

Synopsis

See "Synopsis" in WWW::Scraper::Wikipedia::ISO3166.

Description

Downloads these pages:

Input: http://en.wikipedia.org/wiki/ISO_3166-1.

Output: data/en.wikipedia.org.wiki.ISO_3166-1.html.

Input: http://en.wikipedia.org/wiki/ISO_3166-2.

Output: data/en.wikipedia.org.wiki.ISO_3166-2.html.

Downloads each country's corresponding subcountry page.

Input: http://en.wikipedia.org/wiki/ISO_3166:$code2.html.

Output: data/en.wikipedia.org.wiki.ISO_3166-2.$code2.html.
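
The per-country naming scheme can be sketched as a small helper. This is only an illustration, not code from the module: the name subcountry_paths() is hypothetical, the URL and file patterns are copied verbatim from the lines above, and upper-casing the code is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): map a 2-letter country code
# to the input URL and the output file named in the description above.
sub subcountry_paths {
    my ($code2) = @_;

    # Reject anything that is not exactly two ASCII letters.
    die "Expected a 2-letter code, got '" . ($code2 // '') . "'\n"
        unless defined $code2 && $code2 =~ /\A[A-Za-z]{2}\z/;

    # Assumption: codes are stored in upper case, as in ISO 3166-1 alpha-2.
    $code2 = uc $code2;

    return (
        "http://en.wikipedia.org/wiki/ISO_3166:$code2.html",
        "data/en.wikipedia.org.wiki.ISO_3166-2.$code2.html",
    );
}

my ($url, $file) = subcountry_paths('au');
print "$url\n$file\n";
```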

See scripts/get.country.pages.pl, scripts/get.subcountry.page.pl and scripts/get.subcountries.pages.pl.

Note: These pages have been downloaded, and are shipped with the distro.

Constructor and initialization

new(...) returns an object of type WWW::Scraper::Wikipedia::ISO3166::Database::Download.

This is the class's constructor.

Usage: WWW::Scraper::Wikipedia::ISO3166::Database::Download -> new().

This method takes a hash of options.

Call new() as new(option_1 => value_1, option_2 => value_2, ...).

Available options (these are also methods):

o code2 => $2_letter_code

Specifies the code2 of the country whose subcountry page is to be downloaded.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules.html for details.

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing.

Methods

This module is a sub-class of WWW::Scraper::Wikipedia::ISO3166::Database and consequently inherits its methods.

code2($code)

Get or set the 2-letter country code of the country or subcountry being processed.

See "get_subcountry_page()".

Also, code2 is an option to "new()".

get_1_page($url, $data_file)

Download $url and save it in $data_file. $data_file normally takes the form 'data/*.html'.
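
A minimal sketch of what such a download-and-save step might look like, assuming the core module HTTP::Tiny is used as the HTTP client. The function name fetch_to_file() and its optional third parameter are hypothetical, and this is not necessarily how get_1_page() itself is implemented.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical stand-in for get_1_page(): fetch $url and write the response
# body to $data_file. $http is optional so a stub client can be injected.
sub fetch_to_file {
    my ($url, $data_file, $http) = @_;
    $http //= HTTP::Tiny->new(timeout => 30);

    my $response = $http->get($url);

    die "Failed to get $url: $response->{status} $response->{reason}\n"
        unless $response->{success};

    # Write raw bytes; HTTP::Tiny does not decode the body.
    open(my $fh, '>:raw', $data_file)
        or die "Cannot open $data_file for writing: $!\n";
    print {$fh} $response->{content};
    close($fh) or die "Cannot close $data_file: $!\n";

    # Return the number of bytes written.
    return length $response->{content};
}
```

It could then be called as fetch_to_file('http://en.wikipedia.org/wiki/ISO_3166-1', 'data/en.wikipedia.org.wiki.ISO_3166-1.html'), provided the data/ directory already exists.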

get_country_pages()

Download the 2 country pages:

http://en.wikipedia.org/wiki/ISO_3166-1.

http://en.wikipedia.org/wiki/ISO_3166-2.

See "Description" in WWW::Scraper::Wikipedia::ISO3166.

get_subcountry_page()

Download 1 subcountry page, e.g. http://en.wikipedia.org/wiki/ISO_3166:$code2.html.

Warning: The 2-letter code of the country must be set with $self -> code2('XX') before calling this method.

See "Description" in WWW::Scraper::Wikipedia::ISO3166.

get_subcountry_pages()

Download all subcountry pages which have not yet been downloaded.

See "Description" in WWW::Scraper::Wikipedia::ISO3166.
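
The "skip what is already on disk" idea can be sketched as a filter over a list of 2-letter codes. This is an illustration under stated assumptions, not the module's code: pending_codes() is a hypothetical name, the list of codes is passed in by the caller, and a page counts as downloaded if its output file exists.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): given a data directory and a
# list of 2-letter codes, return the codes whose output file is missing.
sub pending_codes {
    my ($data_dir, @codes) = @_;

    return grep {
        !-e "$data_dir/en.wikipedia.org.wiki.ISO_3166-2.$_.html"
    } @codes;
}

# The result depends on which files are present under data/.
my @todo = pending_codes('data', qw(AU NZ));
print "Still to download: @todo\n";
```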

new()

See "Constructor and initialization".

FAQ

For the database schema, etc., see "FAQ" in WWW::Scraper::Wikipedia::ISO3166.

References

See "References" in WWW::Scraper::Wikipedia::ISO3166.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=WWW::Scraper::Wikipedia::ISO3166.

Author

WWW::Scraper::Wikipedia::ISO3166 was written by Ron Savage <ron@savage.net.au> in 2012.

Home page: http://savage.net.au/index.html.

Copyright

Australian copyright (c) 2012 Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html