Module Version: 0.12

NAME

WebService::Nextbus - A screen scraper useful for populating the data structure of WebService::Nextbus::Agency.

SYNOPSIS

  use WebService::Nextbus;
  my $nb = WebService::Nextbus->new;
  $nb->buildAgency('sf-muni'); # Scraping the webpages repeatedly can take time
  my @stops = $nb->agencies->{'sf-muni'}->str2stopCodes('N', 'judah', 'Chu Dub');

@stops can now be used as valid GET arguments on the Nextbus website.

DESCRIPTION

WebService::Nextbus can determine the relevant GET arguments for queries to the Nextbus website (www.nextbus.com) by screen scraping. WebService::Nextbus::Agency implements a basic data structure for storing and retrieving the information gleaned by this screen scraping.

Once the proper GET code has been retrieved, a web user agent can use the argument to build a URL for the desired information. This user-agent function will probably be incorporated into WebService::Nextbus eventually.
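As a rough illustration of that step, the sketch below assembles a query URL from a retrieved code using only core Perl. The path `/example-path` and the parameter names `agency`, `route`, and `stop` are hypothetical placeholders (this document does not describe the real Nextbus URL layout), and `1234` stands in for a stop code returned by str2stopCodes.

```perl
use strict;
use warnings;

# Hypothetical endpoint: only the host name comes from this document.
my $base = 'http://www.nextbus.com/example-path';

# Percent-encode everything except RFC 3986 unreserved characters.
# Assumes plain ASCII (or already-encoded bytes) as input.
sub uri_escape_simple {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Join key/value pairs into a query string, preserving their order.
sub build_url {
    my ($endpoint, @pairs) = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, uri_escape_simple($key) . '=' . uri_escape_simple($value);
    }
    return $endpoint . '?' . join('&', @parts);
}

# Parameter names and the stop code are illustrative assumptions.
my $url = build_url($base, agency => 'sf-muni', route => 'N', stop => '1234');
print "$url\n";
```

The resulting string could then be handed to a user agent, for instance with `LWP::UserAgent->new->get($url)`.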

The screen scraping requires no additional HTML parser module. I did this to improve interoperability, but the parsing is therefore necessarily crude and perhaps not as fast as it could be (it uses regular expressions rather than a state machine). This shouldn't be a major issue, however: although the initial screen scraping (with buildAgency, for example) can be slow, you should be able to store the results (using Storable, for example) and then retrieve them quickly. This should work well since the data don't change very often.
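To give a feel for what regular-expression scraping of this kind looks like, here is a minimal sketch. It is not the module's actual parsing code: the HTML fragment, the stop names, and the helper `scrape_options` are all made up for illustration, and the real Nextbus markup may differ.

```perl
use strict;
use warnings;

# Hypothetical HTML fragment standing in for a scraped page.
my $html = <<'HTML';
<select name="stop">
  <option value="1001">Church St and Duboce Ave</option>
  <option value="1002" selected>Judah St and 9th Ave</option>
</select>
HTML

# Pull (value, label) pairs out of <option> tags with a global match.
# Deliberately crude: assumes double-quoted values and no nested markup.
sub scrape_options {
    my ($page) = @_;
    my %label_for;
    while ($page =~ m{<option\s+value="([^"]*)"[^>]*>\s*([^<]*?)\s*</option>}gi) {
        $label_for{$1} = $2;
    }
    return \%label_for;
}

my $stops = scrape_options($html);
print "$_ => $stops->{$_}\n" for sort keys %$stops;
```

A pattern like this breaks as soon as the page layout changes, which is the trade-off for avoiding a parser dependency.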

For example, to store a scraped agency and retrieve it later:

  # As above (using the smaller 'emery' agency because it scrapes faster)
  use WebService::Nextbus;
  use Storable qw(nstore retrieve);

  my $nb = WebService::Nextbus->new;
  $nb->buildAgency('emery'); # Scraping the webpages repeatedly can take time

  # Now store the resulting agency, retrieve it, and dump its contents
  nstore($nb->agencies->{'emery'}, 'emery.store');
  my $agency = retrieve('emery.store');
  print $agency->routesAsString;

  # Or store just the routes tree, retrieve it, and dump its contents
  nstore($nb->agencies->{'emery'}->routes, 'emery_routes.store');
  $agency = WebService::Nextbus::Agency->new;
  $agency->routes(retrieve('emery_routes.store'));
  print $agency->routesAsString;
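
The store-then-retrieve idea can also be wrapped in a small "load or build" helper, so the slow scrape runs only when no stored copy exists. This is a sketch under assumptions: `load_or_build` is a hypothetical helper, not part of WebService::Nextbus, and a plain hash with made-up contents stands in for the scraped data so the example runs without network access.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);
use File::Spec;

# Return stored data if the file exists; otherwise build it once and store it.
# $builder is any code reference that returns a reference (hash, array, object).
sub load_or_build {
    my ($file, $builder) = @_;
    return retrieve($file) if -e $file;
    my $data = $builder->();
    nstore($data, $file);
    return $data;
}

# Temporary location for the demo; a real script would use a fixed path.
my $file = File::Spec->catfile(tempdir(CLEANUP => 1), 'routes_demo.store');

# Stand-in for the slow scraping step. With the module installed, the builder
# could instead call buildAgency('emery') and return $nb->agencies->{'emery'}.
my $builds = 0;
my $builder = sub {
    $builds++;
    return { shuttle => { north => { 'Main St' => '1234' } } };  # hypothetical data
};

my $first  = load_or_build($file, $builder);   # builds and stores
my $second = load_or_build($file, $builder);   # read back from disk
print "builder ran $builds time(s)\n";
```

Deleting the stored file forces a fresh scrape the next time the helper is called.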

EXPORT

None by default; OO interface.

REQUIRES

Requires the LWP::UserAgent module and the WebService::Nextbus::Agency package. Tests require the Test::More module.

AUTHOR

Peter H. Li <phli@cpan.org>

COPYRIGHT

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license: http://creativecommons.org/licenses/by-nc-sa/2.0/

SEE ALSO

WebService::Nextbus::Agency, LWP::UserAgent, perl.
