NAME

HoneyClient::Agent::Driver::Browser::FF - Perl extension to drive Mozilla Firefox to a given web page. This package extends the HoneyClient::Agent::Driver::Browser package by overriding the drive() method.

VERSION

This documentation refers to HoneyClient::Agent::Driver::Browser::FF version 0.98.

SYNOPSIS

  use HoneyClient::Agent::Driver::Browser::FF;

  # Library used exclusively for debugging complex objects.
  use Data::Dumper;

  # Create a new FF object, initialized with a collection
  # of URLs to visit.
  my $browser = HoneyClient::Agent::Driver::Browser::FF->new(
      links_to_visit => {
          'http://www.google.com'  => 1,
          'http://www.cnn.com'     => 1,
      },
  );

  # If you want to see what type of "state information" is physically
  # inside $browser, try this command at any time.
  print Dumper($browser);

  # Continue to "drive" the driver, until it is finished.
  while (!$browser->isFinished()) {

      # Before we drive the application to a new set of resources,
      # find out which resources it will contact next.
      print "About to contact the following resources:\n";
      print Dumper($browser->next());

      # Now, drive browser for one iteration.
      $browser->drive();

      # Get the driver's progress.
      print "Status:\n";
      print Dumper($browser->status());

  }

  # At this stage, the driver has exhausted its collection of links
  # to visit.  Let's say we want to add the URL "http://www.mitre.org"
  # to the driver's list.
  $browser->{links_to_visit}->{'http://www.mitre.org'} = 1;

  # Now, drive the browser for one iteration.
  $browser->drive();
  
  # Or, we can specify the URL as an argument.
  $browser->drive(url => "http://www.mitre.org");

DESCRIPTION

This library allows the Agent module to drive an instance of Mozilla Firefox inside the HoneyClient VM. The purpose of this module is to programmatically navigate this browser to different websites, in order to become purposefully infected with new malware.

This module is object-oriented in design, retaining all state information within itself for easy access. This specific browser implementation inherits all code from the HoneyClient::Agent::Driver::Browser package.

Fundamentally, the FF driver is initialized with a set of absolute URLs for the browser to drive to. Upon visiting each URL, the driver collects any new links found and will attempt to drive the browser to each valid URL upon subsequent iterations of work.

For each top-level URL given, the driver will attempt to process all corresponding links that are hosted on the same server, in order to simulate a complete 'spider' of each server.

URLs are added to and removed from hashtables as keys. For each URL, a calculated "priority" (a positive integer) is assigned as the value. When the FF driver is ready to go to a new link, it always goes to the link with the highest priority. If two URLs have the same priority, the FF driver chooses between them at random.
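
The selection rule above can be sketched in a few lines. This is only an illustration of "highest priority wins, ties broken at random"; pick_next_link() is a hypothetical helper written for this example and is not part of the HoneyClient API, and the real driver may compute its choice differently.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of HoneyClient): return one URL with the
# highest priority, choosing at random among URLs that tie for it.
sub pick_next_link {
    my ($links) = @_;                 # hashref: URL => priority
    return unless %{$links};          # nothing left to visit

    my $top        = max(values %{$links});
    my @candidates = grep { $links->{$_} == $top } keys %{$links};

    return $candidates[ int(rand(@candidates)) ];
}

my %links_to_visit = (
    'http://www.google.com' => 1,
    'http://www.cnn.com'    => 1,
    'http://www.mitre.org'  => 5,
);

# Only one URL holds the top priority here, so the result is deterministic.
print pick_next_link(\%links_to_visit), "\n";
```

If several URLs shared the top priority, repeated calls could return any of them.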

Furthermore, the FF driver will try to visit all links shared by a common server, in order, before moving on to other, external links. However, the FF driver may end up revisiting old sites if new links were found there that it has not visited yet.

As the FF driver navigates the browser to each link, it maintains a set of hashtables that record when valid links were visited (see links_visited); when invalid links were found (see links_ignored); and when the browser attempted to visit a link but the operation timed out (see links_timed_out). By maintaining this internal history, the driver will never navigate the browser to the same link twice.
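
As a rough sketch of how this history could be consulted, the snippet below checks the three hashtables named above for a given URL. The link_history() helper and the plain hash standing in for a driver object are hypothetical, and the stored values are assumed to be epoch timestamps, which the text above implies but does not state.

```perl
use strict;
use warnings;

# Hypothetical helper: report which history table, if any, already holds a URL.
sub link_history {
    my ($driver, $url) = @_;
    for my $table (qw(links_visited links_ignored links_timed_out)) {
        return $table if exists $driver->{$table}->{$url};
    }
    return;    # URL has not been seen in any history table
}

# Stand-in for a driver object; values are assumed to be epoch timestamps.
my $driver = {
    links_visited   => { 'http://www.google.com' => 1180000000 },
    links_ignored   => {},
    links_timed_out => { 'http://www.cnn.com' => 1180000060 },
};

for my $url ('http://www.cnn.com', 'http://www.mitre.org') {
    my $table = link_history($driver, $url);
    print "$url: ", (defined $table ? "already in $table" : "not seen yet"), "\n";
}
```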

Lastly, it is highly recommended that for each driver $object, one should call $object->isFinished() prior to making a subsequent call to $object->drive(), in order to verify that the driver has not exhausted its set of links to visit. Otherwise, if $object->drive() is called with an empty set of links to visit, the corresponding operation will croak.

METHODS OVERRIDDEN

The following functions have been overridden by the FF driver. All other methods were implemented by the generic Browser driver. For further information about the Browser driver, see the HoneyClient::Agent::Driver::Browser documentation.

$object->drive(url => $url)

    Drives an instance of Mozilla Firefox for one iteration, navigating either to the specified URL or to the next URL computed within the Browser driver's internal hashtables.

    For a description of which hashtable is consulted upon each iteration of drive(), see the next_link_to_visit description of the HoneyClient::Agent::Driver::Browser documentation, in the "DEFAULT PARAMETER LIST" section.

    Once a drive() iteration has completed, the corresponding browser process is terminated. Thus, each call to drive() invokes a new instance of the browser.

    Inputs: $url is an optional argument, specifying the next immediate URL the browser must drive to.

    Output: The updated FF driver $object, containing state information from driving the browser for one iteration.

    Warning: This method will croak if the FF driver object is unable to navigate to a new link because its list of links to visit is empty and no new URL was supplied.
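
A caller that cannot rule out an empty list may prefer to trap the exception rather than let it propagate. The sketch below shows the usual eval idiom. My::StubDriver is a hypothetical stand-in written so that the example runs without Firefox or the HoneyClient packages; it imitates only the croak-on-empty behavior described above, and its error message is made up.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical stand-in for the real driver. It imitates only the documented
# behavior: drive() croaks when there is no link to visit and no URL is given.
package My::StubDriver;

sub new {
    my ($class, %args) = @_;
    return bless { links_to_visit => {}, %args }, $class;
}

sub isFinished {
    my ($self) = @_;
    return !%{ $self->{links_to_visit} };
}

sub drive {
    my ($self, %args) = @_;
    my $url = $args{url};
    ($url) = keys %{ $self->{links_to_visit} } unless defined $url;
    Carp::croak("no links to visit") unless defined $url;
    delete $self->{links_to_visit}->{$url};
    return $self;
}

package main;

my $object = My::StubDriver->new();

# Trap the exception instead of letting it terminate the caller.
my $ok = eval { $object->drive(); 1 };
print "drive() failed: $@" unless $ok;
```

Checking $object->isFinished() first remains the simpler guard; the eval form is useful when links may be exhausted between the check and the call.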

BUGS & ASSUMPTIONS

This package will only run on Win32 platforms. Furthermore, it has only been tested to work reliably within a Cygwin environment.

In a nutshell, this object is nothing more than a blessed anonymous reference to a hashtable, where (key => value) pairs are defined in the "DEFAULT PARAMETER LIST", as well as fed via the new() function during object initialization. As such, this package does not perform any rigorous data validation prior to accepting any new or overriding (key => value) pairs.

However, additional links can be fed to any FF driver at any time, by simply adding new hashtable entries to the links_to_visit hashtable within the $object.

For example, if you wanted to add the URL "http://www.mitre.org" to the FF driver $object, simply use the following code:

  $object->{links_to_visit}->{'http://www.mitre.org'} = 1;

In general, the FF driver does not know how many links it will ultimately end up browsing to, until it conducts an exhaustive spider of all initial URLs supplied. As such, expect the output of $object->status() to change significantly, upon each $object->drive() iteration.

For example, if at one given point, the status of percent_complete is 30% and then this value drops to 15% upon another iteration, then this means that the total number of links to drive to has greatly increased.
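
The arithmetic behind such a drop can be illustrated as follows. The formula is an assumption made for this example (links processed divided by links processed plus links remaining); the actual calculation lives in the Browser driver and may differ.

```perl
use strict;
use warnings;

# Assumed formula, for illustration only: processed / (processed + remaining).
sub percent_complete {
    my ($processed, $remaining) = @_;
    my $total = $processed + $remaining;
    return $total ? 100 * $processed / $total : 100;
}

# 3 links processed, 7 still queued: 3 of 10.
printf "%.0f%%\n", percent_complete(3, 7);

# One iteration later, 10 newly discovered links joined the queue: 3 of 20.
printf "%.0f%%\n", percent_complete(3, 17);
```

Under this assumption the two calls print 30% and 15%, matching the scenario described above: no progress was lost, but the denominator doubled.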

Lastly, we assume that the Mozilla Firefox browser has been preconfigured to not cache any data. This ensures the browser will render the most recent version of the content hosted at each URL.

SEE ALSO

HoneyClient::Agent::Driver

HoneyClient::Agent::Driver::Browser

HoneyClient::Agent::Driver::Browser::IE

http://www.honeyclient.org/trac

REPORTING BUGS

http://www.honeyclient.org/trac/newticket

AUTHORS

Kathy Wang, <knwang@mitre.org>

Thanh Truong, <ttruong@mitre.org>

Darien Kindlund, <kindlund@mitre.org>

Brad Stephenson, <stephenson@mitre.org>

COPYRIGHT & LICENSE

Copyright (C) 2007 The MITRE Corporation. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, using version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.