NAME

Test::WWW::Mechanize - Testing-specific WWW::Mechanize subclass

VERSION

Version 1.60

SYNOPSIS

Test::WWW::Mechanize is a subclass of WWW::Mechanize that incorporates features for web application testing. For example:

    use Test::More tests => 5;
    use Test::WWW::Mechanize;

    my $mech = Test::WWW::Mechanize->new;
    $mech->get_ok( $page );
    $mech->base_is( 'http://petdance.com/', 'Proper <BASE HREF>' );
    $mech->title_is( 'Invoice Status', "Make sure we're on the invoice page" );
    $mech->text_contains( 'Andy Lester', 'My name somewhere' );
    $mech->content_like( qr/(cpan|perl)\.org/, 'Link to perl.org or CPAN' );

This is equivalent to:

    use Test::More tests => 5;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new;
    $mech->get( $page );
    ok( $mech->success );
    is( $mech->base, 'http://petdance.com/', 'Proper <BASE HREF>' );
    is( $mech->title, 'Invoice Status', "Make sure we're on the invoice page" );
    ok( index( $mech->content( format => 'text' ), 'Andy Lester' ) >= 0, 'My name somewhere' );
    like( $mech->content, qr/(cpan|perl)\.org/, 'Link to perl.org or CPAN' );

but has nicer diagnostics if they fail.

Default descriptions will be supplied for most methods if you omit them. For example:

    my $mech = Test::WWW::Mechanize->new;
    $mech->get_ok( 'http://petdance.com/' );
    $mech->base_is( 'http://petdance.com/' );
    $mech->title_is( 'Invoice Status' );
    $mech->text_contains( 'Andy Lester' );
    $mech->content_like( qr/(cpan|perl)\.org/ );

results in

    ok - GET http://petdance.com/
    ok - Base is 'http://petdance.com/'
    ok - Title is 'Invoice Status'
    ok - Text contains 'Andy Lester'
    ok - Content is like '(?-xism:(cpan|perl)\.org)'

CONSTRUCTOR

new( %args )

Behaves like, and calls, WWW::Mechanize's new method. Any parameters passed in are passed on to WWW::Mechanize's constructor.

You can pass in autolint => 1 to make Test::WWW::Mechanize automatically run HTML::Lint after any of the following methods are called:

  • get_ok()

  • post_ok()

  • submit_form_ok()

  • follow_link_ok()

  • click_ok()

You can also pass in an HTML::Lint object like this:

    my $lint = HTML::Lint->new( only_types => HTML::Lint::Error::STRUCTURE );
    my $mech = Test::WWW::Mechanize->new( autolint => $lint );

The same is also possible with autotidy => 1 to use HTML::Tidy5.

This means you no longer have to do the following:

    my $mech = Test::WWW::Mechanize->new();
    $mech->get_ok( $url, 'Fetch the intro page' );
    $mech->html_lint_ok( 'Intro page looks OK' );

and can simply do

    my $mech = Test::WWW::Mechanize->new( autolint => 1 );
    $mech->get_ok( $url, 'Fetch the intro page' );

The $mech->get_ok() call counts as only one test in the test count. Both the main I/O operation and the linting must pass for the entire test to pass.

You can control autolint and autotidy on the fly with the autolint and autotidy methods.

METHODS: HTTP VERBS

$mech->get_ok($url, [ \%LWP_options ,] $desc)

A wrapper around WWW::Mechanize's get(), with similar options, except the second argument needs to be a hash reference, not a hash. Like well-behaved *_ok() functions, it returns true if the test passed, or false if not.

A default description of "GET $url" is used if none is provided.
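The two calling styles described above can be sketched as follows. This is a minimal illustration, not part of the module: it assumes a throwaway HTML file in a temporary directory stands in for a real site (fetched through a file:// URL), and the Accept-Language header is only an example of an option you might pass.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/index.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} '<html><head><title>Home</title></head><body>Hello</body></html>';
close $fh or die "Cannot close $file: $!";
my $url = URI::file->new( $file )->as_string;

my $mech = Test::WWW::Mechanize->new;

# No description given, so the default description is used.
my $plain_ok = $mech->get_ok( $url );

# Options are passed as a hash reference, followed by the description.
my $with_opts_ok = $mech->get_ok(
    $url,
    { 'Accept-Language' => 'en' },
    'Fetch the home page with an Accept-Language header',
);

done_testing();
```

Both calls return a true value when the fetch succeeds, so the results can be used to skip later tests that depend on the page.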

$mech->head_ok($url, [ \%LWP_options ,] $desc)

A wrapper around WWW::Mechanize's head(), with similar options, except the second argument needs to be a hash reference, not a hash. Like well-behaved *_ok() functions, it returns true if the test passed, or false if not.

A default description of "HEAD $url" is used if none is provided.

$mech->post_ok( $url, [ \%LWP_options ,] $desc )

A wrapper around WWW::Mechanize's post(), with similar options, except the second argument needs to be a hash reference, not a hash. Like well-behaved *_ok() functions, it returns true if the test passed, or false if not.

NOTE: For compatibility reasons, it is not possible to pass additional LWP_options beyond form data via this method (such as Content or Content-Type). It is recommended that you use WWW::Mechanize's post() directly for instances where more granular control of the post is needed.

A default description of "POST to $url" is used if none is provided.

$mech->put_ok( $url, [ \%LWP_options ,] $desc )

A wrapper around WWW::Mechanize's put(), with similar options, except the second argument needs to be a hash reference, not a hash. Like well-behaved *_ok() functions, it returns true if the test passed, or false if not.

A default description of "PUT to $url" is used if none is provided.

$mech->delete_ok( $url, [ \%LWP_options ,] $desc )

A wrapper around WWW::Mechanize's delete(), with similar options, except the second argument needs to be a hash reference, not a hash. Like well-behaved *_ok() functions, it returns true if the test passed, or false if not.

A default description of "DELETE to $url" is used if none is provided.

$mech->submit_form_ok( \%parms [, $desc] )

Makes a submit_form() call and executes tests on the results. The form must be found, and then submitted successfully. Otherwise, this test fails.

\%parms is a hashref containing the parameters to pass to submit_form(). Note that submit_form() takes a hash, whereas this function takes a hashref. You have to call this function like:

    $mech->submit_form_ok( {
            form_number => 3,
            fields      => {
                answer => 42
            },
        }, 'now we just need the question'
    );

As with other test functions, $desc is optional. If it is supplied then it will display when running the test harness in verbose mode.

Returns a true value if the specified form was found and submitted successfully. The HTTP::Response object returned by submit_form() is not available.

$mech->follow_link_ok( \%parms [, $desc] )

Makes a follow_link() call and executes tests on the results. The link must be found, and then followed successfully. Otherwise, this test fails.

\%parms is a hashref containing the parameters to pass to follow_link(). Note that follow_link() takes a hash, whereas this function takes a hashref. You have to call this function like:

    $mech->follow_link_ok( {n=>3}, 'looking for 3rd link' );

As with other test functions, $desc is optional. If it is supplied then it will display when running the test harness in verbose mode.

Returns a true value if the specified link was found and followed successfully. The HTTP::Response object returned by follow_link() is not available.
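As a rough end-to-end sketch of following a link, the example below writes two small pages to a temporary directory and follows a link from one to the other. The file names, page contents, and link text are all made up for illustration; only the Test::WWW::Mechanize and WWW::Mechanize calls come from the documentation.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: two hypothetical pages on disk stand in for a real site.
my $dir   = tempdir( CLEANUP => 1 );
my %pages = (
    'index.html' => '<html><head><title>Home</title></head>'
                  . '<body><a href="about.html">About us</a></body></html>',
    'about.html' => '<html><head><title>About</title></head>'
                  . '<body>About page</body></html>',
);
for my $name ( sort keys %pages ) {
    open my $fh, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$fh} $pages{$name};
    close $fh or die "Cannot close $dir/$name: $!";
}

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( "$dir/index.html" )->as_string, 'Fetch the home page' );

# The link criteria go in a hashref, exactly as they would be passed
# to follow_link() as a hash.
my $followed = $mech->follow_link_ok( { text => 'About us' }, 'Follow the "About us" link' );

# After a successful follow, the Mech object is on the target page.
$mech->title_is( 'About', 'Landed on the About page' );

done_testing();
```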

$mech->click_ok( $button[, $desc] )

$mech->click_ok( \@button-and-coordinates [, $desc ] )

Clicks the button named by $button. An optional $desc can be given for the test.

    $mech->click_ok( 'continue', 'Clicking the "Continue" button' );

Alternatively, the first argument can be an arrayref with three elements: the name of the button and the X and Y coordinates of the button.

    $mech->click_ok( [ 'continue', 12, 47 ], 'Clicking the "Continue" button' );

METHODS: HEADER CHECKING

$mech->header_exists_ok( $header [, $desc ] )

Assures that a given response header exists. The actual value of the response header is not checked, only that the header exists.

$mech->lacks_header_ok( $header [, $desc ] )

Assures that a given response header does NOT exist.

$mech->header_is( $header, $value [, $desc ] )

Assures that a given response header exists and has the given value.

$mech->header_like( $header, $value [, $desc ] )

Assures that a given response header exists and that its value matches the regex $value.
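The header checks can be combined as in the sketch below. It assumes a local HTML file fetched through a file:// URL, for which LWP reports a Content-Type derived from the file extension; the X-Hypothetical-Debug header name is invented purely to show a header that should be absent.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/index.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} '<html><head><title>Home</title></head><body>Hello</body></html>';
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the page' );

# Only the presence of the header is checked here, not its value.
$mech->header_exists_ok( 'Content-Type', 'Response declares a content type' );

# A regex is more forgiving than header_is() when a charset may be appended.
my $is_html = $mech->header_like( 'Content-Type', qr{^text/html}, 'Content type is HTML' );

# Hypothetical header name: asserts that it was not sent.
$mech->lacks_header_ok( 'X-Hypothetical-Debug', 'No debug header leaked' );

done_testing();
```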

METHODS: CONTENT CHECKING

$mech->html_lint_ok( [$desc] )

Checks the validity of the HTML on the current page using the HTML::Lint module. If the page is not HTML, then it fails. The URI is automatically appended to the $desc.

Note that HTML::Lint must be installed for this to work. Otherwise, it will blow up.

$mech->html_tidy_ok( [$desc] )

Checks the validity of the HTML on the current page using the HTML::Tidy5 module. If the page is not HTML, then it fails. The URI is automatically appended to the $desc.

Note that HTML::Tidy5 must be installed for this to work. Otherwise, it will blow up.

$mech->content_for_tidy()

This method is called by html_tidy_ok() to get the content that should be validated by HTML::Tidy5. By default, this is just content(), but subclasses can override it to modify the content before validation.

This method should not change any state in the Mech object. Specifically, it should not modify the page's actual content.

$mech->title_is( $str [, $desc ] )

Tells if the title of the page is the given string.

    $mech->title_is( 'Invoice Summary' );

$mech->title_like( $regex [, $desc ] )

Tells if the title of the page matches the given regex.

    $mech->title_like( qr/Invoices for (.+)/ );

$mech->title_unlike( $regex [, $desc ] )

Tells if the title of the page does NOT match the given regex.

    $mech->title_unlike( qr/Invoices for (.+)/ );

$mech->base_is( $str [, $desc ] )

Tells if the base of the page is the given string.

    $mech->base_is( 'http://example.com/' );

$mech->base_like( $regex [, $desc ] )

Tells if the base of the page matches the given regex.

    $mech->base_like( qr{http://example\.com/index\.php\?PHPSESSID=(.+)} );

$mech->base_unlike( $regex [, $desc ] )

Tells if the base of the page does NOT match the given regex.

    $mech->base_unlike( qr{http://example\.com/index\.php\?PHPSESSID=(.+)} );

$mech->content_is( $str [, $desc ] )

Tells if the content of the page is exactly the given string.

$mech->content_contains( $str [, $desc ] )

Tells if the content of the page contains $str.

$mech->content_lacks( $str [, $desc ] )

Tells if the content of the page lacks $str.

$mech->content_like( $regex [, $desc ] )

Tells if the content of the page matches $regex.

$mech->content_unlike( $regex [, $desc ] )

Tells if the content of the page does NOT match $regex.

$mech->text_contains( $str [, $desc ] )

Tells if the text form of the page's content contains $str.

When your page contains HTML which is difficult, unimportant, or unlikely to match over time as designers alter markup, use text_contains instead of content_contains.

 # <b>Hi, <i><a href="some/path">User</a></i>!</b>
 $mech->content_contains('Hi, User'); # Fails.
 $mech->text_contains('Hi, User'); # Passes.

Text is determined by calling $mech->text(). See "content" in WWW::Mechanize.
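A runnable version of the comparison above might look like the following sketch. The page is a hypothetical temporary file containing the same markup as the commented example; content_lacks() is used to show that the raw HTML does not contain the phrase, while the text form does.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/greeting.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Greeting</title></head>
<body><b>Hi, <i><a href="some/path">User</a></i>!</b></body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the greeting page' );

# The text form has the tags stripped, so the phrase is contiguous.
my $in_text = $mech->text_contains( 'Hi, User', 'Greeting appears in the page text' );

# In the raw HTML the phrase is interrupted by markup.
my $not_in_html = $mech->content_lacks( 'Hi, User', 'Raw HTML splits the greeting across tags' );

done_testing();
```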

$mech->text_lacks( $str [, $desc ] )

Tells if the text of the page lacks $str.

$mech->text_like( $regex [, $desc ] )

Tells if the text form of the page's content matches $regex.

$mech->text_unlike( $regex [, $desc ] )

Tells if the text form of the page's content does NOT match $regex.

$mech->has_tag( $tag, $text [, $desc ] )

Tells if the page has a $tag tag with the given content in its text.

$mech->has_tag_like( $tag, $regex [, $desc ] )

Tells if the page has a $tag tag whose text matches $regex.
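The difference between has_tag() and has_tag_like() can be illustrated with the sketch below: the first compares the text of a tag to an exact string, the second matches it against a regex. The page content is a made-up example written to a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/invoice.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Invoice</title></head><body>
<h1>Invoice Status</h1>
<p>Prepared by Andy Lester</p>
</body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the invoice page' );

# Exact match against the text inside an <h1> tag.
my $has_heading = $mech->has_tag( 'h1', 'Invoice Status', 'Heading is present' );

# Regex match against the text inside any <p> tag.
my $has_author = $mech->has_tag_like( 'p', qr/Andy Lester/, 'Author is credited in a paragraph' );

done_testing();
```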

$mech->page_links_ok( [ $desc ] )

Follows all links on the current page and tests for HTTP status 200.

    $mech->page_links_ok('Check all links');

$mech->page_links_content_like( $regex [, $desc ] )

Follows all links on the current page and tests that their contents match $regex.

    $mech->page_links_content_like( qr/foo/,
      'Check all links contain "foo"' );

$mech->page_links_content_unlike( $regex [, $desc ] )

Follows all links on the current page and tests that their contents do not match $regex.

    $mech->page_links_content_unlike(qr/Restricted/,
      'Check all links do not contain Restricted');

$mech->links_ok( $links [, $desc ] )

Follows the specified links on the current page and tests for HTTP status 200. The links may be specified as a reference to an array containing WWW::Mechanize::Link objects, a reference to an array of URLs, or a scalar URL name.

    my @links = $mech->find_all_links( url_regex => qr/cnn\.com$/ );
    $mech->links_ok( \@links, 'Check all links for cnn.com' );

    my @pages = qw( index.html search.html about.html );
    $mech->links_ok( \@pages, 'Check main links' );

    $mech->links_ok( 'index.html', 'Check link to index' );

$mech->link_status_is( $links, $status [, $desc ] )

Follows the specified links on the current page and tests that each returns the HTTP status $status. The links may be specified as a reference to an array containing WWW::Mechanize::Link objects, a reference to an array of URLs, or a scalar URL name.

    my @links = $mech->followable_links();
    $mech->link_status_is( \@links, 403,
      'Check all links are restricted' );

$mech->link_status_isnt( $links, $status [, $desc ] )

Follows the specified links on the current page and tests that none returns the HTTP status $status. The links may be specified as a reference to an array containing WWW::Mechanize::Link objects, a reference to an array of URLs, or a scalar URL name.

    my @links = $mech->followable_links();
    $mech->link_status_isnt( \@links, 404,
      'Check all links are not 404' );

$mech->link_content_like( $links, $regex [, $desc ] )

Follows the specified links on the current page and tests the resulting content of each against $regex. The links may be specified as a reference to an array containing WWW::Mechanize::Link objects, a reference to an array of URLs, or a scalar URL name.

    my @links = $mech->followable_links();
    $mech->link_content_like( \@links, qr/Restricted/,
        'Check all links are restricted' );

$mech->link_content_unlike( $links, $regex [, $desc ] )

Follows the specified links on the current page and tests that the resulting content of each does not match $regex. The links may be specified as a reference to an array containing WWW::Mechanize::Link objects, a reference to an array of URLs, or a scalar URL name.

    my @links = $mech->followable_links();
    $mech->link_content_unlike( \@links, qr/Restricted/,
      'No restricted links' );

METHODS: SCRAPING

$mech->scrape_text_by_attr( $attr, $attr_value [, $html ] )

$mech->scrape_text_by_attr( $attr, $attr_regex [, $html ] )

Returns a list of strings, each string being the text enclosed by an element whose attribute $attr has the value $attr_value. You can also pass in a regular expression to match the attribute value against. If nothing is found, the return is an empty list. In scalar context, the return is the first string found.

If passed, $html is scraped instead of the current page's content.

$mech->scrape_text_by_id( $id [, $html ] )

Finds all elements with the given ID attribute and pulls out the text that each element encloses.

In list context, returns a list of all strings found. In scalar context, returns the first one found.

If $html is not provided then the current content is used.

$mech->scraped_id_is( $id, $expected [, $msg] )

Scrapes the current page for given ID and tests that it matches the expected value.

$mech->scraped_id_like( $id, $expected_regex [, $msg] )

Scrapes the current page for the given ID and tests that it matches the expected regex.
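The scraping methods can be tried out with a sketch like the one below. The page, the "total" ID, and the "note" class are all invented for the example. The markup deliberately has no whitespace around the scraped text so that the comparison does not depend on how whitespace is handled.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/invoice.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Invoice</title></head><body>
<p id="total">42.00</p>
<span class="note">Paid</span>
<span class="note">Thank you</span>
</body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the invoice page' );

# Scalar context: the first matching string.
my $total = $mech->scrape_text_by_id( 'total' );

# List context: every matching string.
my @notes = $mech->scrape_text_by_attr( 'class', 'note' );

# The same lookups expressed as tests.
$mech->scraped_id_is( 'total', '42.00', 'Total is as expected' );
$mech->scraped_id_like( 'total', qr/^\d+\.\d{2}$/, 'Total looks like an amount' );

done_testing();
```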

$mech->id_exists( $id )

Returns a true value if an element with the given ID exists in the current page, and a false value otherwise.

The Mech object caches the IDs so that it doesn't bother reparsing every time it's asked about an ID.

$mech->id_exists_ok( $id [, $msg] )

Verifies there is an HTML element with ID $id in the page.

$mech->ids_exist_ok( \@ids [, $msg] )

Verifies an HTML element exists with each ID in \@ids.

$mech->lacks_id_ok( $id [, $msg] )

Verifies there is NOT an HTML element with ID $id in the page.

$mech->lacks_ids_ok( \@ids [, $msg] )

Verifies there are no HTML elements with any of the ids given in \@ids.

$mech->button_exists( $button )

Returns a boolean saying whether a submit button with the name $button exists. Does not do a test. For that you want button_exists_ok or lacks_button_ok.

$mech->button_exists_ok( $button [, $msg] )

Asserts that the button exists on the page.

$mech->lacks_button_ok( $button [, $msg] )

Asserts that no button named $button exists on the page.
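The ID and button checks might be combined as in this sketch. The page layout, the element IDs, and the button names are assumptions made for the example; the form is never submitted.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/edit.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Edit</title></head><body>
<div id="header">Edit customer</div>
<form action="save.html" method="get">
  <input type="text" name="customer_name">
  <input type="submit" name="save" value="Save">
</form>
<div id="footer">Footer</div>
</body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the edit page' );

# Element IDs: several at once, then one that must be absent.
$mech->ids_exist_ok( [ 'header', 'footer' ], 'Layout containers are present' );
$mech->lacks_id_ok( 'debug-panel', 'No debug panel in the markup' );

# Submit buttons in the current form, looked up by name.
$mech->button_exists_ok( 'save', 'Form has a save button' );
$mech->lacks_button_ok( 'delete', 'Form has no delete button' );

done_testing();
```

The non-test variants id_exists() and button_exists() return plain booleans, which is handy when a later test should only run if the element is there.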

METHODS: MISCELLANEOUS

$mech->autolint( [$status] )

Without an argument, this method returns a true or false value indicating whether autolint is active.

When passed an argument, autolint is turned on or off depending on whether the argument is true or false, and the previous autolint status is returned. As with the autolint option of new, $status can be an HTML::Lint object.

If autolint is currently using an HTML::Lint object you provided, the return is that object, so you can change and exactly restore autolint status:

    my $old_status = $mech->autolint( 0 );
    ... operations that should not be linted ...
    $mech->autolint( $old_status );

$mech->autotidy( [$status] )

Without an argument, this method returns a true or false value indicating whether autotidy is active.

When passed an argument, autotidy is turned on or off depending on whether the argument is true or false, and the previous autotidy status is returned. As with the autotidy option of new, $status can be an HTML::Tidy5 object.

If autotidy is currently using an HTML::Tidy5 object you provided, the return is that object, so you can change and exactly restore autotidy status:

    my $old_status = $mech->autotidy( 0 );
    ... operations that should not be tidied ...
    $mech->autotidy( $old_status );

$mech->grep_inputs( \%properties )

Returns a list of all the input controls in the current form whose properties match all of the regexes in $properties. The controls returned are all descended from HTML::Form::Input.

If $properties is undef or empty then all inputs will be returned.

If there is no current page, there is no form on the current page, or there are no matching input controls in the current form, then the return will be an empty list.

    # Get all text controls whose names begin with "customer".
    my @customer_text_inputs = $mech->grep_inputs( {
        type => qr/^(text|textarea)$/,
        name => qr/^customer/,
    } );

$mech->grep_submits( \%properties )

grep_submits() does the same thing as grep_inputs() except that it only returns controls that are submit controls, ignoring other types of input controls like text and checkboxes.
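To see the two methods side by side, the sketch below loads a hypothetical form from a temporary file and filters its controls. The field names are invented for the example; the returned objects are HTML::Form inputs, so their name() method is used to inspect the result.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/form.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Order</title></head><body>
<form action="save.html" method="get">
  <input type="text" name="customer_name">
  <textarea name="customer_notes"></textarea>
  <input type="text" name="order_ref">
  <input type="submit" name="save" value="Save">
</form>
</body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the order form' );

# Every property regex must match for a control to be returned.
my @customer_inputs = $mech->grep_inputs( {
    type => qr/^(text|textarea)$/,
    name => qr/^customer/,
} );
my @customer_names = sort map { $_->name } @customer_inputs;

# Same filtering, but restricted to submit controls.
my @save_buttons = $mech->grep_submits( { name => qr/^save$/ } );

is( scalar @customer_names, 2, 'Two customer fields found' );
is( scalar @save_buttons,   1, 'One save button found' );

done_testing();
```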

$mech->stuff_inputs( [\%options] )

Finds all free-text input fields (text, textarea, and password) in the current form and fills them to their maximum length in hopes of finding application code that can't handle it. Fields with no maximum length and all textarea fields are set to 66000 bytes, which will often be enough to overflow the data's eventual receptacle.

There is no return value.

If there is no current form then nothing is done.

The hashref $options can contain the following keys:

  • ignore

    The value is an arrayref of field names to leave untouched, e.g.:

        $mech->stuff_inputs( {
            ignore => [qw( specialfield1 specialfield2 )],
        } );

  • fill

    The value is the default string to use when stuffing fields. Copies of the string are repeated up to the maximum length of each field. E.g.:

        $mech->stuff_inputs( {
            fill => '@'  # stuff all fields with something easy to recognize
        } );

  • specs

    The value is a hashref, keyed by field name, of hashrefs with which you can pass detailed instructions about how to stuff a given field. E.g.:

        $mech->stuff_inputs( {
            specs=>{
                # Some fields are datatype-constrained.  It's most common to
                # want the field stuffed with valid data.
                widget_quantity => { fill=>'9' },
                notes => { maxlength=>2000 },
            }
        } );

    The specs allowed are fill (use this fill for the field rather than the default) and maxlength (use this as the field's maxlength instead of any maxlength specified in the HTML).
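The effect of these options can be observed with a small sketch such as the following. It assumes a hypothetical form with two capped text fields, stuffs one of them, and leaves the other alone through the ignore option. The form is only filled in, never submitted.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Test::More;
use Test::WWW::Mechanize;
use URI::file;

# Assumption: a local temporary file plays the role of the page under test.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/form.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'HTML';
<html><head><title>Order</title></head><body>
<form action="save.html" method="get">
  <input type="text" name="code" maxlength="5">
  <input type="text" name="keep" value="original" maxlength="20">
  <input type="submit" name="save" value="Save">
</form>
</body></html>
HTML
close $fh or die "Cannot close $file: $!";

my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( URI::file->new( $file )->as_string, 'Fetch the order form' );

# Fill "code" up to its maxlength with "@" and leave "keep" untouched.
$mech->stuff_inputs( {
    fill   => '@',
    ignore => [ 'keep' ],
} );

my $code = $mech->value( 'code' );
my $keep = $mech->value( 'keep' );

is( length $code, 5,          'code was filled to its maxlength' );
is( $keep,        'original', 'ignored field kept its value' );

done_testing();
```

After stuffing, the form would normally be submitted with submit_form_ok() to see how the application copes with the oversized input.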

$mech->followable_links()

Returns a list of links that Mech can follow. Only http and https links are included.

$mech->lacks_uncapped_inputs( [$comment] )

Executes a test to make sure that the current form content has no text input fields that lack the maxlength attribute, and that each maxlength value is a positive integer. The test fails if the current form has such a field, and succeeds otherwise.

Checks that all text input fields in the current form specify a maximum input length. Fields for which the concept of input length is irrelevant, and controls that HTML does not allow to be capped (e.g. textarea), are ignored.

The return is true if the test succeeded, false otherwise.

$mech->check_all_images_ok( [%criterium ], [$comment] )

Executes a test to make sure all images in the page can be downloaded. It does this by running HEAD requests on them. The current page content stays the same.

The test fails if any image cannot be found, but reports all of the ones that were not found.

For a definition of all images, see "images" in WWW::Mechanize.

The optional %criterium argument can be passed in before the $comment and will be used to define which images should be considered. This is useful to filter out specific paths.

    $mech->check_all_images_ok( url_regex => qr{^/}, 'All absolute images should exist');
    $mech->check_all_images_ok( url_regex => qr{\.(?:gif|jpg)$}, 'All gif and jpg images should exist');
    $mech->check_all_images_ok(
        url_regex => qr{^((?!\Qhttps://googleads.g.doubleclick.net/\E).)*$},
        'All images should exist, but ignore the ones from Doubleclick'
    );

For a full list of possible arguments, see "find_all_images" in WWW::Mechanize.

The return is true if the test succeeded, false otherwise.

TODO

Other ideas for features are at https://github.com/petdance/test-www-mechanize

AUTHOR

Andy Lester, <andy at petdance.com>

BUGS

Please report any bugs or feature requests to <https://github.com/petdance/test-www-mechanize>.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Test::WWW::Mechanize

ACKNOWLEDGEMENTS

Thanks to Julien Fiegehenn, @marderh, Eric A. Zarko, @moznion, Robert Stone, @tynovsky, Jerry Gay, Jonathan "Duke" Leto, Philip G. Potter, Niko Tyni, Greg Sheard, Michael Schwern, Mark Blackman, Mike O'Regan, Shawn Sorichetti, Chris Dolan, Matt Trout, MATSUNO Tokuhiro, and Pete Krawczyk for patches.

COPYRIGHT & LICENSE

Copyright 2004-2022 Andy Lester.

This library is free software; you can redistribute it and/or modify it under the terms of the Artistic License version 2.0.