The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=pod

=head1 NAME

TB2::Design - Explaining the design of Test::Builder2

=head1 DESCRIPTION

This document is about the design and philosophy of Test::Builder2. It is
written for the curious, as well as for interested in contributing to the 
project, designing their own advanced testing modules, or extending 
Test::Builder2.

Having a gander at L<Test::Builder> may also be useful, if you're not already 
familiar with it.

There is a glossary of terms at the end of this document.


=head1 What is Test::Builder2?

L<Test::Builder2> is a complete rewrite of L<Test::Builder> with fewer
assumptions about how testing is to be done, and more extensibility.

So what is Test::Builder?  It is the object that backs all the Perl testing
modules worth knowing about.  It provides the baseline functionality of
testing, coordinates testing modules and formats the results as TAP.  It is
what allows you to use multiple independent C<Test::> modules, written by 
completely different authors, together in the same test script without
them stepping on each other.

Fundamentally, TB2 coordinates these things:

=over 4

=item Recording when a test starts and ends

TB2 allows a C<Test::> module to set a plan, and to print out headers, if 
required.

=item Recording results

TB2 hands results to a TB2::History object.

=item Formatting results

TB2 hands results to a TB2::Formatter object.

=item Streaming (outputting) results

The TB2::Formatter hands results to a TB2::Streamer.

=back

In short, you hand TB2 the result of an assert, and it decides when and 
how to format and output it.  That's about it (rjbs gets credit for 
bringing this revelation of simplicity).

TB2 also takes on the additional meta-responsibility of providing a central 
point to coordinate extensions and to provide hooks into testing events.


=head1 Design Goals

To understand Test::Builder2 you must understand its design goals.

Test::Builder2 takes a very long and broad view in its design.  As
such, many things which might seem overcomplicated at first glance have
been deliberately thought out to cover an obscure, but important, condition.


=head2 Universal

If TB2 were a person, "thou shalt be universally applicable" would be
tattooed upside down on its belly in flaming, writhing letters, to remind
itself of its responsbility to B<all> Perl users everywhere.

This is the goal which has the farthest reaching consequences.  Every
testing module worth knowing about is backed by TB1 and will be backed by
TB2. As a result, TB2 I<has to work>, it has to work I<everywhere Perl
does> and it has to be able to test every conceivable thing.

Any assumption TB2 made about its environment, how testing is or is not
done, or what is being tested, could eliminate a whole category of Perl 
programmers from easily being able to test their code, knocking the entire 
modern suite of Perl testing tools from their hands. In order to be universally
useful, TB2 cannot make such assumptions.

While TB2 has to be universal, extensions do not.  TB2 will reject many
features because they are not universally applicable, but extensions and
test libraries built on top of it can add those features.


=head3 Portable

TB2 has to work in every environment where Perl does.  If it doesn't, TB2
de facto eliminates Perl from serious use in that environment.  TB2 can make
no assumptions about its environment that are not universal.  There's no
"it'll work on most systems" compromises.

As far as Perl goes, TB2 supports back to 5.8.1.  5.8.0's threading is too
unstable.  5.6 is missing too many features (the biggest is scalar
reference filehandles, critical for debugging) and is all but extinct in the
wild.  Even Debian oldstable ships 5.8.8.


=head3 Reliable

TB2 has to be completely reliable, allowing for total confidence in the test 
libraries. Users need to know that a test failure indicates a failed test,
and not a bug in TB2.  There are no "this works 99% of the time" 
implementations.


=head3 Extensible

Users must be able to write test libraries on top of TB2 that do pretty
much anything.

There are three primary ways of doing this.  The most common will be
writing a test module using L<TB2::Module>.  Users will write
logic to test their statement, and then hand the result off to a TB2 object
for processing.

The second is by adding roles to TB2.  For those not familiar with roles,
they're kind of like object plugins... sort of.

The third is by hooking into events, like when an assert starts or
finishes.

See L<Extending TB2> for more details.


=head3 Multiple Formats

TB1 outputs only TAP.  TB2 will output TAP by default, but can be extended
to output any format you like.  As such, its internal structures have to be
free of assumptions.  For example, TAP is one of the few testing systems
that requires a test count. TB1 is riddled with that assumption. TB2 cannot be.

See L<TB2::Format> for details.


=head3 Multiple Streams

TB1 has limited support for outputting to anywhere other than STDOUT and
STDERR.  TB2 should have full control over where and how output occurs.

See L<TB2::Streamer> for details.


=head3 Minimal Assumptions

TB1 and L<Test::More> avoid making assumptions about how and what you're
testing, sticking to functions that provide unambiguous functionality and
are universally applicable.  That is, you're not going to find code that
tests XML.

TB2 pushes this further, in ways already mentioned, and by stripping itself
down at the core to just storing and formatting results.  It
does not provide anything but the equivalent of C<ok()>.  Additional
functionality will be available in various roles that ship with TB2.  Most
library authors will use those roles by default, but radical extensions may
use the stripped down TB2.

Two drivers for this are L<Fennec> and the need for L<Test::Builder> itself
to use TB2.  The less TB2 does, the weirder test library authors can get.


=head3 No Dependencies

TB2 can have no external dependencies (except Perl itself).  External
dependencies may create circular dependencies, and worse, could introduce 
unreliabilies to TB2.


=head2 Easy

It has to be very easy to write a basic testing module using TB2.  The
underlying complexity must be hidden from the casual test module author.


=head1 Extending TB2

See F<examples/TB2> in the source directory for examples of TB2 extensions.

=head2 Writing Test Libraries

The simplest way to extend TB2 is to write a test library. 
L<TB2::Module> provides the conveniences to the test author. 
L<Test::Simple> is a simple example of a Test::Builder2 module.

=head2 Writing TB2 Roles

For internal use, roles can be applied to the TB2 default to expand its
functionality while leaving the TB2 class slim.  TB2 will include a 
TB2::Assert::More role that provides most of the Test::More functionality 
currently available in TB1.

=head2 TB2 Events

The primary way test libraries using TB2 will alter the behavior of tests is by
writing a test event handler.  Everything of interest in a test, from setting 
the plan to diagnostic output to asserts, triggers an event. A handler can take 
action based on an event, or even alter an event as it happens.  Event handlers
will allow TB2 to halt on test failure, to enforce test names, and to provide the 
functionality of Test::NoWarnings.  

Formatters are also event handlers.  See L<TB2::EventHandler>, 
L<TB2::EventCoordinator> and L<TB2::Events> for details.


=head1 Use Cases

=head2 Diagnostics contain the file and line number where the user called the assert

When an assert fails, it should report its file and line number so the user
can easily find the failing assert.  This is one of the trickiest aspects
 of TB1 and TB2.  It has its own section, L</File and Line Information>.


=head2 Test::NoWarnings

To capture warnings, TB2 must be able to add code at the start of a test suite, and 
to add an assertion at the end.  It also must be able to add an assertion to a
user-set plan.

See F<examples/TB2/lib/TB2/NoWarnings.pm> for an example.

=head2 Test::Warn

This is the case where an assert called inside another assert is B<not>
part of the same stack.

See L</File and Line Information> for details.

=head2 Testing a test

You should be able to easily write tests for test modules without having to
hard code formatted results.

=head2 Test::Builder

Test::Builder and all derived modules should continue to work and be
compatible with TB2.

=head2 Fennec

Fennec should be able to use TB2 for all its functionality.

=head2 Die on fail

It should be possible to have an extension which causes the test suite to
halt upon the first failed assert B<and> still receive all the diagnostics
of that assert.

See F<examples/TB2/lib/TB2/DieOnFail.pm> for an example.


=head2 Debug on fail

It should be possible to have an extension which starts the debugger when
an assert fails.

See F<examples/TB2/lib/TB2/DebugOnFail.pm> for an example.


=head2 Action on failed (or passed) test

It should be possible to perform an action when a test completes in
certain states.  For example, beep on completion or send an email on
failure.


=head1 Stacked Asserts

TB2 lets asserts build on other asserts by just calling them.  For
example, here is a simple implementation of is().

    install_test is => sub {
        my($have, $want, $name) = @_;

        my $result = ok( $have eq $want, $name );
        $result->diag([
            have => $have,
            want => $want
        ]);

        return $result;
    };

is() uses ok() to do most of the work.  Then it adds its diagnostics
to the result (an overloaded object in TB2) and passes it along.

To accomplish this, TB2 records the stack of asserts being called and who
called them.  This turns out to be one of the more involved parts of TB2,
but it's necessary for some critical features.


=head2 File and Line Information

One of the friendliest and trickiest features of TB1 is correctly reporting
the file and line where an assert failed.  This is because asserts call
asserts which call asserts which ultimately call C<ok> which does the
actual passing or failing.  It needs to know where the user originally
called the assert, the "top" of the assert stack.

This is similar in functionality to Carp, but it has to be far more robust.
 It cannot guess based on crossing package boundaries, it has to know.  TB1
accomplished this by keeping track of how far down the stack you are at any
moment with C<$Level>.  This results in complicated accounting that's often
not quite right, and it bubbles up to the user's level, where the user must
remember to localize and increment C<$Level>.

You must be able to trivially wrap an assert in another assert and still
have the file and line number come out B<at the outermost assert> which is
presumably the one the user wrote.  This will allow users to quickly and
trivially compose new domain specific asserts without having to know about
TB2.

For example, here is how is() would ideally be implemented by a user
(ignoring diagnostics for the moment):

    sub is {
        my($have, $want, $name) = @_;
        
        return ok $have eq $want, $name;
    }

When C<is()> fails, you want the diagnostics to contain the file and line
number of the call to is(), not the call to C<ok()> inside is().


=head2 Assert end actions

This goes beyond just file and line numbers.  It also allows actions
to happen when an assert fully completes, such as "die-on-fail".  To
expand on the is() example above:

    sub is {
        my($have, $want, $name) = @_;
        
        my $ok = ok $have, $want, $name;

        $ok->diag([
            have => $have,
            want => $want,
        ]);

        return $ok;
    }

is() takes the result from ok() (now an object) and adds its own
diagnostics.  If ok() were to fail, and die-on-fail is active, TB2 must
know to wait until is() has had a chance to add its diagnostics and print
the result before failing.  Otherwise you don't get full diagnostics about
the failure.  TB2 must know that is() was called by the user and ok() was
not.


=head2 Result output

Finally, TB2 must wait until the entire assert stack has had an opportunity
to add diagnostics to the result before it can print the result.  Why? 
TB1, only doing TAP, is fortunate in that TAP a streaming protocol.  It can
print a result as soon as it gets it, and then append diagnostics onto the end. 
But this isn't true of other output formats.  XML, for example, requires an
opening and closing tag.


=head2 Declaring Asserts

We B<cannot> assume that asserts are exported functions.  Or that every
function in a C<Test::> library is an assert.  TB2 takes the approach of
having a test library declare that a function is an assert.  This is done
with the least fanfare possible:

    package My::Test::Module;
    use TB2::Module;

    install_test is => sub {
        my($have, $want, $name) = @_;
        return ok $have, $want, $name;
    };

C<install_test> is exported by TB2::Module.  It wraps the user's function
in a little shim that triggers assert_start and assert_end events.  It also
records this fact in an assert stack, assert_start pushes onto the stack
and assert_end pops it.  The assert stack tracks where each assert was
called.  If an assert fails anywhere in the stack it can get the file and
line number information from the top of the stack.


=head2 Multiple Stacks

Currently there is only one stack, see C<top_stack> in TB2.  This must be
developed into multiple stacks to handle some use cases, the most important
are L<Test::Warn> and L<Test::Exception>.

For example:

    #line 1
    warnings_is {
        is( $foo, $bar);
    } "something";

If the C<is> called inside C<warnings_is()> fails, it should report
diagnostics from line 2, not line 1 where C<warnings_is()>  is called.  In
addition, C<is()> failing should not result in C<warnings_is()> failing.

There is no way for TB2 to infer this special case, it must be declared by
the author of C<warnings_is>.  C<warnings_is> must tell TB2 that it should
save its current assert stack and start a new one.

    use TB2::Module;
    
    install_test warnings_is => sub (&$) {
        my($code, $warning) = @_;
        
        ...set up capturing warnings however...
        
        # Declare a new assert stack
        Builder->start_assert_stack;
        
        # Run the code with that new stack
        $code->();
        
        # End the assert stack and go back to the old one
        Builder->end_assert_stack;
        
        # Run an assert to check the warnings using the original stack
        return is( $captured_warnings, $warning );
    };

In effect, there is a stack of stacks maintained inside TB2.
C<Builder> is provided by TB2::Module as a shortcut for
C<< $class->builder >>.

This will also allow tests to work in cooperative multitasking situations
such as POE.  One stack of asserts may start running only to yield control
to another stack.  The details are beyond the scope of this design, only
that it is made possible.


=head2 Multiple asserts inside an assert

There is a final case to consider, this:

    install_test file_contents_ok => sub {
        my($file, $want) = @_;

        ok( open(my $fh, "<", $file), "open $file" );
        my $content = join "", <$fh>;
        ok( close $fh, "close $file" );

        return is $content, $want, "contents of $file";
    };

This assert has multiple asserts inside it, but the final one is the
important one.  In this case, TB2 must display them all as if they came
from the point where C<< file_contents_ok() >> was called.  Also make sure
that the results of the two C<ok()>s get output.  And do it all in the
right order.  This is an open problem.


=head1 Mouse

TB2 uses Mouse, which is the Moose interface without dependencies and
sluggishness.  It takes a great risk in relying on a complicated module. 
The decision was made on the speculation that if TB2 used a real object
system, that might allow the design to go in interesting directions not
otherwise easily available.

Two user visible design features have come out of this.  The first is that
TB2's event system, rather than having explict event callbacks, is modelled
by simply wrapping public TB2 methods using Mouse method wrappers.  This
greatly simplifies designing and implementing test events and provides a
more flexible system since users can safely wrap any public method rather
than waiting for TB2 to add a hook.  Hook points fall naturally out of
decomposing the steps of handling results.

The second is the ability to compose TB2 with roles.  Rather than adding
functionality by subclassing, it can be added with roles.  Subclassing is
untenable in the long run.  Test::A will want to use their TB2 subclass
while Test::B will want to use its own TB2 subclass.  They can't both have
the default.  Rather than come up with increasingly complicated ways to
reconcile this, users can add functionality to TB2 with roles applied to
the TB2 default.  Roles add, rather than modify, functionality. 
Collisions will only occur in method names and will be very clear, at about
the same level of risk as exporting a function.

Roles also let TB2 shed much of TB1's functionality, leaving it to roles. 
For example, most of the helper assert methods in TB1 like C<is_eq> and
C<like> are not present in core TB2 making it much simpler.  They will be
in something like TB2::More::Asserts which will probably be composed in by
default.

Roles and wrapping methods allow TB2 to remain a default while being
extensible by multiple authors without explicit coordination.

=head2 Mouse Risk Mitgation

TB2 can have no dependencies, so it ships with its own copy of Mouse.

Mouse is a large, complicated system and TB2 has already hit bugs.  It has
also had breakages from one version of Mouse to the next.  In order to
avoid this, TB2 will ONLY use its shipped copy of Mouse.  That is, TB2 will
ship with a copy of Mouse matched to that particular release.

Finally, to avoid stepping on the installed copy of Mouse, TB2 will ship
its version of Mouse as TB2::Mouse with all internal packages similarly
changed.  This will avoid colliding with Mouse both on disk and in memory.

=head2 Mouse In The Core?

If TB2 ships with Mouse, and TB2 ships in the core of Perl... does that
mean core Perl will ship Mouse?  No, it is not required that Perl ship
Mouse.  TB2 will require that core Perl ships TB2::Mouse, but as such it is
an internal module of TB2 and should not be used by the public.  Making
Mouse publicly available in the core is a separate issue and would, in
fact, require a separate copy of Mouse anyway.


=head1 Overview

=head2 Test::Builder

Here's a diagram of the "flow" of assert results through Test::Builder
version 1.

                     .-------.
                     | foo.t |
                     '-------'
                         |
                         |
     .-------------.     |     .----------------.
     | Test::More  |<--------->| Test::Whatever |
     '-------------'           '----------------'
            |                           |
            |                           |
            |                           |
            |     .---------------.     |
            '---->| Test::Builder |<----'
                  '---------------'
                          |
                          v
                       .-----.
                       | TAP |
                       '-----'
                          |
                          v
                  .---------------.
                  | Test::Harness |
                  '---------------'

You write F<foo.t> using Test::More and Test::Whatever.  These both
use the same Test::Builder object.  It spits out TAP which
Test::Harness converts into something human readable.

The big problem there is Test::Builder is monolithic.  There's no
further breakdown of responsibilities.  It only spits out TAP, and
only one version of TAP.


=head2 Test::Builder2

Here's what Test::Builder2 looks like:

                                  .-------.
                 .----------------| foo.t |----------------.
                 |                '-------'                |
                 |                    |                    |
                 |                    |                    |
                 v                    v                    v
          .------------.     .----------------.     .------------------.
          | Test::More |     | Test::Whatever |     | Test::NotUpdated |
          '------------'     '----------------'     '------------------'
                 |                |                       |
                 |                v                       v
                 |       .----------------.       .---------------.
                 '------>| Test::Builder2 |       | Test::Builder |
                         '----------------'       '---------------'
                                      |                  |
                                      v                  |
                               .-------------.           |
                               | TB2::Result |<----------'
                               '-------------'
                                      |
                                      v
        .--------------.    .-----------------------.    
        | TB2::History |<---| TB2::EventCoordinator |
        '--------------'    '-----------------------'
                                      |
    .--------------------------.      |       .---------------------.
    | TB2::Formatter::TAP::v13 |<-----'------>| TB2::Formatter::GUI |
    '--------------------------'              '---------------------'
                  |                                      |
                  v                                      |
  .-------------------------------.                      |
  | TB2::Formatter::Streamer::TAP |                      |
  '-------------------------------'                      |
                  |                                      |
                  v                                      |
               .-----.                                   |
               | TAP |                                   |
               '-----'                                   |
                  |                                      |
                  v                                      v
          .---------------.                     .-----------------.
          | Test::Harness |                     | Pretty Pictures |
          '---------------'                     '-----------------'

It starts out the same, F<foo.t> uses a bunch of test modules
including Test::More and Test::Whatever using the same Test::Builder2
object, but it also uses Test::NotUpdated which is still using
Test::Builder.  That's ok because Test::Builder has been rewritten in
terms of Test::Builder2 (more on that below).

Test::Builder2, rather than being a monolith, produces a
TB2::Result object for each assert run.  This gets handed
to the EventCoordinator which hands the Result off to the History,
Formatter and any other handlers.  Test::Builder and Test::Builder2
share the same EventCoordinator and thus the same Formatter and
History.

History records events and results for possible later use.  The
Formatter turns events and results into formatted output, by default
TAP.

Because Test::Builder2 is not monolithic, you can swap out parts.  For
example, instead of outputting TAP it could instead hand results to a
formatter that produced a simple GUI representation, maybe a green
bar, or something that hooks into a larger GUI.  Or maybe one that
produces JUnit XML.

=head2 How Test::Builder and Test::Builder2 Relate

        .-----.                                         .-----.
        | TB2 |                                         | TB1 |
        '-----'                                         '-----'
           |                                               |
           |                                               |
           v                                               v
    .-------------.    .-----------------------.     .-------------.
    | TB2::Result |--->| TB2::EventCoordinator |<----| TB2::Result |
    '-------------'    '-----------------------'     '-------------'
                                   |
                                   v
                          .----------------.
                          | TB2::Formatter |
                          '----------------'
                                   |
                                   v
                              .--------.
                              | Output |
                              '--------'

Test::Builder and Test::Builder2 coordinate their actions by sharing
the same EventCoordinator which carries around History and Formatter
objects.  If you call TB1->ok() it produces a Result object which it
hands to the EventCoordinator which hands it to its History and
Formatter objects.  If you call TB2->ok() it produces a Result object
which it hands to the same EventCoordinator.

This allows most of the Test::Builder code to remain the same while
still coordinating with Test::Builder2.  It also allows radically
different builders to be made without Test::Builder2 dictating how
they're to work.

If you want to add behaviors to the tests, rather than extending or
altering the Builder you add an EventHandler to the EventCoordinator.
This will be informed when thing happen in the test, can take action
and even alter the events and results.


=head1 Glossary

=head2 TB1

Refers to L<Test::Builder>.

=head2 TB2

For brevity's sake, Test::Builder2 will be referred to as B<TB2>.

=head2 builder

Refers to the TB1 or TB2 object central to Perl's testing system.  It
serves to abstract the details of writing a test library.  A builder
defines how asserts are written, contains the event coordinator and is
responsible for making sure its asserts pass along events to the
coordinator.

=head2 assert

An assertion is a single statement which is tested.  It corresponds to
a traditional C<ok> function call.

=head2 event

Anything which happens over the course of the test will generate an
event.  Examples include starting the test, changing the test plan,
the result of an assert, and so on.  Events fully describe the test
suite and they are the B<only> information used by formatters.

=head2 result

The information from a single assert.  Includes things like if it
passed or failed, if it had any directives, its file and line number,
and any additional user specified diagnostics.

A result is a special type of event.

=head2 event coordinator

This object gathers events from all the builders that wish to work
together in the same test suite and passes them along to its event
handlers.

=head2 event handler

An event handler is a thing which does things with events (yes, it's
supposed to be vague).  Examples include formatters and history.
Custom event handlers can also be used to alter the behavior of tests (for
example, die on assert failure) and even the event itself.

Event handlers get their events through an event coordinator.

=head2 directive

A flag which modifies the result in some way.  For example, C<skip>
says that the test was skipped.  C<todo> means the test was expected to
fail.

=head2 diagnostics

Additional structured information about a test result.  For example,
what file and line number it occurred on.  Users can add their own
diagnostics to a test result.

Test::Builder's "diagnostics" (output by C<diag()>) are actually
L<comments> which were used to provide diagnostics.  These were
unparsable.

=head2 comment

A piece of information in the test that may be parsed but is
ignored and has no effect on the result of the test.

=head2 note

A comment which is not normally shown to the user, put there for
debugging and informational purposes.

=head2 test

The output from a single test unit.  In a traditional Perl system this
is the output from a single F<.t> file.

=head2 suite

The complete set of tests run by a project.  In a traditional Perl
system this is all the F<.t> files.

=head2 formatter

Takes the abstract result and turns it into parsable output.  For
example, L<TB2::Formatter::TAP> turns test results into the
familiar TAP.

See L<TB2::Formatter> for details.

=head2 streamer

Takes the formatted text and outputs it, usually to STDOUT and
STDERR but it may instead capture it for debugging purposes, or send 
it as an email, or write it to a file, or do all of these things.

A formatter contains a streamer and use it to output its formatted
text.

See L<TB2::Streamer> for details.

=head2 TAP

Test Anything Protocol, the name for the usual C<ok 1> output you see
from most Perl tests.

=head2 test

An ambiguous term often used to mean an assert, a file full of
asserts or a suite.  We'll avoid using it without qualification.

=cut