NAME
    Statistics::Contingency - Calculate precision, recall, F1, accuracy,
    etc.

SYNOPSIS
     use Statistics::Contingency;
     my $s = Statistics::Contingency->new(categories => \@all_categories);
 
     while (...something...) {
       ...
       $s->add_result($assigned_categories, $correct_categories);
     }
 
     print "Micro F1: ", $s->micro_F1, "\n"; # Access a single statistic
     print $s->stats_table; # Show several stats in table form

DESCRIPTION
    The `Statistics::Contingency' class helps you calculate several useful
    statistical measures based on 2x2 "contingency tables". I use these
    measures to help judge the results of automatic text categorization
    experiments, but they are useful in other situations as well.

    The general usage flow is to tally a whole bunch of results in the
    `Statistics::Contingency' object, then query that object to obtain the
    measures you are interested in. When all results have been collected,
    you can get a report on accuracy, precision, recall, F1, and so on, with
    both macro-averaging and micro-averaging over categories.

  Macro vs. Micro Statistics

    Every statistic offered by this module can be computed in two ways. In
    macro-averaging (specifically, macro-averaging with respect to
    category), the statistic is calculated for each category and the
    per-category values are then averaged. In micro-averaging, the counts
    for all decisions are pooled across categories and the statistic is
    calculated once from the pooled counts. The two procedures bias the
    results differently - micro-averaging tends to over-emphasize the
    performance on the largest categories, while macro-averaging
    over-emphasizes the performance on the smallest. It's often best to
    look at both of them to get a good idea of how your data distributes
    across categories.
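
    The difference between the two averages can be illustrated with a
    minimal pure-Perl sketch. It does not call this module; the category
    names, the counts, and the `precision' helper are made up for
    illustration, and returning 0 when nothing was assigned is an
    assumption, not necessarily how the module treats that case.

```perl
use strict;
use warnings;

# Hypothetical per-category counts (a, b, c, d as in a 2x2 contingency
# table); the numbers are invented for this example.
my %tables = (
    big   => { a => 90, b => 10, c => 10, d => 890 },
    small => { a => 1,  b => 4,  c => 4,  d => 991 },
);

# Precision is a/(a+b). Assumption: return 0 when nothing was assigned.
sub precision {
    my ($t) = @_;
    my $assigned = $t->{a} + $t->{b};
    return $assigned ? $t->{a} / $assigned : 0;
}

# Macro-average: one precision value per category, then the mean.
my @cats  = sort keys %tables;
my $macro = 0;
$macro += precision($tables{$_}) for @cats;
$macro /= @cats;

# Micro-average: pool the counts over all categories, then compute once.
my %pooled = (a => 0, b => 0, c => 0, d => 0);
for my $t (values %tables) {
    $pooled{$_} += $t->{$_} for qw(a b c d);
}
my $micro = precision(\%pooled);

printf "macro precision: %.3f\n", $macro;   # 0.550
printf "micro precision: %.3f\n", $micro;   # 0.867
```

    Here the small category scores poorly (1 of 5 assignments correct),
    which pulls the macro average down to 0.550, while the pooled counts
    are dominated by the large category and give 0.867.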

  Statistics available

    All of the statistics are calculated based on a so-called "contingency
    table", which looks like this:

                  Correct=Y   Correct=N
                +-----------+-----------+
     Assigned=Y |     a     |     b     |
                +-----------+-----------+
     Assigned=N |     c     |     d     |
                +-----------+-----------+

    a, b, c, and d are counts that reflect how the assigned categories
    matched the correct categories. Depending on whether a macro-statistic
    or a micro-statistic is being calculated, these numbers will be tallied
    per-category or for the entire result set.
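
    As a rough sketch of how the four cells get filled for one category,
    the following pure-Perl example tallies a handful of made-up results.
    The `tally' helper and the data are hypothetical and are not taken
    from the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: update one category's table for a single result.
# $assigned and $correct are hash references keyed by category name.
sub tally {
    my ($table, $category, $assigned, $correct) = @_;
    my $is_assigned = $assigned->{$category} ? 1 : 0;
    my $is_correct  = $correct->{$category}  ? 1 : 0;
    my $cell = $is_assigned ? ($is_correct ? 'a' : 'b')
                            : ($is_correct ? 'c' : 'd');
    $table->{$cell}++;
    return $cell;
}

my %table = (a => 0, b => 0, c => 0, d => 0);

# Made-up results: [ assigned categories, correct categories ]
my @results = (
    [ { sports => 1 },               { sports => 1 } ],                # a
    [ { sports => 1, finance => 1 }, { finance => 1 } ],               # b
    [ { finance => 1 },              { sports => 1 } ],                # c
    [ { finance => 1 },              { finance => 1 } ],               # d
    [ { sports => 1 },               { sports => 1, finance => 1 } ],  # a
);

# Build the table for the 'sports' category only.
tally(\%table, 'sports', @$_) for @results;

print join(', ', map { "$_=$table{$_}" } qw(a b c d)), "\n";
# prints "a=2, b=1, c=1, d=1"
```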

    The following statistics are available:

    * accuracy
        This measures the portion of all decisions that were correct
        decisions. It is defined as `(a+d)/(a+b+c+d)'. It falls in the range
        from 0 to 1, with 1 being the best score.

        Note that macro-accuracy and micro-accuracy will always give the
        same number.

    * error
        This measures the portion of all decisions that were incorrect
        decisions. It is defined as `(b+c)/(a+b+c+d)'. It falls in the range
        from 0 to 1, with 0 being the best score.

        Note that macro-error and micro-error will always give the same
        number.

    * precision
        This measures the portion of the assigned categories that were
        correct. It is defined as `a/(a+b)'. It falls in the range from 0 to
        1, with 1 being the best score.

    * recall
        This measures the portion of the correct categories that were
        assigned. It is defined as `a/(a+c)'. It falls in the range from 0
        to 1, with 1 being the best score.

    * F1
        This measures an even combination of precision and recall. It is
        defined as `2*p*r/(p+r)'. In terms of a, b, and c, it may be
        expressed as `2a/(2a+b+c)'. It falls in the range from 0 to 1, with
        1 being the best score.
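
    The definitions above can be checked by hand with a short pure-Perl
    sketch. The counts are made up, the code does not use this module,
    and it assumes all denominators are non-zero.

```perl
use strict;
use warnings;

# Made-up counts for a single contingency table.
my %t = (a => 40, b => 10, c => 20, d => 30);
my $total = $t{a} + $t{b} + $t{c} + $t{d};

my $accuracy  = ($t{a} + $t{d}) / $total;
my $error     = ($t{b} + $t{c}) / $total;
my $precision = $t{a} / ($t{a} + $t{b});
my $recall    = $t{a} / ($t{a} + $t{c});

# The two equivalent forms of F1 given in the text.
my $f1_pr     = 2 * $precision * $recall / ($precision + $recall);
my $f1_counts = 2 * $t{a} / (2 * $t{a} + $t{b} + $t{c});

printf "accuracy=%.3f error=%.3f\n", $accuracy, $error;
printf "precision=%.3f recall=%.3f\n", $precision, $recall;
printf "F1 (from p, r)=%.3f  F1 (from counts)=%.3f\n", $f1_pr, $f1_counts;
```

    With these counts accuracy is 0.700, error is 0.300, precision is
    0.800, recall is 0.667, and both forms of F1 give 0.727.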

    The F1 measure is often the only simple measure that is worth trying to
    maximize on its own - consider the fact that you can get a perfect
    precision score by always assigning zero categories, or a perfect recall
    score by always assigning every category. A truly smart system will
    assign the correct categories and only the correct categories,
    maximizing precision and recall at the same time, and therefore
    maximizing the F1 score.
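
    To see why recall alone is easy to game, here is a small pure-Perl
    sketch of the "assign every category" strategy, with counts pooled
    over all categories. The categories and documents are invented, and
    the code does not use this module.

```perl
use strict;
use warnings;

my @categories = qw(sports finance politics);

# Made-up data: the single correct category of each of four documents.
my @correct = qw(sports finance sports politics);

# Strategy under test: assign every category to every document.
my %t = (a => 0, b => 0, c => 0, d => 0);
for my $truth (@correct) {
    for my $cat (@categories) {
        # Everything is assigned, so only cells a and b can grow.
        $t{ $cat eq $truth ? 'a' : 'b' }++;
    }
}

my $precision = $t{a} / ($t{a} + $t{b});
my $recall    = $t{a} / ($t{a} + $t{c});
my $f1        = 2 * $t{a} / (2 * $t{a} + $t{b} + $t{c});

printf "precision=%.3f recall=%.3f F1=%.3f\n", $precision, $recall, $f1;
# prints "precision=0.333 recall=1.000 F1=0.500"
```

    Recall is perfect, but the eight wrong assignments drag precision
    down, and F1 reflects that.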

    Sometimes it's worth trying to maximize the accuracy score, but accuracy
    (and its counterpart error) are considered fairly crude scores that
    don't give much information about the performance of a categorizer.

METHODS
    The general execution flow when using this class is to create a
    `Statistics::Contingency' object, add a bunch of results to it, and then
    report on the results.

    * $e = Statistics::Contingency->new()
        Returns a new `Statistics::Contingency' object. Expects a
        `categories' parameter specifying the entire set of categories that
        may be assigned during this experiment. Also accepts a `verbose'
        parameter - if true, some diagnostic status information will be
        displayed when certain actions are performed.

    * $e->add_result($assigned_categories, $correct_categories, $name)
        Adds a new result to the experiment. The lists of assigned and
        correct categories can be given as an array of category names
        (strings), as a hash whose keys are the category names and whose
        values are anything logically true, or as a single string if there
        is only one category.

        If you've already got the lists in hash form, this will be the
        fastest way to pass them. Otherwise, the current implementation will
        convert them to hash form internally in order to make its
        calculations efficient.

        The `$name' parameter is an optional name for this result. It will
        only be used in error messages or debugging/progress output.

        In the current implementation, we only store the contingency tables
        per category, as well as a table for the entire result set. This
        means that you can't recover information about any particular single
        result from the `Statistics::Contingency' object.
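
        The three accepted input forms can all be reduced to a hash. The
        following is only a sketch of one way to do that conversion; the
        `to_hash' helper is hypothetical and is not the module's actual
        code.

```perl
use strict;
use warnings;

# Hypothetical helper: normalize a category list given as a hash
# reference, an array reference, or a single string into a hash reference
# keyed by category name.
sub to_hash {
    my ($cats) = @_;
    return $cats                           if ref $cats eq 'HASH';
    return +{ map { ($_ => 1) } @$cats }   if ref $cats eq 'ARRAY';
    return +{ $cats => 1 };                # a single category name
}

for my $input ('sports', [qw(sports finance)], { finance => 1 }) {
    my $h = to_hash($input);
    print join(',', sort keys %$h), "\n";
}
# prints "sports", then "finance,sports", then "finance"
```

        A hash reference is returned unchanged, which is consistent with
        hash form being the cheapest input to pass.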

    * $e->set_entries($a, $b, $c, $d)
        If you don't wish to use the `add_result()' interface, but still
        want to take advantage of the calculation methods and the various
        edge cases they handle, you can directly set the four elements of
        the contingency table with this method.

    * $e->micro_accuracy
        Returns the micro-averaged accuracy for the data set.

    * $e->micro_error
        Returns the micro-averaged error for the data set.

    * $e->micro_precision
        Returns the micro-averaged precision for the data set.

    * $e->micro_recall
        Returns the micro-averaged recall for the data set.

    * $e->micro_F1
        Returns the micro-averaged F1 for the data set.

    * $e->macro_accuracy
        Returns the macro-averaged accuracy for the data set.

    * $e->macro_error
        Returns the macro-averaged error for the data set.

    * $e->macro_precision
        Returns the macro-averaged precision for the data set.

    * $e->macro_recall
        Returns the macro-averaged recall for the data set.

    * $e->macro_F1
        Returns the macro-averaged F1 for the data set.

    * $e->stats_table
        Returns a string combining several statistics in one graphic table.
        Because accuracy is 1 minus error, only error is reported, as it
        takes less space to print. An optional argument specifies the
        number of significant digits to show in the data - the default is
        3 significant digits.

    * $e->category_stats
        Returns a hash reference whose keys are the names of each category,
        and whose values are hash references containing the statistical
        measures (accuracy, error, precision, recall, and F1) for that
        category. For example, to print a single statistic:

         print $e->category_stats->{sports}{recall}, "\n";

        Or to print certain statistics for all categories:

         my $stats = $e->category_stats;
         while (my ($cat, $value) = each %$stats) {
           print "Category '$cat': \n";
           print "  Accuracy: $value->{accuracy}\n";
           print "  Precision: $value->{precision}\n";
           print "  F1: $value->{F1}\n";
         }

AUTHOR
    Ken Williams <kwilliams@cpan.org>

COPYRIGHT
    Copyright 2002-2008 Ken Williams. All rights reserved.

    This distribution is free software; you can redistribute it and/or
    modify it under the same terms as Perl itself.