NAME

Algorithm::AM::Result - Store results of an AM classification

VERSION

version 3.05

SYNOPSIS

  use Algorithm::AM;

  my $am = Algorithm::AM->new('finnverb', -commas => 'no');
  my ($result) = $am->classify;
  print @{ $result->winners };
  print ${ $result->statistical_summary };

DESCRIPTION

This package encapsulates all of the classification information generated by "classify" in Algorithm::AM, including the assigned class, the score for each class, gang effects, analogical sets, and timing information. It also provides several methods for generating printable reports from this information.

Note that the words 'score' and 'point' are used here to represent whatever count is assigned by analogical modeling during classification. This can be either pointers or occurrences. For an explanation of this, see Algorithm::AM::algorithm.

All of the scores returned by the methods here are scalars with special PV and NV values. You should exercise caution when doing calculations with them. See Algorithm::AM::BigInt for more information.

REPORT METHODS

The methods below return human-readable reports about the classification. Each return value is a scalar reference, so it must be dereferenced for printing, like so:

 print ${ $result->statistical_summary };

config_info

Returns a scalar reference (string) containing information about the configuration at the time of classification. Information from the following accessors is included:

    exclude_nulls
    given_excluded
    cardinality
    test_in_train
    test_item
    count_method

statistical_summary

Returns a scalar reference (string) containing a statistical summary of the classification results. The summary includes all possible predicted classes with their scores and percentage scores, and the total score for all classes. If the test item had a known class, the summary also states whether the prediction was correct, incorrect, or a tie.

analogical_set

The analogical set is the set of items from the training set that had some effect on the item classification. The analogical effect of an item in the analogical set is the score it contributed towards a classification matching its own class label.

This method returns the items in the analogical set along with their analogical effects, in the following structure:

 { 'item_id' => {'item' => item, 'score' => score} }

Here, item is the actual item object. The item_id is used as the key so that the analogical effect of a particular item can be found quickly:

 my $set = $result->analogical_set;
 print "the item's analogical effect was "
     . $set->{$item->id}->{score};
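
As a minimal sketch of walking this structure, the following builds a stand-in hash by hand and lists its entries by descending score. The item ids, the plain-string items, and the integer scores are all invented for illustration; a real set comes from $result->analogical_set and holds item objects and the special score scalars described above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure returned by analogical_set.
# Real entries hold item objects and special score scalars; plain
# strings and integers are used here only for illustration.
my $set = {
    id1 => { item => 'first item',  score => 4 },
    id2 => { item => 'second item', score => 9 },
};

# Order the item ids by descending analogical effect, breaking
# ties by id so the output order is stable.
my @ids = sort {
    $set->{$b}{score} <=> $set->{$a}{score} || $a cmp $b
} keys %$set;

for my $id (@ids) {
    printf "%s: %s\n", $id, $set->{$id}{score};
}
```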

analogical_set_summary

Returns a scalar reference (string) containing the analogical set, meaning all items that contributed to the predicted class, along with the amount contributed by each item (score and percentage overall). Items are ordered by appearance in the data set.

gang_effects

Returns a hash describing gang effects. Gang effects are similar to analogical sets, but the total effects of entire subcontexts and supracontexts are also calculated and printed.

TODO: details, details! Maybe make a gang class to hold this structure.

gang_summary

Returns a scalar reference (string) containing the gang effects on the final class prediction.

A single boolean parameter can be provided to turn on list printing, meaning that the gang items are printed. This is false (off) by default.

CONFIGURATION INFORMATION

The following methods provide information about the configuration of AM at the time of classification.

exclude_nulls

Set to the value given by the same method of Algorithm::AM at the time of classification.

given_excluded

Set to the value given by the same method of Algorithm::AM at the time of classification.

cardinality

The number of features used during classification. If there were null feature values and "exclude_nulls" was set to true, then this number will be lower than the cardinality of the utilized data sets.

test_in_train

True if the test item was present among the training items.

test_item

Returns the item which was classified.

count_method

Returns either "linear" or "squared", indicating the setting used for computing analogical sets. See "linear" in Algorithm::AM.

training_set

Returns the data set which was the source of classification data.

RESULT DETAILS

The following methods provide information about the results of the classification.

result

If the class of the test item was known before classification, this returns "tie", "correct", or "incorrect", depending on the label assigned by the classification. Otherwise this returns undef.

random_outcome

This returns one of the class labels predicted for the test item. The choice is made probabilistically, with the probability of each value given by its normalized score.

For a given result object, the return value of this method never changes; the value is only chosen once.
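
The behaviour described above (a weighted draw, made once and then remembered) can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: pick_weighted, random_outcome_sketch, and the optional $rand_fn hook are hypothetical names, and plain integers stand in for the special score scalars.

```perl
use strict;
use warnings;

# Pick a key at random with probability proportional to its weight.
# $rand_fn is an optional hook (an assumption of this sketch) so the
# draw can be made deterministic; it defaults to Perl's built-in rand.
sub pick_weighted {
    my ($weights, $rand_fn) = @_;
    $rand_fn ||= sub { rand() };

    my $total = 0;
    $total += $_ for values %$weights;

    my $target = $rand_fn->() * $total;    # a point in [0, $total)
    my @labels = sort keys %$weights;      # fixed order for repeatability
    for my $label (@labels) {
        $target -= $weights->{$label};
        return $label if $target < 0;
    }
    return $labels[-1];                    # guard against rounding error
}

# Choose once and cache, so repeated calls give the same answer.
my %scores = (e => 4, r => 9);
my $cached;
sub random_outcome_sketch {
    $cached = pick_weighted(\%scores) unless defined $cached;
    return $cached;
}

print random_outcome_sketch(), "\n";
```

With these scores, 'e' is drawn with probability 4/13 and 'r' with probability 9/13.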

high_score

Returns the highest score assigned to any of the class labels.

scores

Returns a hash mapping all predicted classes to their scores.

scores_normalized

Returns a hash mapping each predicted class to its score divided by the total score for all classes. For example, if the "scores" method returns the following:

 {'e' => 4, 'r' => 9}

then this method would return the following (values below are rounded):

 {'e' => 0.3076923, 'r' => 0.6923077}
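
The arithmetic behind this example can be reproduced with a short sketch. The helper normalize_scores is a hypothetical name introduced here, and it works on plain numbers; the scores returned by this module are special scalars (see Algorithm::AM::BigInt), so this shows the calculation only, not the module's own code.

```perl
use strict;
use warnings;

# Divide each score by the sum of all scores.
sub normalize_scores {
    my ($scores) = @_;

    my $total = 0;
    $total += $_ for values %$scores;

    my %normalized = map { ($_ => $scores->{$_} / $total) } keys %$scores;
    return \%normalized;
}

my $normalized = normalize_scores({ e => 4, r => 9 });

# Print each class with its share of the total, to seven decimal places.
for my $class (sort keys %$normalized) {
    printf "%s => %.7f\n", $class, $normalized->{$class};
}
```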

winners

Returns an array ref containing the classes which had the highest score. There is more than one only if there is a tie for the highest score.

is_tie

Returns true if more than one class was assigned the high score.
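
The relationship between scores, high_score, winners, and is_tie can be illustrated with a small sketch over a plain hash of integer scores. The helper winners_sketch is a hypothetical name, and the code is an assumption about how such values relate, not the module's implementation.

```perl
use strict;
use warnings;

# Return an array ref of the classes sharing the highest score,
# sorted by name.
sub winners_sketch {
    my ($scores) = @_;

    # Find the highest score assigned to any class.
    my $high;
    for my $score (values %$scores) {
        $high = $score if !defined $high || $score > $high;
    }

    # Keep every class whose score equals that maximum.
    my @top = grep { $scores->{$_} == $high } keys %$scores;
    return [ sort @top ];
}

my $clear = winners_sketch({ e => 4, r => 9 });
my $tied  = winners_sketch({ e => 9, r => 9, x => 2 });

print "winners: @$clear\n";
print "tie between: @$tied\n" if @$tied > 1;    # more than one winner means a tie
```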

total_points

The total number of points assigned as scores across all contexts.

start_time

Returns the start time of the classification.

end_time

Returns the end time of the classification.

AUTHOR

Theron Stanford <shixilun@yahoo.com>, Nathan Glenn <garfieldnate@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Royal Skousen.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
