NAME
    Algorithm::AM - Classify data with Analogical Modeling

VERSION
    version 3.05

SYNOPSIS
     use Algorithm::AM;
     my $dataset = dataset_from_file(path => 'finnverb', format => 'nocommas');
     my $am = Algorithm::AM->new(training_set => $dataset);
     my $result = $am->classify($dataset->get_item(0));
     print @{ $result->winners };
     print ${ $result->statistical_summary };

DESCRIPTION
    This module provides an object-oriented interface for classifying single
    items using the analogical modeling algorithm. To work with sets of
    items needing to be classified, see Algorithm::AM::Batch.

    This module logs information using Log::Any, so if you want automatic
    print-outs you need to set an adaptor. See the "classify" method for
    more information on logged data.
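
    As a minimal sketch of setting an adaptor, the snippet below sends
    messages at the info level and above to STDERR using the stock Stderr
    adaptor that ships with Log::Any. The choice of adaptor and level is
    only an assumption for illustration; any Log::Any adaptor should work.

```perl
use strict;
use warnings;
use Log::Any qw($log);
use Log::Any::Adapter;

# Route all Log::Any messages at level "info" or above to STDERR.
Log::Any::Adapter->set('Stderr', log_level => 'info');

# Any module that logs through Log::Any (Algorithm::AM included) is now
# printed; this line just demonstrates that the adaptor is active.
$log->info('adaptor is active');
```

    Set the adaptor before calling "classify" so that the configuration
    and warning messages are not silently discarded.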

BACKGROUND AND TERMINOLOGY
    Analogical Modeling (or AM) was developed as an exemplar-based approach
    to modeling language usage, and has also been found useful in modeling
    other "sticky" phenomena. AM is especially suited to this because it
    predicts probabilistic occurrences instead of assigning static labels to
    instances.

    The AM algorithm can be called a probabilistic
    <http://en.wikipedia.org/wiki/Probabilistic_classification>,
    instance-based <http://en.wikipedia.org/wiki/Instance-based_learning>
    classifier. However, the probabilities given for each classification are
    not degrees of certainty, but actual probabilities of occurring in real
    usage. Thus in AM literature the classification is supposed to produce
    dynamic "outcomes", not static "labels". In AM proper, the last step of
    classification is to produce an outcome at random based on the
    calculated probability distribution. AM therefore predicts that "sticky"
    phenomena are "sticky" because they vary probabilistically, defying
    absolute prediction.
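
    The random final step can be sketched in plain Perl as follows. This
    is not the module's own code; "pick_outcome" and the example
    distribution are hypothetical names and values chosen for
    illustration.

```perl
use strict;
use warnings;

# Pick an outcome from a probability distribution, given as a hash of
# outcome => probability whose values sum to 1. $roll is a number in
# [0, 1); it defaults to rand() and is a parameter only so that the
# sketch can be exercised deterministically.
sub pick_outcome {
    my ($distribution, $roll) = @_;
    $roll = rand() unless defined $roll;
    my $cumulative = 0;
    my $last;
    for my $outcome (sort keys %$distribution) {
        $cumulative += $distribution->{$outcome};
        $last = $outcome;
        return $outcome if $roll < $cumulative;
    }
    # Guard against floating-point rounding leaving $roll above the sum.
    return $last;
}

# Hypothetical distribution: "e" with probability 4/13, "r" with 9/13.
my %distribution = (e => 4 / 13, r => 9 / 13);
print pick_outcome(\%distribution), "\n";
```

    Repeated calls return "r" about nine times in thirteen, rather than
    always reporting the most probable outcome.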

    In this software, an outcome can be chosen probabilistically using
    "random_outcome" in Algorithm::AM::Result. However, in practice, usually
    only the highest-probability prediction(s) are used for classification
    tasks. These can be retrieved via "winners" in Algorithm::AM::Result, or
    "result" in Algorithm::AM::Result if you're just interested in
    classification accuracy on a test set. The entire outcome probability
    distribution can also be retrieved via "scores_normalized" in
    Algorithm::AM::Result. See Algorithm::AM::Result for other types of
    information available after classification. See Algorithm::AM::algorithm
    for details on the actual mechanism of classification.

    Outside of the "random_outcome" method mentioned above, the rest of the
    software uses more general machine learning terminology. What would
    properly be called an "exemplar" is referred to simply as an "item",
    and, as is customary, "training" and "test" sets are used, even though
    AM never does any actual "training". Training items are assigned "class
    labels" (not "outcomes"), and classification results in a set of scores
    (or probabilities) for different "class labels", even though they would
    properly be called "outcomes". Finally, items contain vectors of
    "features", which were called "variables" in previous versions of this
    software.

EXPORTS
    When this module is imported, it also imports the following:

    Algorithm::AM::Result
    Algorithm::AM::DataSet
        Also imports "dataset_from_file" in Algorithm::AM::DataSet.

    Algorithm::AM::DataSet::Item
        Also imports "new_item" in Algorithm::AM::DataSet::Item.

    Algorithm::AM::BigInt
        Also imports "bigcmp" in Algorithm::AM::BigInt.

METHODS
  "new"
    Creates a new instance of an analogical modeling classifier. This method
    takes named parameters which set state described in the documentation
    for the relevant methods. The only required parameter is "training_set",
    which should be an instance of Algorithm::AM::DataSet, and which defines
    the set of items used for training during classification. All of the
    accepted parameters are listed below:

    "training_set"
    "exclude_nulls"
    "exclude_given"
    "linear"
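
    A sketch of a constructor call that names every parameter is shown
    below. The two-item training set is invented for illustration, and
    building it in memory with Algorithm::AM::DataSet's "new" and
    "add_item" is an assumption based on that class's documentation,
    which should be consulted for the authoritative interface.

```perl
use strict;
use warnings;
use Algorithm::AM;

# A two-item training set; the feature values and class labels are
# invented for illustration.
my $training_set = Algorithm::AM::DataSet->new(cardinality => 3);
$training_set->add_item(new_item(features => [qw(a b c)], class => 'x'));
$training_set->add_item(new_item(features => [qw(a b d)], class => 'y'));

my $am = Algorithm::AM->new(
    training_set  => $training_set,   # required
    exclude_nulls => 1,               # ignore null features in the test item
    exclude_given => 1,               # drop the test item from training if present
    linear        => 0,               # count pointers (quadratic)
);

# The boolean options can also be changed after construction.
$am->linear(1);
print 'linear counting: ', ($am->linear ? 'on' : 'off'), "\n";
```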

  "training_set"
    Returns (but will not set) the dataset used for training. This is an
    instance of Algorithm::AM::DataSet.

  "exclude_nulls"
    Get/set a boolean value indicating whether features with null values in
    the test item should be ignored. If false, they will be treated as
    having a specific value representing null. Defaults to true.

  "exclude_given"
    Get/set a boolean value indicating whether the test item should be
    removed from the training set if it is found there during
    classification. Defaults to true.

  "linear"
    Get/set a boolean value indicating whether the analogical set should be
    computed using *occurrences* (linearly) or *pointers* (quadratically).
    See Algorithm::AM::algorithm for an explanation of the difference. A
    true value selects linear counting and a false value selects quadratic
    counting. Defaults to false.
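
    The difference between the two counting methods can be sketched in
    plain Perl. The sketch assumes that the homogeneous supracontexts
    have already been found and are given as lists of class labels; it
    is a simplification for illustration, not the module's
    implementation, and "score_classes" is a hypothetical name. Under
    linear counting each item in a supracontext contributes one
    occurrence; under quadratic counting each item receives one pointer
    from every item in the same supracontext.

```perl
use strict;
use warnings;

# Score class labels over a list of homogeneous supracontexts, each
# given as an array reference of the class labels of its items.
sub score_classes {
    my ($supracontexts, $linear) = @_;
    my %scores;
    for my $labels (@$supracontexts) {
        my $size = @$labels;
        # Linear: one occurrence per item.
        # Quadratic: one pointer from each of the $size items.
        my $weight = $linear ? 1 : $size;
        $scores{$_} += $weight for @$labels;
    }
    return \%scores;
}

# Hypothetical supracontexts for a small data set.
my @supracontexts = ([qw(e r)], [qw(r)], [qw(e r)], [qw(r r)]);

my $quadratic = score_classes(\@supracontexts, 0);
my $linear    = score_classes(\@supracontexts, 1);
print "quadratic: e=$quadratic->{e} r=$quadratic->{r}\n";   # e=4 r=9
print "linear:    e=$linear->{e} r=$linear->{r}\n";         # e=2 r=5
```

    Quadratic counting gives larger supracontexts more influence,
    because the weight of each item grows with the number of items that
    share its supracontext.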

  "classify"
      $am->classify(new_item(features => ['a','b','c']));

    Using the analogical modeling algorithm, this method classifies the
    input test item and returns a Result object.

    Log::Any is used for logging. The full classification configuration is
    logged at the info level. A message is logged at the warning level if
    no training items can be compared with the test item, which prevents
    any classification.
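
    An end-to-end sketch of "classify" follows. The five training items
    and the test item are a small invented data set, and building the
    training set in memory with Algorithm::AM::DataSet's "new" and
    "add_item" is an assumption based on that class's documentation.
    Only the "winners" and "statistical_summary" accessors shown in the
    SYNOPSIS are used on the result.

```perl
use strict;
use warnings;
use Algorithm::AM;

# Invented training data: three features per item plus a class label.
my @training = (
    [[qw(3 1 0)], 'e'],
    [[qw(3 1 1)], 'r'],
    [[qw(2 1 2)], 'r'],
    [[qw(0 3 2)], 'r'],
    [[qw(2 1 0)], 'r'],
);

my $training_set = Algorithm::AM::DataSet->new(cardinality => 3);
for my $entry (@training) {
    my ($features, $class) = @$entry;
    $training_set->add_item(new_item(features => $features, class => $class));
}

my $am = Algorithm::AM->new(training_set => $training_set);

# Classify an unlabeled test item with the same number of features.
my $result = $am->classify(new_item(features => [qw(3 1 2)]));

# "winners" is an array reference; "statistical_summary" is a
# reference to a printable string.
print "Winners: @{ $result->winners }\n";
print ${ $result->statistical_summary };
```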

HISTORY
    Initially, Analogical Modeling was implemented as a Pascal program.
    Subsequently, it was ported to Perl, with substantial improvements made
    in 2000. In 2001, the core of the algorithm was rewritten in C, while
    the parsing, printing, and statistical routines remained in Perl; this
    was accomplished by embedding a Perl interpreter into the C code.

    In 2004, the algorithm was again rewritten, this time in order to handle
    more features and large data sets. The algorithm breaks the
    supracontextual lattice into the direct product of four smaller ones,
    which the algorithm manipulates individually before recombining. These
    lattices can be manipulated in parallel when using the right hardware,
    and so the module was named "AM::Parallel". This implementation was
    written with the core lattice-filling algorithm in XS, and hooks were
    provided to help the user create custom reports and control
    classification dynamically.

    The present version has been renamed to "Algorithm::AM", which seemed a
    better fit for CPAN. While the XS has largely remained intact, the Perl
    code has been completely reorganized and updated to be both more
    "modern" and modular. Most of the functionality of "AM::Parallel"
    remains.

SEE ALSO
    The home page <http://humanities.byu.edu/am/> for Analogical Modeling
    includes information about current research and publications, as well as
    sample data sets.

    The Wikipedia article <http://en.wikipedia.org/wiki/Analogical_modeling>
    has details and even illustrations on analogical modeling.

SUPPORT
  Bugs / Feature Requests
    Please report any bugs or feature requests through the issue tracker at
    <https://github.com/garfieldnate/Algorithm-AM/issues>. You will be
    notified automatically of any progress on your issue.

  Source Code
    This is open source software. The code repository is available for
    public review and contribution under the terms of the license.

    <https://github.com/garfieldnate/Algorithm-AM>

      git clone https://github.com/garfieldnate/Algorithm-AM.git

AUTHOR
    Theron Stanford <shixilun@yahoo.com>, Nathan Glenn
    <garfieldnate@gmail.com>

CONTRIBUTORS
    *   garfieldnate <garfieldnate@gmail.com>

    *   Nathan Glenn <garfieldnate@gmail.com>

    *   Nick <nlogan@gmail.com>

COPYRIGHT AND LICENSE
    This software is copyright (c) 2013 by Royal Skousen.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.