NAME

Acme::Wabby - Create semi-random sentences based upon a body of text.

SYNOPSIS

  use Acme::Wabby qw(:errors);

  # Use the default options
  my $wabby = Acme::Wabby->new;

  # Pass in explicit options. (All options below are defaults)
  my $wabby = Acme::Wabby->new( min_len => 3, max_len => 30,
      punctuation => [".","?","!","..."], case_sensitive => 1,
      hash_file => "./wabbyhash.dat", list_file => "./wabbylist.dat",
      autosave_on_destroy => 0, max_attempts => 1000 );

  # Save the current state to the configured files
  $wabby->save;

  # Load a saved state from the configured files
  $wabby->load;

  # Add some text to the current state
  $wabby->add($the_complete_works_of_shakespeare);

  # Generate a random sentence
  print $wabby->spew, "\n";

  # Generate a random sentence, beginning with "Romeo and Juliet"
  print $wabby->spew("Romeo and Juliet"), "\n";

  # Produce a string containing some info about the current state
  print scalar($wabby->stats), "\n";

  # Produce a list containing the word count and average connection count
  my ($wordcount, $average) = $wabby->stats;
  print "Wabby knows $wordcount words, with an average number of "
      ."connections between each word of $average\n";

DESCRIPTION

This module creates semi-random sentences based on a body of text. It uses a Markov-like method of storing the probabilities of word transitions. It is good for annoying people on IRC, AIM, or other such fun media.
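As a rough illustration of what storing word transitions can look like, here is a minimal sketch in plain Perl. It is not Acme::Wabby's actual implementation: the learn() and babble() helpers and the %chain table are hypothetical names invented for this example, and the sketch ignores punctuation, case handling, and minimum sentence length.

```perl
use strict;
use warnings;

# Hypothetical helper: record, for every word, the words seen right after it.
# Keeping repeats in the list makes frequent transitions more likely to be picked.
sub learn {
    my ($chain, $text) = @_;
    my @words = split ' ', $text;
    for my $i (0 .. $#words - 1) {
        push @{ $chain->{ $words[$i] } }, $words[$i + 1];
    }
    return scalar @words;
}

# Hypothetical helper: walk the table from $start, picking each next word at
# random from the recorded followers, for at most $max_len words.
sub babble {
    my ($chain, $start, $max_len) = @_;
    my @out = ($start);
    while (@out < $max_len) {
        my $next = $chain->{ $out[-1] } or last;   # dead end: no known follower
        push @out, $next->[ int rand @$next ];
    }
    return join ' ', @out;
}

my %chain;
learn(\%chain, "the cat sat on the mat and the dog sat on the cat");
print babble(\%chain, "the", 8), "\n";
```

Because each follower list keeps duplicates, a word that often follows another is chosen proportionally more often, which is the "probabilities of word transitions" idea in its simplest form.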

Acme::Wabby only provides an object-oriented interface, and exports no symbols into the caller's namespace. Each object is self-contained, so there are no issues with creating and using multiple objects from within the same calling program.

Creating an object

To begin using Acme::Wabby you must first create a new object:

  my $wabby = Acme::Wabby->new(min_len => 3, max_len => 30,
      punctuation => [".","?","!","..."], case_sensitive => 1,
      hash_file => "./wabbyhash.dat", list_file => "./wabbylist.dat",
      autosave_on_destroy => 0, max_attempts => 1000 );

All configuration values passed to the object constructor are optional, and have sensible defaults. The following is a description of the parameters and their default values.

min_len

The minimum length for a generated sentence. (3)

max_len

The maximum length for a generated sentence. (30)

punctuation

A reference to an array containing possible punctuation with which to end sentences. ([".","?","!","..."])

case_sensitive

Whether or not to treat text in a case sensitive manner. (1)

hash_file

The file to/from which the hash data will be stored/loaded if requested. ("./wabbyhash.dat")

list_file

The file to/from which the list data will be stored/loaded if requested. ("./wabbylist.dat")

autosave_on_destroy

Whether or not to automatically save the state upon object destruction. (0)

max_attempts

The maximum number of attempts to create a sentence before giving up. (1000)

Adding text to the state

To have an amusing experience, you will need to feed the object a body of text. This text can come from virtually any source, although I enjoy using e-Texts from the good folks at Project Gutenberg (http://promo.net/pg). To add text to the state, simply call the add() method on the object, passing it a scalar containing the text.

  $wabby->add($complete_works_of_shakespeare);

It is acceptable for the input text to contain embedded newlines or other such things. It is acceptable to call the add() method many times, and at any point in the object's life-span. The add() method will return undef upon error, and true upon success.

Generating random sentences

Once you have some text loaded into the object, you can generate random sentences. To do this, we use the spew() method. The spew() method has two modes of operation: If no argument is given, it will generate and return a random sentence. If a single string is passed in, it will generate and return a random sentence beginning with the provided string.

  my $random_sentence = $wabby->spew;
  my $not_so_random_sentence = $wabby->spew("Romeo and Juliet");

The spew() method will return the generated string, or undef upon error. There are several error conditions which can occur in the spew() method. None of them are fatal, but they must be taken into account by the calling program. They are:

* Fewer than (min_len * 10) words have been added so far. (Must add() more text before trying again.)

* A string was passed in containing nothing. (Don't do that.)

* We don't know the last word in the string passed in, and can therefore not generate a sentence with it. (Either teach us about it with add(), or try something else.)

* A sentence of at least min_len words could not be generated, even after max_attempts tries at doing so. (Likely need to add() more text before trying again.)
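Since every one of these conditions shows up to the caller as an undef return value, a calling program can handle them in one place. The following is a minimal sketch: spew_or_fallback() is a hypothetical helper, not part of Acme::Wabby, and it relies only on the documented behavior that spew() returns undef on error. It also sidesteps the empty-string error by calling spew() without an argument when no usable seed is given.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Acme::Wabby: call spew() on the given
# object and substitute $fallback when spew() returns undef.
sub spew_or_fallback {
    my ($wabby, $seed, $fallback) = @_;

    # An empty seed is one of the documented error cases, so treat it
    # the same as "no seed" and ask for a fully random sentence.
    my $sentence = (defined $seed && length $seed)
        ? $wabby->spew($seed)
        : $wabby->spew;

    return defined $sentence ? $sentence : $fallback;
}
```

With an object created as shown earlier, this could be called as spew_or_fallback($wabby, "Romeo and Juliet", "I have nothing to say."). The helper cannot tell which of the error conditions occurred, so a program that wants to react differently (for example by calling add() with more text) would need its own logic for that.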

Saving / loading state

Acme::Wabby can save and load state to disk using the Storable module. To do this, simply use the save() and/or load() methods.

  $wabby->save;
  $wabby->load;

These methods take no arguments; they simply save or load the state to or from the file names which were defined when the object was created. Loading a saved state is much faster than re-parsing a large body of text.
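One way to take advantage of the faster load is to parse the source text only when no saved state exists yet. The sketch below assumes a hypothetical helper, load_or_build(), which is not part of Acme::Wabby. The file names passed to it must be the same hash_file and list_file values given to the constructor, because save() and load() take no arguments. The return values of save() and load() are not described here, so the sketch does not check them.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Acme::Wabby: reuse a saved state when both
# state files exist, otherwise parse $text and save the result for next time.
sub load_or_build {
    my ($wabby, $hash_file, $list_file, $text) = @_;

    if (-e $hash_file && -e $list_file) {
        $wabby->load;               # fast path: reuse the saved state
        return 'loaded';
    }

    $wabby->add($text) or return;   # add() returns undef on error
    $wabby->save;                   # write the state for later runs
    return 'built';
}
```

A caller using the default file names might write load_or_build($wabby, "./wabbyhash.dat", "./wabbylist.dat", $complete_works_of_shakespeare) and then go straight to spew().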

Getting statistics

Using the stats() method will provide you with some simple statistics about the current state of an object. When used in a scalar context, the stats() method will return a string containing a description of what the object knows. When used in a list context, it will return a list of two numbers. The first entry in the list is the number of words that the object knows. The second entry in the list is the average number of connections between words.

  my ($wordcount, $average) = $wabby->stats;
  print "count=$wordcount, average=$average\n";
  print scalar($wabby->stats), "\n";

BUGS

 * Uses a lot of memory (not so much a bug as an implementation quirk).

TODO

 * Be better about normalizing input text.
 * Fix English assumptions about single-letter words besides I and a.
 * See about making the parsing into phrases and words more configurable.
 * Investigate using longer-order chains to improve generation quality.
 * Try to use less memory!

AUTHOR

Nathan Poznick <kraken@wang-fu.org>

CREDITS

 nick@misanthropia.nu - for writing the original wabbylegs.pl
 Project Gutenberg - for providing free text to feed to Acme::Wabby.

COPYRIGHT

Copyright (c) 2004, Nathan Poznick. All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GPL version 2.