NAME

float.pl - Genetic programming front-end using Test::Float, StupidMarkov, and PPI

SYNOPSIS

float.pl is to Test::Float as prove is to Test::Harness. That is, float.pl is a command line interface to Test::Float.

  perl float.pl --help
  perl float.pl --learn /path/to/some/code
  perl float.pl --spew 20
  perl float.pl --code

WARNING! In the process of assimilating existing code and creating semi-random permutations from it, this script could easily come up with code that will ERASE YOUR DATA OR SEND INAPPROPRIATE PHOTOS TO YOUR INLAWS.

GLOSSARY

This distribution has a number of parts. It's useful to define them before getting into arguments and usage. It also ships with a demo.

Test::Float -- hacked-up Test::Harness that understands floating point test results.
float.pl -- this script; trains a Markov engine from code samples, generates semi-random snippets, and applies a simple genetic programming algorithm to the snippets, using floating point test results as the fitness measure
t/* -- internal tests that have to pass before cpanm or whatever will install Test::Float; these return ok/not ok; uninteresting
fitness-t/* -- genetic selection fitness tests that return floating point values
fitness-t/goo.t -- fitness tests that do some basic sanity checking, such as looking for code that passes a syntax check
fitness-t/logic.t -- fitness tests that inspect output on STDOUT; this test should be used as an example but otherwise REMOVED or ALTERED to test specifically for whatever you want float.pl --code to write code to do
goo.pl -- the primary output of float.pl --code; also the member of the current generation of genetic-Markov code samples currently being tested by float.pl --code; after float.pl --code finishes, the specimen with the best test score is left in place as goo.pl; the population of specimens exists primarily in memory
seq.pl -- an example starting program that outputs a (kind of) Fibonacci sequence of numbers; it contains a bug (with a comment)

seq.pl and fitness-t/logic.t, as shipped, are part of a demonstration in automatic bug repair. seq.pl attempts to compute (sort of) the Fibonacci sequence but contains a bug (with a comment marking it). fitness-t/logic.t tests for the correct output of the first three numbers in the (kind of, simplified) Fibonacci sequence. float.pl --code --from seq.pl should find and fix the bug in seq.pl, leaving a corrected version of seq.pl as goo.pl. float.pl is non-deterministic, so depending on luck, number of generations, and other parameters, it may or may not arrive at a solution.

ARGUMENTS

Here are the arguments:

        --learn <dir>         -- feed .pl and .pm files in a directory into the Markov engine
        --spew <n>            -- (test) output n successive tokens from the Markov engine
        --eval <str>          -- (test) in-context eval; changes to the corpus are saved on exit
        --code                -- write a program to satisfy tests

        --code options:

            --chainlength <n> -- number of tokens (program size) in each semi-randomly generated initial specimen OR:
            --from <fn.pl>    -- file to start with; implies learning from it as well as mutating it directly

            --generations <n> -- how many generations to run, max (stops early on a perfect score)

            --keep <n>        -- how many top performers of the previous generation to include in each new generation
            --breed <n>       -- how many children of the top performers to include in each new generation
            --mutate <n>      -- how many mutated children of the top performers to include in each new generation
            --new <n>         -- how many brand new, semi-random specimens to include in each new generation
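
As a rough illustration of how the four population options could combine, here is a minimal sketch of building one new generation. It is not float.pl's actual code: the function name next_generation, the option hash, and the $fresh callback (a stand-in for the Markov engine) are all hypothetical, and the crossover and mutation rules are the simplest ones that fit the descriptions above.

```perl
use strict;
use warnings;

# Hypothetical sketch, not float.pl's real internals.
# $ranked: array ref of specimens (each an array ref of tokens), best first.
# $opt:    hash ref with keep, breed, mutate, and new counts.
# $fresh:  code ref returning a brand new specimen (stand-in for the Markov engine).
sub next_generation {
    my ($ranked, $opt, $fresh) = @_;

    my $last = $opt->{keep} - 1;
    $last = $#$ranked if $last > $#$ranked;
    my @top = @{$ranked}[0 .. $last];
    die "need at least one top performer\n" unless @top;

    # --keep: top performers carried over unchanged (copied).
    my @next = map { [@$_] } @top;

    # --breed: single-point crossover between two random top performers.
    for (1 .. $opt->{breed}) {
        my ($mom, $dad) = ($top[rand @top], $top[rand @top]);
        my $cut = int rand(@$mom + 1);
        push @next, [ @{$mom}[0 .. $cut - 1], @{$dad}[$cut .. $#$dad] ];
    }

    # --mutate: copy a top performer and overwrite one random token.
    for (1 .. $opt->{mutate}) {
        my @child = @{ $top[rand @top] };
        my $donor = $fresh->();
        $child[rand @child] = $donor->[rand @$donor] if @child && @$donor;
        push @next, \@child;
    }

    # --new: brand new, semi-random specimens.
    push @next, $fresh->() for 1 .. $opt->{new};

    return \@next;
}
```

With keep, breed, mutate, and new set to 2, 3, 4, and 1, this returns ten specimens, the first two being copies of the two best from the previous generation.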

Acme::State is used to preserve program state between runs. If you tell it to --learn a directory, it'll remember everything it has seen in there until you remove your ~/float.pl.state file. This allows you to learn in one invocation and then generate code in another invocation.

--from uses a program you provide as one of the first generation of specimens. This is what you want if you're using Test::Float to try to fix a bug for you in existing code rather than writing code from scratch.

--code tells the thing to try to contrive a program that passes tests with the best score possible.

--code requires a unit test that returns floating point values between 0 and 1 (inclusive) rather than ok and not ok. Genetic code specimens that do better are favored for preservation and breeding for next generations. Creating tests that describe the code you want written is critical. These live in the fitness-t/ directory.
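
To make the idea of a graded result concrete, here is a minimal sketch of a scoring function that gives partial credit. It only computes the score. How a fitness test reports that number to Test::Float is not covered here, so check the shipped tests in fitness-t/ for the real format. The name score_output and the expected values 1, 1, 2 are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compare a specimen's captured STDOUT against the
# expected values, position by position, and return the fraction that
# match as a number between 0 and 1 (inclusive).
sub score_output {
    my ($output, @expected) = @_;
    return 0 unless @expected;

    my @got = grep { length } split /\s+/, $output;

    my $hits = 0;
    for my $i (0 .. $#expected) {
        $hits++ if defined $got[$i] && $got[$i] eq $expected[$i];
    }
    return $hits / @expected;
}

# Assumed expected values; the shipped fitness-t/logic.t may check others.
# Two of the three numbers match here, so the specimen earns partial credit.
printf "%.2f\n", score_output("1\n1\n3\n", 1, 1, 2);
```

A graded score like this gives the genetic algorithm something to climb: a specimen that gets two numbers right outranks one that gets none, even though both would be a plain "not ok".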

--code can be used in one of two basic ways. With a --from argument, it'll start from a pre-written script. It'll include an exact copy of that script in each generation, train the Markov engine from it, and generate an initial random population of specimens with a similar number of tokens to it.

Without --from, the specimens in the initial random population are --chainlength tokens each, or 20 by default.
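
For readers unfamiliar with Markov chains over tokens, here is a minimal first-order sketch of the learn-then-generate idea. It is an illustration only: float.pl uses StupidMarkov and tokenizes with PPI, while this sketch splits on whitespace and uses the hypothetical function names learn and spew.

```perl
use strict;
use warnings;

my %next;   # token => list of tokens seen immediately after it

# Record which token follows which. Whitespace splitting is a crude
# stand-in for a real Perl tokenizer.
sub learn {
    my @tokens = split ' ', shift;
    push @{ $next{ $tokens[$_] } }, $tokens[$_ + 1] for 0 .. $#tokens - 1;
}

# Emit $n tokens by repeatedly picking a random recorded follower.
sub spew {
    my ($n) = @_;
    my @starts = sort keys %next;
    return () unless @starts && $n > 0;

    my $tok = $starts[rand @starts];
    my @out = ($tok);
    while (@out < $n) {
        my $followers = $next{$tok} || [];
        # Dead end (token only seen at the end of the input): restart anywhere.
        $tok = @$followers ? $followers->[rand @$followers] : $starts[rand @starts];
        push @out, $tok;
    }
    return @out;
}

learn('my $a = 1 ; my $b = 1 ; print $a + $b ;');
print join(' ', spew(20)), "\n";
```

The output is a 20-token snippet that locally resembles the training code but is usually not a valid program, which is why the fitness tests start with basic sanity checks.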

Currently, you need to cd into the Test-Float-xx directory to use the --code operation, or else you need to copy or create a fitness-t/ directory with floating point tests. Either way, you need the fitness-t/ directory and fitness tests.

Two fitness test files ship with this thing, both in the fitness-t/ directory. The first fitness test, fitness-t/goo.t, has tests to see that the program is at least a reasonable length, passes syntax checks, isn't composed of too many comments, and a few other similar things. You may wish to keep this script as is, modify it, or extend it.

The other fitness test, fitness-t/logic.t, should be used as an example or demonstration only and then commented out, removed, or completely rewritten and adapted to the purpose at hand. As shipped, it tests for the first three numbers (kind of) in the Fibonacci sequence.

Numerous times each generation -- once for each specimen -- goo.pl is written out and the tests in fitness-t/ are run on it.

After --code mode finishes running, the best contender will be left in place in goo.pl.

BUGS

WARNING! In the process of assimilating existing code and creating semi-random permutations from it, this script could easily come up with code that will ERASE YOUR DATA OR SEND INAPPROPRIATE PHOTOS TO YOUR INLAWS.

I'm serious. This thing generates quasi-random code and then RUNS IT. This is STUPID.

In fact, this thing is STUPID in general -- nearly as stupid as your average ASU undergrad. Far more intelligent genetic programming systems exist.

This program should be in Acme::.

AUTHOR

Scott Walters, <scott@slowass.net>

COPYRIGHT AND LICENSE

Copyright (C) 2010 by Scott Walters

This library is not free software; you can redistribute it and/or modify it provided you take my name off of it and accept or disclaim all responsibility for the horrible things it will inevitably do. By using this program, you agree not to use this program.

THIS PROGRAM MAKES NO WARRANTY OF FITNESS FOR ANY PURPOSE, INCLUDING THE PURPOSE OF NOT DELETING ALL OF YOUR DATA. This program is stupid and if you run it, so are you.

Do not email me and ask me to clarify the copyright license so you may include it in Debian. Let me save you the trouble: you may NOT include this program in Debian. You can include Test::Float itself, but you may not include this program.