The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Math::ES - Evolution Strategy Optimizer

SYNOPSIS

  use Math::ES;

  # New ES object 
  my $es new Math::ES (
       'genes'  => [ -100,-50, 5, 200],
       'gene_deviations' => [ 1,1,1,1],
       'max_gene_values' => [ 500, 500, 500, 500],
       'min_gene_values' => [-500,-500,-500,-500],
       'rating_function' => \&function,
                       );

  my ($value1, $ra_genes1) = $es->start();   # Start the ES
  # ... doing some other things ...
  $es->run();                                # Run the ES again

  my @best_genes2 = $es->return_best_genes();
  my $best_value2 = $es->return_best_value();

  sub function {
      my $sum;
      foreach my $x (@_) {$sum += $x**2};
      return ($sum);
  }  

DESCRIPTION

The package Math::ES provides an object orientated Evolution Strategy (ES) for function minimization. It supports multiple populations, elitism, migration, isolation, two selection schemes and self-adapting step widths.

Historical background

Evolution Programs were invented in the 1960s in Germany and USA rather simultaneously, although their inventors intentions were different: John Holland wanted to study nature's form of optimization and tried to transfer the new insights from biological and biochemical investigations; thus he invented the Genetic Algorithm (GA). On the other side, the German engineer Ingo Rechenberg was interested in practical problem solutions, and his Evolution Strategies had been successfully applied for real world problems.

For a long time, the two algorthmic groups kept themselves separated from each other. Nowadays, many conceptual ideas are mixed, so that often there is just the naming of Evolution Programs.

However, although most people dealing with optimization know GAs as tools, but the ES is rather unknown. This is weird, because the GAs traditionally use a bit string for the variable representation (and have developped special codings to overcome some shortcommings of the traditional binary system), whereas ES uses real numbers, what is most often exactly what one wants.

General algorithm

Initialization

The ES object is initialized; all populations are created and filled with individuals that are mutated copies from the input genes.

Loop start

The evolutionary process starts, i.e. the main generation loop begins.

Create children, Cross over, Mutation

The defined number of children is created; for each child the requested number of parents is selected randomly.

Then for each of the child's genes it's decided randomly which of the parent gives its gene value (together with the variance parameter).

After that, each gene's variance is varied under the influence of the variance_mutator parameter.

Finally, the new gene value is created by adding a random number from a standard distribution around 0 modified with the stepwidth_const and stepwidth_var parameters.

Rate, rank and select children

The provided function is calulated for all children and they are ordered according to the function value (increasing order). Then the selection takes place, where either the n best or the n-1 best survive (cf below).

If elitism is in use, the elite individuals are not subjected to selection but simply saved to the next generation.

Migrate and mix populations

If migration is wanted, a defined number of migrators is interchanged cyclically between the populations, i.e. in three populations migrator from population1 wanders into population2, that from population2 jumps into population3 and the from population3 fills the gap in population1.

Finally, if mixing is enabled, and the requested number of isolation generations have passed all individuals from all populations are collected and then randomly redistributed over all populations again.

Loop end

After the specified number of generations the optimization stops and the results can be investigated. The ES may be run again with the same control parameters to refine the parameters (cd SYNOPSIS).

Constructor usage and options

The basic usage may be taken from the SYNOPSIS and is rather self-explanatory. However, in the following the default parameters are presented together with a more detailed description of their meaning.

Apart from the genes, gene_deviations, max_gene_values, min_gene_values and rating_function, which must be supplied by the user, the following constructor lists all attributes with their default values.

  my $es = new Math::ES (
        'debug' => 0,
        'log'   => 1,

        'log_file' => 'es-xx.log',               
        'dbg_file' => 'es-xx.dbg',               

        'individuals'          => 5,
        'parents'              => 2,
        'children'             => 10,
        'populations'          => 2,
        'selection_scheme'     => 1,
        'elite'                => 1,
        'generations'          => 50,
                               
        'stepwidth_const'      => 1,
        'stepwidth_var'        => 1.5,
        'variance_mutator'     => 0.5,
                               
        'migrators'            => 1,
        'isolation'            => 25,           
                               
        'genes'                => [],
        'max_gene_values'      => [],
        'min_gene_values'      => [],
        'gene_deviations'      => [],
        'max_gene_deviations'  => [],
        'min_gene_deviations'  => [],
        'rating_function'      => '',
                          );
debug

Debug flag for TONS of debug output; this goes to file es-1.dbg for first ES object.

log

If true the ES prints out the best genes and function values for each population and generation. The file name is es-1.log for first ES object.

log_file

The name of the log file. Unless overridden by user, this is automatically set to es-xx.log, with xx being an internal counter of how many ES objects have been created in total.

dbg_file

The name of the debug file; follows the same idea as the log file.

individuals

This determines the number of individuals in a population, i.e. the starting number and the number of children to survive.

parents

The number of individuals used to generate a child using cross over. If parents=1 then no cross over is performed, the genes are just copied from the parent; otherwise, for each gene and its deviation the respective parent is determined randomly.

children

The number of children in a population created in each generation. The smaller the ratio of individuals to children, the higher is the evolutionary pressure.

populations

The number of (independent) populations to be created.

selection_scheme

The selction scheme to be applied. There are two schemes implemented at the moment:

1: The n best children survive (n beeing the number of individuals).
2: The n-1 best children survive; the last child is choosen randomly.
elite

The number of best individuals to be kept apart from selection; e.g. if elite=1 this means that the best individuum out of the combination of parents and children is taken into the next generation.

generations

The number of cycles the optimization will run. Note that at the moment, this is the only ending criterion for an simulation.

stepwidth_const

The s_c parameter for determination of the mutation step (see previous section for details).

stepwidth_var

The s_v parameter for determination of the mutation step (see previous section for details).

variance_mutator

The parameter for mutating the gene variances.

migrators

The number of migrators (individuals) to be exchanged between isolated populations in each generation (after mutation and selection).

isolation

The number of generations the populations stay isolated (apart from possible migrators). If greater than zero, after the respecive number of generations all individuals are collected and randomly redistributed over the populations.

rating_function

This must be a reference to the rating function that takes the array of the genes as argument and returns the function value as simple scalar variable. The ES will try to minimize this function.

genes

Array of the parameters to be minimized.

gene_deviations

Array of the variances of the parameters used for the mutation.

max_gene_values and min_gene_values

Arrays of parameter boundaries.

'max_gene_deviations' and 'min_gene_deviations'

Arrays of the gene deviation boundaries. Unless set, the deviations may increase to rather unintended heights.

Other methods

The following object methods are available within the ES object:

start()

This method builds up the populations and then does the computation; it initializes and runs the ES. It returns the best function value found and a reference to an array with the best variables of the last generation.

In case of erroneous input, it return the error message.

run()

This method may be used for rerunning the ES (with the same parameters). Useful for refining (see subroutine test3 in test.pl). It returns the same as start().

In case of erroneous input, it return the error message.

return_best_genes()

Returns an array of the best set of variables within the last generation.

return_best_value()

Return the function value for the best set of variables of the last generation.

set_values()

At the moment, no real accessor methods for the single options are available. If one wants to change the options of an ES objects after its creation, one must reset the value directly:

    $es->set_values( 'generations' => 100 );

However, you can only change the control variables in that way. A change of the core parameters ( number of genes, gene_deviations, max_gene_values, min_gene_values or individuals; rating_function) requires a newly initialized ES object.

validate()

Do this before actual starting the ES. The method returns the error message(s) as single string, or and empty string if all is ok.

If you use start() or run(), validate() is called first internally and if there is an error, the first two methods return the error string.

PREREQUISITES

The current module depends on Math::Random from CPAN; thus this has to be in the module search path.

AUTHOR

Anselm H. C. Horn, Computer Chemie Centrum, Friedrich-Alexander-Universitaet Erlangen-Nuernberg

Anselm.Horn@chemie.uni-erlangen.de

http://www.ccc.uni-erlangen.de/clark/horn

COPYRIGHT

For this software the 'Artistic License' applies. See file LICENSE in the Math::ES distribution for details or visit http://www.opensource.org/licenses/artistic-license.php .

Cite this work in any scientific publication as

 Anselm H. C. Horn, 'ES - Evolution Strategy Optimizer', Version x.xx
  Erlangen 2003; http://www.cpan.org/authors/id/A/AH/AHCHORN/Math/ES/

Please also consider sending me a short note, maybe with the literature reference included.

VERSION

Main version number is 0.08.

$Revision: 1.27 $

SEE ALSO

perl(1).

Further reading and references:

[1] E. Schoeneburg, F. Heinmann, S. Feddersen

Genetische Algorithmen und Evolutionsstrategieen, Addison-Wesley, Bonn 1994.

[2] Z. Michalewicz

Genetic Algorithms + Data Structures = Evolution Programs, 3rd ed., Springer, Heidelberg 1996.

NO WARRANTY

There is NO WARRANTY for this software!

See COPYRIGHT for details.

5 POD Errors

The following errors were encountered while parsing the POD:

Around line 1062:

=back doesn't take any parameters, but you said =back 4

Around line 1161:

=back doesn't take any parameters, but you said =back 4

Around line 1223:

=back doesn't take any parameters, but you said =back 4

Around line 1279:

=back doesn't take any parameters, but you said =back 4

Around line 1336:

=back doesn't take any parameters, but you said =back 4