AI::Genetic::Pro::Macromolecule - Genetic Algorithms to evolve DNA, RNA and Protein sequences


version 0.09280.0_001


    use AI::Genetic::Pro::Macromolecule;

    my @proteins = ($seq1, $seq2, $seq3, ... );

    my $m = AI::Genetic::Pro::Macromolecule->new(
        type    => 'protein',
        fitness => \&hydrophobicity,
        initial_population => \@proteins,
    );

    sub hydrophobicity {
        my $seq = shift;
        my $score = f($seq);

        return $score;
    }

    $m->evolve(10); # evolve for 10 generations

    my $most_hydrophobic = $m->fittest->{seq};   # get the best sequence
    my $highest_score    = $m->fittest->{score}; # get top score

    # Want the score stats throughout generations?
    my $history = $m->history;

    my $mean_history = $history->{mean}; # [ mean1, mean2, mean3, ... ]
    my $min_history  = $history->{min};  # [ min1,  min2,  min3,  ... ]
    my $max_history  = $history->{max};  # [ max1,  max2,  max3,  ... ]


AI::Genetic::Pro::Macromolecule is a wrapper over AI::Genetic::Pro, aimed at easily evolving protein, DNA or RNA sequences using arbitrary fitness functions.

Its purpose is to allow optimization of macromolecule sequences using Genetic Algorithms, with as little setup time and burden as possible.

Standing atop AI::Genetic::Pro, it is reasonably fast and memory efficient. It is also highly customizable, although I've chosen what I think are sensible defaults for every parameter, so that you don't have to worry about them if you don't know what they mean.



The fitness attribute accepts a CodeRef that should assign a numeric score to each sequence string passed to it as an argument. Required.

    sub fitness {
        my $seq = shift;

        # Do something with $seq and return a score
        my $score = f($seq);

        return $score;
    }

    my $m = AI::Genetic::Pro::Macromolecule->new(
        fitness => \&fitness,
        # ... other attributes
    );
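As a concrete illustration of a fitness function for proteins, here is a minimal sketch that scores a sequence by its mean Kyte-Doolittle hydropathy. The choice of scale, the %KD table name and the decision to score unknown residues as 0 are assumptions made for this example; the module itself only requires that the CodeRef return a number.

```perl
use strict;
use warnings;

# Kyte-Doolittle hydropathy values, one per standard amino acid.
my %KD = (
    A =>  1.8, R => -4.5, N => -3.5, D => -3.5, C =>  2.5,
    Q => -3.5, E => -3.5, G => -0.4, H => -3.2, I =>  4.5,
    L =>  3.8, K => -3.9, M =>  1.9, F =>  2.8, P => -1.6,
    S => -0.8, T => -0.7, W => -0.9, Y => -1.3, V =>  4.2,
);

# Mean hydropathy of a protein sequence.
# Assumption for this sketch: residues missing from %KD count as 0.
sub hydrophobicity {
    my ($seq) = @_;
    $seq = uc( $seq // '' );
    return 0 unless length($seq);

    my $total = 0;
    for my $residue ( split //, $seq ) {
        $total += $KD{$residue} // 0;
    }

    return $total / length($seq);
}
```

A reference to this sub (\&hydrophobicity) could then be passed as the fitness attribute, as in the synopsis.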


The terminate attribute accepts a CodeRef, which is called once at the end of each generation. If it returns true, evolution stops, regardless of the number of generations passed to the evolve method.

The CodeRef should accept an AI::Genetic::Pro::Macromolecule object as argument, and should return either true or false.

    sub reached_max {
        my $m = shift;  # an AI::G::P::Macromolecule object

        my $highest_score = $m->fittest->{score};

        if ( $highest_score > 9000 ) {
            warn "It's over 9000!";
            return 1;
        }

        return 0;  # keep evolving
    }

    my $m = AI::Genetic::Pro::Macromolecule->new(
        terminate => \&reached_max,
        # ... other attributes
    );

In the above example, evolution will stop the moment the top score in any generation exceeds the value 9000.
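The same hook can express other stopping rules. The sketch below builds a callback that stops once the top score has stopped improving. make_plateau_check and its $patience parameter are hypothetical names invented for this example, and the code assumes that the history method (documented further down) already includes the generation that has just finished when the callback runs.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a terminate callback that reports true
# once the top score has not improved for $patience generations.
sub make_plateau_check {
    my ($patience) = @_;

    return sub {
        my $m   = shift;                      # any object exposing ->history
        my @max = @{ $m->history->{max} };

        return 0 if @max <= $patience;        # not enough generations yet

        my $previous_best = $max[ -1 - $patience ];
        my @recent        = @max[ -$patience .. -1 ];

        # Keep going if any recent generation beat the earlier best.
        return ( grep { $_ > $previous_best } @recent ) ? 0 : 1;
    };
}
```

Under those assumptions, passing make_plateau_check(5) as the terminate attribute would end the run after five generations without a new best score.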


Decide whether the sequences can have different lengths. Accepts a Bool value. Defaults to 1.


Manually set the allowed maximum length of the sequences, accepts Int.

This attribute is required unless an initial population is provided. In that case, if the length is not explicitly specified, it is set to the length of the longest sequence provided.


Macromolecule type: protein, dna, or rna. Required.


Sequences to add to the initial pool before evolving. Accepts an ArrayRef[Str].

    my $m = AI::Genetic::Pro::Macromolecule->new(
        initial_population => ['ACGT', 'CAAC', 'GTTT'],
        # ... other attributes
    );


Accepts a Bool value. When true, the score computed for each sequence is stored, to avoid costly and unnecessary recomputation. Set to 1 by default.
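To show what storing scores buys, here is a minimal sketch of the idea in plain Perl. It is not the module's own caching code; cached_fitness is a hypothetical wrapper written for this example, and the toy scoring function simply counts G characters.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps a fitness function so each distinct sequence
# is scored only once; later calls return the stored score.
sub cached_fitness {
    my ($fitness) = @_;
    my %score_for;

    return sub {
        my ($seq) = @_;
        $score_for{$seq} = $fitness->($seq) unless exists $score_for{$seq};
        return $score_for{$seq};
    };
}

# Toy scoring function that also counts how often it really runs.
my $calls   = 0;
my $slow    = sub { my ($seq) = @_; $calls++; return ( $seq =~ tr/G// ); };
my $fitness = cached_fitness($slow);

my @scores = map { $fitness->($_) } qw(GGAT GGAT ACGT GGAT);
# @scores holds (2, 2, 1, 2), while $slow has run only twice
```

Populations in a genetic algorithm tend to contain many repeated sequences, so the saving grows with the cost of the fitness function.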


Mutation rate, a Num between 0 and 1. Default is 0.05.


Crossover rate, a Num between 0 and 1. Default is 0.95.


Number of sequences per generation. Default is 300.


Number of parent sequences used in each recombination. Default is 2.


Defines how sequences are selected to crossover. It expects an ArrayRef:

    selection => [ $type, @params ]

See the AI::Genetic::Pro docs for details on the available selection strategies, their parameters, and their meanings. The default is Roulette: the best individuals (chromosomes) are selected first, and parents are then drawn from that collection with probability proportional to their fitness.
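The fitness-proportional draw at the heart of roulette selection can be sketched as follows. This is an illustration of the general technique only, not the code AI::Genetic::Pro runs, and it leaves out the preliminary step of selecting the best individuals. roulette_pick is a hypothetical name, and the sketch assumes all scores are non-negative with at least one positive.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Returns the index of the individual chosen by one spin of the wheel.
# $spin is a number in [0, 1); pass rand() in real use.
# Assumes non-negative scores with at least one positive value.
sub roulette_pick {
    my ( $scores, $spin ) = @_;

    my $target  = $spin * sum(@$scores);
    my $running = 0;

    for my $i ( 0 .. $#$scores ) {
        $running += $scores->[$i];
        return $i if $target < $running;
    }

    return $#$scores;    # guard against rounding at the upper edge
}

# With scores 1, 3 and 6 the slices cover 10%, 30% and 60% of the wheel.
my $picked = roulette_pick( [ 1, 3, 6 ], 0.5 );    # index 2
```

Taking the spin as an argument keeps the function deterministic, which makes it easy to check by hand.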


Defines the strategy used for the crossover operation. It expects an ArrayRef:

    strategy => [ $strategy, @params ]

See docs in AI::Genetic::Pro for details on available crossover strategies, parameters, and their meanings. Default is [ Points, 2 ], in which parents are crossed at 2 points and the best child is moved to the next generation.
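For readers unfamiliar with two-point crossover, the sketch below shows the operation on two equal-length strings. It illustrates the general technique rather than the implementation in AI::Genetic::Pro; two_point_crossover is a hypothetical name and the cut points are passed in explicitly instead of being drawn at random.

```perl
use strict;
use warnings;

# Swap the segment between positions $from and $to (0-based, $to exclusive)
# of two equal-length sequences and return both children.
sub two_point_crossover {
    my ( $mother, $father, $from, $to ) = @_;

    my ( $child_a, $child_b ) = ( $mother, $father );
    my $span = $to - $from;

    substr( $child_a, $from, $span ) = substr( $father, $from, $span );
    substr( $child_b, $from, $span ) = substr( $mother, $from, $span );

    return ( $child_a, $child_b );
}

my @children = two_point_crossover( 'AAAAAA', 'CCCCCC', 2, 4 );
# @children is ('AACCAA', 'CCAACC')
```

Keeping only the best child, as the default strategy is described to do, would then amount to scoring both children with the fitness function and retaining the higher one.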


Whether to inject the best sequences into the next generation, and if so, how many. Defaults to 5.




The evolve method evolves the sequence population for the specified number of generations. It accepts a single optional Int argument, $n. If $n is 0 or undef, evolution continues indefinitely, or until terminate returns true.


Returns the current generation number.


Returns an Array[HashRef] with the desired number of top scoring sequences. Each hash reference has two keys: 'seq', which points to the sequence string, and 'score', which points to the sequence's score.

    my @top_2 = $m->fittest(2);
    # (
    #     { seq => 'VIKP', score => 10 },
    #     { seq => 'VLKP', score => 9  },
    # )

When called with no arguments, it returns a HashRef with the top scoring sequence.

    my $fittest = $m->fittest;
    # { seq => 'VIKP', score => 10 }


Returns a HashRef with the minimum, maximum and mean score for each generation.

    my $history = $m->history;
    # {
    #     min  => [ 0, 0, 0, 1, 2, ... ],
    #     max  => [ 1, 2, 2, 3, 4, ... ],
    #     mean => [ 0.2, 0.3, 0.5, 1.5, 3, ... ],
    # }

To access the mean score for the $n-th generation, for instance:

    $m->history->{mean}->[$n - 1];


Returns a HashRef with the minimum, maximum and mean score for the current generation.

    # { min => 2, max => 10, mean => 3.5 }


Returns an Array[HashRef] with all the sequences of the current generation and their scores, in no particular order.

    my @seqs = $m->current_population;
    # (
    #     { seq => 'VIKP', score => 10 },
    #     { seq => 'VLKP', score => 9  },
    #     ...
    # )
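Records in this shape are easy to summarise by hand. As a sketch, the hypothetical score_stats function below computes the same kind of min/max/mean summary described above from a list of { seq, score } hashes; it is written for this example and is not part of the module.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Summarise a list of { seq => ..., score => ... } records.
# Returns nothing for an empty list.
sub score_stats {
    my @scores = map { $_->{score} } @_;
    return unless @scores;

    return {
        min  => min(@scores),
        max  => max(@scores),
        mean => sum(@scores) / scalar(@scores),
    };
}

my $stats = score_stats(
    { seq => 'VIKP', score => 10 },
    { seq => 'VLKP', score => 9 },
    { seq => 'AIKP', score => 5 },
);
# $stats is { min => 5, max => 10, mean => 8 }
```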


  Bruno Vecchi <vecchi.b>


This software is copyright (c) 2009 by Bruno Vecchi.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
