Module Version: 0.03
Statistics::Shannon - Shannon index

The object-oriented interface:

use Statistics::Shannon;
# The constructor is inherited from Statistics::Frequency.
my $pop = Statistics::Shannon->new(@data);
my $pop = Statistics::Shannon->new(\@data);
my $pop = Statistics::Shannon->new(\%data);
my $pop = Statistics::Shannon->new($another);
# The Shannon index and the Shannon evenness.
# The default base uses natural logarithm.
print $pop->index, "\n";
print $pop->index($base), "\n";
print $pop->evenness, "\n";
print $pop->evenness($base), "\n";

The "anonymous" interface, where the population data is not a Statistics::Frequency object but either an array reference, in which case the array elements are the frequencies, or a hash reference, in which case the hash values are the frequencies:

use Statistics::Shannon;
print Statistics::Shannon::index([ data ]), "\n";
print Statistics::Shannon::index([ data ], $base), "\n";
print Statistics::Shannon::index({ data }), "\n";
print Statistics::Shannon::index({ data }, $base), "\n";
print Statistics::Shannon::evenness([ data ]), "\n";
print Statistics::Shannon::evenness([ data ], $base), "\n";
print Statistics::Shannon::evenness({ data }), "\n";
print Statistics::Shannon::evenness({ data }, $base), "\n";

The rest of the data manipulation interface is inherited from Statistics::Frequency; see Statistics::Frequency.

$pop->add_data(@more_data);
$pop->add_data(\@more_data);
$pop->add_data(\%more_data);
$pop->add_data($another);
$pop->remove_data(@less_data);
$pop->remove_data(\@less_data);
$pop->remove_data(\%less_data);
$pop->remove_data($another);
$pop->copy_data($another);
$pop->clear_data();

The Statistics::Shannon module computes the Shannon index of data, which is a measure of the variability of the data.

The index() and evenness() interfaces are the only genuine interfaces of this module; the constructor and the rest of the data manipulation interface are inherited from Statistics::Frequency.

The Shannon index is also known as the Shannon-Wiener index and as the Shannon-Weaver index, especially when applied to biology and ecology and when talking about populations and biodiversity.

my $pop = Statistics::Shannon->new(@data);
my $pop = Statistics::Shannon->new(\@data);
my $pop = Statistics::Shannon->new(\%data);
my $pop = Statistics::Shannon->new($another);

Creates a new Shannon object from the initial data.

The data may be either a list, a reference to an array or a reference to a hash.

- If the data is a list (or an array), the list elements are counted to find out their frequencies.
- If the data is a reference to an array, the array elements are counted to find out their frequencies.
- If the data is a reference to a hash, the hash keys are the data elements and the hash values are the data frequencies.
- If the data is another Statistics::Shannon object, its frequencies are used.
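
As a rough illustration of the list, array reference and hash reference cases above, the following plain-Perl sketch turns each argument form into a hash of frequencies. The helper name to_frequencies is hypothetical; it is not part of Statistics::Shannon or Statistics::Frequency, and the module's own implementation may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's code: normalize the accepted
# argument forms into a hash reference of element => frequency.
sub to_frequencies {
    my @args = @_;
    my %freq;
    if (@args == 1 && ref($args[0]) eq 'HASH') {
        # Hash reference: keys are the elements, values the frequencies.
        %freq = %{ $args[0] };
    }
    else {
        # Array reference or plain list: count each element.
        my @elems = (@args == 1 && ref($args[0]) eq 'ARRAY')
                  ? @{ $args[0] }
                  : @args;
        $freq{$_}++ for @elems;
    }
    return \%freq;
}

my $f = to_frequencies(qw(a b b c c c));
print join(", ", map { "$_=$f->{$_}" } sort keys %$f), "\n";   # a=1, b=2, c=3
```

The fourth case, another object, is left out because it depends on the object's internals, which are not described here.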

$pop->index;
$pop->index($base);

Return the Shannon index of the data. The index is defined as

$Shannon = -sum($p{$e}*log($p{$e}))

where $p{$e} is the proportional frequency, in the range [0,1], of the element $e. The log() is the natural logarithm; to use some other base, pass the base as the argument.
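
The definition can be sketched in plain Perl as follows. The function name shannon_index is hypothetical and this is not the module's code; in particular, returning 0 for empty data and converting the base by dividing by log($base) are assumptions of the sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function: -sum(p * log(p)) over a hash
# reference of element => frequency.
sub shannon_index {
    my ($freq, $base) = @_;
    my $total = 0;
    $total += $_ for values %$freq;
    return 0 unless $total > 0;      # assumption: empty data gives 0
    my $h = 0;
    for my $n (values %$freq) {
        next unless $n > 0;          # 0 * log(0) is taken as 0
        my $p = $n / $total;         # proportional frequency in [0,1]
        $h -= $p * log($p);
    }
    # Change of base: log_b(x) = ln(x) / ln(b).
    $h /= log($base) if defined $base;
    return $h;
}

my %freq = (a => 2, b => 2);
printf "%.4f\n", shannon_index(\%freq);      # 0.6931 (natural logarithm)
printf "%.4f\n", shannon_index(\%freq, 2);   # 1.0000 (base 2)
```

With two equally frequent elements the index is log(2), which is exactly one when the base is 2.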

Evenness, returned by evenness(), measures how similar the frequencies are.

$Evenness = $Shannon / log($NumberOfDifferentElements)

When all the frequencies are equal, evenness is one. Frequency imbalance lowers the evenness value.
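
A small plain-Perl sketch of the evenness formula, using the natural logarithm throughout. The function name evenness_of is hypothetical and this is not the module's code; returning undef when there are fewer than two different elements (where log(1) would make the divisor zero) is an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function: Shannon index divided by the
# logarithm of the number of different elements.
sub evenness_of {
    my ($freq) = @_;
    my @counts = grep { $_ > 0 } values %$freq;
    my $k = @counts;                 # number of different elements
    return unless $k > 1;            # assumption: undefined below two elements
    my $total = 0;
    $total += $_ for @counts;
    my $h = 0;
    $h -= ($_ / $total) * log($_ / $total) for @counts;
    return $h / log($k);
}

printf "%.4f\n", evenness_of({ a => 5,  b => 5, c => 5 });   # 1.0000
printf "%.4f\n", evenness_of({ a => 13, b => 1, c => 1 });   # less than one
```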

$pop->add_data(@more_data);
$pop->add_data(\@more_data);
$pop->add_data(\%more_data);
$pop->add_data($another);

Add more data to the object. The arguments are as in new().

$pop->remove_data(@less_data);
$pop->remove_data(\@less_data);
$pop->remove_data(\%less_data);
$pop->remove_data($another);

Remove data from the object. The arguments are as in new(). The frequencies of data elements are capped at zero, that is, a frequency never becomes negative.
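
The effect of removal can be pictured with the following plain-Perl sketch, in which a frequency that would drop below zero is set to zero. The function name remove_counts is hypothetical and this is not the module's code; whether the module keeps or deletes an element whose frequency reaches zero is not stated here, and the sketch simply keeps it.

```perl
use strict;
use warnings;

# Hypothetical illustration: subtract the frequencies in $less from
# $freq, never letting a frequency drop below zero.
sub remove_counts {
    my ($freq, $less) = @_;
    for my $e (keys %$less) {
        next unless exists $freq->{$e};   # unknown elements are ignored
        $freq->{$e} -= $less->{$e};
        $freq->{$e} = 0 if $freq->{$e} < 0;
    }
    return $freq;
}

my %freq = (a => 3, b => 1);
remove_counts(\%freq, { a => 1, b => 5 });
print "a=$freq{a} b=$freq{b}\n";   # a=2 b=0
```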

$pop->copy_data($another);

Copy all data from another object. The old data is discarded.

$pop->clear_data();

Remove all data from the object.

The optional base given to index() and evenness() must naturally be greater than one. If not, an error like

index: base cannot be <= 1.0

will be thrown.
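
The restriction follows from the logarithm: the logarithm of one is zero, and the logarithm of a base below one is negative, so such a base cannot give a meaningful index. The sketch below shows one way such a check could look and how a caller might trap the error with eval. The function name check_base is hypothetical and this is not the module's code; only the error text is taken from above.

```perl
use strict;
use warnings;

# Hypothetical check that mirrors the documented error message.
sub check_base {
    my ($base) = @_;
    die "index: base cannot be <= 1.0\n"
        if defined $base && $base <= 1.0;
    return 1;
}

# eval traps the exception; $@ then holds the message.
my $ok = eval { check_base(0.5); 1 };
print "caught: $@" unless $ok;     # caught: index: base cannot be <= 1.0
```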

Claude Elwood Shannon is known as the father of information theory: http://www-gap.dcs.st-and.ac.uk/~history/Mathematicians/Shannon.html and http://www.bell-labs.com/news/2001/february/26/1.html

For another variability index see

For the data manipulation interface see Statistics::Frequency (though the whole interface is documented here)

Jarkko Hietaniemi <jhi@iki.fi> Copyright 2002

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
