Module Version: 0.02

NAME

Algorithm::FuzzyCmeans - Perl implementation of fuzzy c-means clustering

SYNOPSIS

  use Algorithm::FuzzyCmeans;
  
  # input documents
  my %documents = (
      Alex => { 'Pop'     => 10, 'R&B'    => 6, 'Rock'   => 4 },
      Bob  => { 'Jazz'    => 8,  'Reggae' => 9                },
      Dave => { 'Classic' => 4,  'World'  => 4                },
      Ted  => { 'Jazz'    => 9,  'Metal'  => 2, 'Reggae' => 6 },
      Fred => { 'Hip-hop' => 3,  'Rock'   => 3, 'Pop'    => 3 },
      Sam  => { 'Classic' => 8,  'Rock'   => 1                },
  );
  
  my $fcm = Algorithm::FuzzyCmeans->new(
      distance_class => 'Algorithm::FuzzyCmeans::Distance::Cosine',
      m              => 2.0,
  );
  foreach my $id (keys %documents) {
      $fcm->add_document($id, $documents{$id});
  }
  
  my $num_cluster = 3;
  my $num_iter    = 20;
  $fcm->do_clustering($num_cluster, $num_iter);             
  
  # show clustering result
  foreach my $id (sort { $a cmp $b } keys %{ $fcm->memberships }) {
      printf "%s\t%s\n", $id,
          join "\t", map { sprintf "%.4f", $_ } @{ $fcm->memberships->{$id} };
  }
  # show cluster centroids
  foreach my $centroid (@{ $fcm->centroids }) {
      print join "\t", map { sprintf "%s:%.4f", $_, $centroid->{$_} }
          keys %{ $centroid };
      print "\n";
  }

DESCRIPTION

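Algorithm::FuzzyCmeans is a Perl implementation of fuzzy c-means clustering, a soft clustering method in which each document receives a degree of membership in every cluster rather than being assigned to exactly one.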

METHODS

new

Create a new instance.

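The `m' option is the fuzziness coefficient; it must be greater than 1.0 (default: 2.0). Values close to 1.0 yield nearly hard assignments, while larger values spread each document's membership more evenly across clusters.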
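To illustrate what the coefficient does, here is a minimal sketch of the textbook fuzzy c-means membership update for a single document. It is not taken from this module's source; the function name fcm_memberships and the sample distances are hypothetical, and the module's internals may differ in detail.

```perl
use strict;
use warnings;

# Textbook fuzzy c-means membership update for ONE document (illustrative,
# not this module's code).
#   $dists : array ref of distances from the document to each centroid
#   $m     : fuzziness coefficient, must be > 1.0
sub fcm_memberships {
    my ($dists, $m) = @_;
    die "m must be greater than 1.0\n" unless $m > 1.0;
    my $exp = 2.0 / ($m - 1.0);

    # A document lying exactly on one or more centroids shares its whole
    # membership among them, which also avoids division by zero.
    my @zero = grep { $dists->[$_] == 0 } 0 .. $#$dists;
    if (@zero) {
        my @u     = (0) x scalar(@$dists);
        my $share = 1 / scalar(@zero);
        $u[$_] = $share for @zero;
        return \@u;
    }

    # u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1))
    my @u;
    for my $dk (@$dists) {
        my $sum = 0;
        $sum += ($dk / $_) ** $exp for @$dists;
        push @u, 1 / $sum;
    }
    return \@u;
}

# Hypothetical distances to three centroids, evaluated for several m values.
my @d = (0.2, 0.4, 0.8);
for my $m (1.1, 2.0, 5.0) {
    printf "m=%.1f\t%s\n", $m,
        join("\t", map { sprintf "%.4f", $_ } @{ fcm_memberships(\@d, $m) });
}
```

The memberships for one document always sum to 1; raising m moves them closer to an even split across the clusters.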

The `distance_class' option names the class that provides the distance function between vectors. Currently, 'Algorithm::FuzzyCmeans::Distance::Euclid' (Euclidean distance) and 'Algorithm::FuzzyCmeans::Distance::Cosine' (cosine distance) are supported (default: cosine).
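For readers unfamiliar with the two measures, the following sketch computes both over sparse hash-reference vectors of the kind this module accepts. The standalone functions euclid_distance and cosine_distance are hypothetical and only illustrate the usual definitions; the interface of the module's own distance classes is not documented here and may differ.

```perl
use strict;
use warnings;

# Euclidean distance between two sparse vectors (hash refs);
# a feature missing from a vector counts as 0.
sub euclid_distance {
    my ($v1, $v2) = @_;
    my %seen;
    $seen{$_} = 1 for (keys %$v1, keys %$v2);
    my $sum = 0;
    for my $k (keys %seen) {
        my $diff = ($v1->{$k} // 0) - ($v2->{$k} // 0);
        $sum += $diff * $diff;
    }
    return sqrt($sum);
}

# Cosine distance, taken here as 1 - cosine similarity.
sub cosine_distance {
    my ($v1, $v2) = @_;
    my ($dot, $n1, $n2) = (0, 0, 0);
    $n1 += $_ ** 2 for values %$v1;
    $n2 += $_ ** 2 for values %$v2;
    # Assumption: an empty vector is treated as maximally distant.
    return 1 if $n1 == 0 || $n2 == 0;
    for my $k (keys %$v1) {
        $dot += $v1->{$k} * $v2->{$k} if exists $v2->{$k};
    }
    return 1 - $dot / sqrt($n1 * $n2);
}

# Two vectors from the SYNOPSIS.
my $alex = { 'Pop' => 10, 'R&B' => 6, 'Rock' => 4 };
my $fred = { 'Hip-hop' => 3, 'Rock' => 3, 'Pop' => 3 };
printf "euclid: %.4f\n", euclid_distance($alex, $fred);
printf "cosine: %.4f\n", cosine_distance($alex, $fred);
```

Cosine distance ignores vector length, so it tends to suit documents whose feature values differ in overall scale; Euclidean distance is sensitive to that scale.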

add_document($id, $vector)

Add an input document to the Algorithm::FuzzyCmeans instance. $id is the identifier of the document, and $vector is its feature vector. $vector must be a hash reference: each key identifies a feature, and each value is the degree (weight) of that feature in the document.
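As one possible way to obtain such a vector, the sketch below counts lower-cased words in a text and uses the counts as feature degrees. The helper text_to_vector is hypothetical and not part of this module, and raw term frequency is only one assumed choice of weighting.

```perl
use strict;
use warnings;

# Build a sparse feature vector (hash ref) from free text:
# key = lower-cased word, value = number of occurrences.
sub text_to_vector {
    my ($text) = @_;
    my %vector;
    $vector{ lc($_) }++ for $text =~ /(\w+)/g;
    return \%vector;
}

my $vector = text_to_vector("Jazz and reggae, then more jazz");
printf "%s => %d\n", $_, $vector->{$_} for sort keys %$vector;

# The result has the shape add_document expects, e.g.:
#   $fcm->add_document('Bob', $vector);
```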

do_clustering($num_cluster, $num_iter)

Cluster the input documents. $num_cluster specifies the number of output clusters, and $num_iter specifies the number of clustering iterations.

memberships

Accessor for the clustering result. It returns a hash reference: each key is the identifier of an input document, and each value is an array reference holding that document's degree of membership in each output cluster.
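When a single cluster per document is wanted, the memberships can be reduced to a hard assignment by picking the largest degree. The sketch below does this on a hypothetical hash shaped like the value returned by memberships; the numbers are made up for illustration, and hard_assignment is not a method of this module.

```perl
use strict;
use warnings;

# Hypothetical data in the same shape as $fcm->memberships.
my $memberships = {
    Alex => [0.10, 0.85, 0.05],
    Bob  => [0.70, 0.10, 0.20],
    Dave => [0.05, 0.15, 0.80],
};

# Map each document id to the index of its highest-membership cluster.
sub hard_assignment {
    my ($memberships) = @_;
    my %cluster_of;
    for my $id (keys %$memberships) {
        my $u    = $memberships->{$id};
        my $best = 0;
        for my $k (1 .. $#$u) {
            $best = $k if $u->[$k] > $u->[$best];
        }
        $cluster_of{$id} = $best;
    }
    return \%cluster_of;
}

my $cluster_of = hard_assignment($memberships);
printf "%s\tcluster %d\n", $_, $cluster_of->{$_} for sort keys %$cluster_of;
```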

centroids

Accessor for the cluster centroid vectors. It returns an array reference with one hash reference per cluster, keyed by feature identifier.
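For context, in the textbook algorithm a centroid is the mean of the document vectors, each weighted by its membership raised to the power m. The sketch below shows that update for one cluster; update_centroid and its arguments are hypothetical, and this module's own computation may differ (for example in normalization).

```perl
use strict;
use warnings;

# Textbook fuzzy c-means centroid update for ONE cluster (illustrative).
#   $docs    : { id => { feature => value } }
#   $weights : { id => membership of that document in this cluster }
#   $m       : fuzziness coefficient
sub update_centroid {
    my ($docs, $weights, $m) = @_;
    my %sum;
    my $total = 0;
    for my $id (keys %$docs) {
        my $w = ($weights->{$id} // 0) ** $m;
        $total += $w;
        $sum{$_} += $w * $docs->{$id}{$_} for keys %{ $docs->{$id} };
    }
    my %centroid;
    return \%centroid if $total == 0;    # no document belongs to this cluster
    $centroid{$_} = $sum{$_} / $total for keys %sum;
    return \%centroid;
}

# Two documents from the SYNOPSIS with hypothetical memberships.
my $docs = {
    Dave => { 'Classic' => 4, 'World' => 4 },
    Sam  => { 'Classic' => 8, 'Rock'  => 1 },
};
my $centroid = update_centroid($docs, { Dave => 0.5, Sam => 1.0 }, 2.0);
print join("\t", map { sprintf "%s:%.4f", $_, $centroid->{$_} } sort keys %$centroid), "\n";
```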

AUTHOR

Mizuki Fujisawa <fujisawa@bayon.cc>

SEE ALSO

Wikipedia: Fuzzy c-means clustering http://en.wikipedia.org/wiki/Cluster_Analysis#Fuzzy_c-means_clustering

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
