NAME

Statistics::ChisqIndep - Perform the chi-square test of independence on contingency tables

SYNOPSIS

 #example for Statistics::ChisqIndep
 use strict;
 use Statistics::ChisqIndep;
 use POSIX;
 # input data as an array of array references, one inner array per row
 my @obs = ([15, 68, 83], [23,47,65]);
 my $chi = Statistics::ChisqIndep->new;
 $chi->load_data(\@obs);
 # print the summary data along with the contingency table
 $chi->print_summary();  
 #print the contingency table only
 $chi->print_contingency_table(); 
 # the code below produces the same output as calling print_summary();
 # all of the detailed info, such as the expected values, degrees of freedom
 # and totals, is accessible through the fields of the object
 # check whether the load_data() call was successful
 if($chi->{valid}) {  
   print "Rows: ", $chi->{rows}, "\n"; 
   print "Columns: ", $chi->{cols}, "\n"; 
   print "Degree of Freedom: ", $chi->{df}, "\n";
   print "Total Count: ", $chi->{total}, "\n";
   print "Chi-square Statistic: ", 
         $chi->{chisq_statistic}, "\n";
   print "p-value: ", $chi->{p_value}, "\n";
   print "Warning: some of the cell counts might be too low.\n"
     if ($chi->{warning});
   #output the contingency table
   my $rows = $chi->{rows};  # number of rows
   my $cols = $chi->{cols};  # number of columns
   my $obs = $chi->{obs}; # observed values 
   my $exp = $chi->{expected}; # expected values 
   my $rtotals = $chi->{rtotals}; # row totals 
   my $ctotals = $chi->{ctotals}; #column totals 
   my $total = $chi->{total}; # total counts
   for (my $j = 0; $j < $cols; $j++) {
     print "\t",$j + 1;
   }
   print "\trtotal\n"; 
   for (my $i = 0; $i < $rows; $i ++) {
     print $i + 1, "\t"; 
     for(my $j = 0 ; $j < $cols; $j ++) {
      #observed values can be accessed
      #in the following way 
      print $obs->[$i]->[$j], "\t";  
     }
     #row totals can be accessed
     # in the following way
     print $rtotals->[$i], "\n";
     print "\t";
     for(my $j = 0 ; $j < $cols; $j ++) {
      #expected values can be accessed 
      #in the following way
      printf "(%.2f)\t", $exp->[$i]->[$j];
     }
     print "\n"; 
   }
   print "ctotal\t";
   for (my $j = 0; $j < $cols; $j++) {
     #column totals can be accessed in the following way
     print $ctotals->[$j], "\t";
   }
   #output total counts
   print $total, "\n";
 }

DESCRIPTION

 This module performs Pearson's chi-squared test of independence on two-dimensional contingency tables. The user supplies the observed counts in table form, and the module computes the expected value of each cell under the independence hypothesis. It then computes the chi-square statistic and the corresponding p-value from the observed and expected values, to test whether the two dimensions are independent.
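
 The computation described above can be sketched in plain Perl as follows. This is a minimal illustration of the standard Pearson calculation, not the module's actual source: the helper name chisq_by_hand is hypothetical, and the sketch assumes a rectangular table whose row and column totals are all nonzero.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Statistics::ChisqIndep: computes the
# expected counts, the chi-square statistic and the degrees of freedom.
# Assumes a rectangular table with nonzero row and column totals.
sub chisq_by_hand {
    my ($obs) = @_;                  # reference to an array of row arrays
    my $rows = scalar @$obs;
    my $cols = scalar @{ $obs->[0] };

    # Accumulate row totals, column totals and the grand total.
    my (@rtotals, @ctotals);
    my $total = 0;
    for my $i (0 .. $rows - 1) {
        for my $j (0 .. $cols - 1) {
            $rtotals[$i] += $obs->[$i][$j];
            $ctotals[$j] += $obs->[$i][$j];
            $total       += $obs->[$i][$j];
        }
    }

    # Under independence: expected = row total * column total / grand total.
    # The statistic sums (observed - expected)^2 / expected over all cells.
    my @expected;
    my $chisq = 0;
    for my $i (0 .. $rows - 1) {
        for my $j (0 .. $cols - 1) {
            my $e = $rtotals[$i] * $ctotals[$j] / $total;
            $expected[$i][$j] = $e;
            $chisq += ($obs->[$i][$j] - $e) ** 2 / $e;
        }
    }

    return {
        expected        => \@expected,
        chisq_statistic => $chisq,
        df              => ($rows - 1) * ($cols - 1),
    };
}

# Same observed table as in the SYNOPSIS.
my $result = chisq_by_hand([ [15, 68, 83], [23, 47, 65] ]);
printf "Chi-square statistic: %.4f, df: %d\n",
    $result->{chisq_statistic}, $result->{df};
```

 The p-value is the upper-tail probability of the chi-square distribution with df degrees of freedom, evaluated at the statistic. The sketch leaves that step out to stay dependency-free.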

AUTHOR

 Yun-Fang Juan, Yahoo! Inc.
 yunfang@yahoo-inc.com 
 yunfangjuan@yahoo.com

SEE ALSO

Statistics::Distributions