NAME

Text::SenseClusters::LabelEvaluation::ConfusionMatrixTotalCalc - Module responsible for processing the decision matrix.

DESCRIPTION

This module provides two functions. The first function calculates the probability decision matrix from the scores of the original decision matrix. The second function then uses the new decision matrix to decide whether the labels are appropriately assigned.

function: printCalculatedScoreMatrix

        The following function is responsible for printing the calculated score 
        matrix from the decision matrix.

        @argument1      : outputFileHandle:    DataType(File Handle)
                                        This is the file handle that defines where the output
                                        messages/statements of this module are printed.
                                        Its default value is: STDERR.
                                         
        @argument2      : clusterNameArrayRef:          DataType(Reference_Of_Array)
                                        Reference to Array containing Cluster Name.
                                        
        @argument3      : standardTermsArrayRef:        DataType(Reference_Of_Array)  
                                        Reference to Array containing Standard terms.
                                         
        @argument4      : hashForClusterTopicScoreRef:  DataType(Reference_Of_Hash)
                                        Reference to hash containing Cluster Name, corresponding 
                                        StandardTopic and its score.
                                        
        @argument5      : topicTotalSumHashRef:  DataType(Reference_Of_Hash)
                                        Hash which will contain the total score for a topic 
                                        against each cluster.
                                        
        @argument6      : clusterTotalSumHashRef:  DataType(Reference_Of_Hash)
                                        Hash which will contain the total score for a cluster 
                                        against each topic.

        @argument7      : $isDecisionMatrixDebugOn:  DataType(number 0 or 1)
                                  Verbose: This decides whether to print detailed output or not.


        @return         : SimilarityScore
                                  This indicates the similarity score of the labels and actual
                                  topics which are correctly identified by SenseClusters 
                                  or a similar application.

        @description    :

        This function is responsible for the calculated decision matrix, which has the following form:

        Calculated Decision MATRIX:
        
                =========================================================
                                  |     Cluster0     |     Cluster1     |
                ---------------------------------------------------------
                  Bill Clinton:   |      0.478       |      0.522       |
                ---------------------------------------------------------
                ---------------------------------------------------------
                  Tony Blair:     |      0.625       |      0.375       |
                ---------------------------------------------------------
                =========================================================


         Where, 1) Cluster0, Cluster1 are Cluster Names (Column Header).
                2) Bill Clinton, Tony Blair are Standard Topics (Row Header).
                3) Cell content is the probability measure which indicates the
                   likelihood of a cluster's label against a Topic.
                            
        
         Steps:
                        1. First, it iterates through the hash '%hashForClusterTopicScore'.
                        2. It divides each cluster-topic overlap score by the total 
                           count value of the decision matrix. 
                        3. This gives the normalized score.
                        4. Based on the user's Verbose setting, it displays the normalized 
                           decision matrix.
                        5. It then calls the function 'concludingFromDecisionMatrix', 
                           which uses the normalized decision matrix to conclude 
                                        a) which cluster's labels match which Gold-Standard
                                           topic's data.
                                        b) which Gold-Standard topic's data label matches 
                                           which cluster's labels.
                        6. Finally, it compares the Clusterwise results with the Topicwise 
                           results to conclude the final cluster-topic match results along with
                           their matching scores.
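
Steps 1 to 3 above can be sketched roughly as follows. This is only an illustration, not the module's own code: the raw scores are invented, the hash is assumed to be keyed as cluster name, then topic, and the names $grandTotal and %normalizedScore are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical raw overlap scores, assumed layout: cluster => { topic => score }.
my %hashForClusterTopicScore = (
    Cluster0 => { 'Bill Clinton' => 11, 'Tony Blair' => 15 },
    Cluster1 => { 'Bill Clinton' => 12, 'Tony Blair' => 9 },
);

# Sum every cell to get the total count of the decision matrix.
my $grandTotal = 0;
for my $cluster (keys %hashForClusterTopicScore) {
    $grandTotal += $_ for values %{ $hashForClusterTopicScore{$cluster} };
}

# Divide each cell by that total to obtain the normalized score.
my %normalizedScore;
for my $cluster (sort keys %hashForClusterTopicScore) {
    for my $topic (sort keys %{ $hashForClusterTopicScore{$cluster} }) {
        # Guard against an empty matrix to avoid division by zero.
        $normalizedScore{$cluster}{$topic} = $grandTotal
            ? $hashForClusterTopicScore{$cluster}{$topic} / $grandTotal
            : 0;
        printf STDERR "%-10s %-14s %.3f\n", $cluster, $topic,
            $normalizedScore{$cluster}{$topic};
    }
}
```

Because every cell is divided by the same total, the normalized cells of the whole matrix sum to 1.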

function: concludingFromDecisionMatrix

        The following function is responsible for drawing conclusions from the 
        calculated (normalized) decision matrix, i.e. which cluster labels match which topics.

        @argument1      : hashForClusterTopicScoreRef:  DataType(Reference_Of_Hash)
                                        Reference to hash containing Cluster Name, corresponding 
                                        StandardTopic and its score.
        @argument2      : topicTotalSumHashRef:  DataType(Reference_Of_Hash)
                                        Hash which will contain the total score for a topic 
                                        against each cluster.
        @argument3      : clusterTotalSumHashRef:  DataType(Reference_Of_Hash)
                                        Hash which will contain the total score for a cluster 
                                        against each topic.
        @argument4      : directClusterTopicHashRef:  DataType(Reference_Of_Hash)
                                        HashOfHash to store the conclusion of the Direct calculation, 
                                        row-wise, i.e. a topic's (OuterKey) score against each 
                                        cluster (InnerKey).
        @argument5      : directTopicClusterHashRef:  DataType(Reference_Of_Hash)
                                        HashOfHash to store the conclusion of the Direct calculation, 
                                        column-wise, i.e. a cluster's (OuterKey) scores against 
                                        each topic (InnerKey).

        
         @return1       : directClusterTopicHashRef:  DataType(Reference_Of_Hash)
                                        HashOfHash which stores the conclusion of the calculation, 
                                        row-wise, i.e. a topic's (OuterKey) score against each 
                                        cluster (InnerKey).
         @return2       : directTopicClusterHashRef:  DataType(Reference_Of_Hash)
                                        HashOfHash which stores the conclusion of the calculation, 
                                        column-wise, i.e. a cluster's (OuterKey) scores against 
                                        each topic (InnerKey).

        @description :
        
                                        The following block of code is responsible for 
                                        1. Calculating the probabilities (normalized values) of all the 
                                                topics against a cluster. 
                                        2. Choosing the topic which has the maximum probability 
                                                (normalized value) for the given cluster.
                                        3. In the current approach, the probability (normalized 
                                                value) is calculated by dividing the similarity score of a 
                                                topic against a cluster by the total similarity score of all 
                                                the topics against all the clusters.
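
The selection in point 2 might look something like the sketch below. It is not taken from the module: the normalized values are invented, the hash layout (cluster, then topic) is an assumption, and %bestTopicForCluster is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical normalized scores, assumed layout: cluster => { topic => probability }.
my %normalizedScore = (
    Cluster0 => { 'Bill Clinton' => 0.234, 'Tony Blair' => 0.319 },
    Cluster1 => { 'Bill Clinton' => 0.255, 'Tony Blair' => 0.191 },
);

# For each cluster, remember the topic with the highest probability.
my %bestTopicForCluster;
for my $cluster (sort keys %normalizedScore) {
    my ($bestTopic, $bestScore);
    for my $topic (sort keys %{ $normalizedScore{$cluster} }) {
        my $score = $normalizedScore{$cluster}{$topic};
        # On ties the first topic in sorted order is kept.
        if (!defined $bestScore || $score > $bestScore) {
            ($bestTopic, $bestScore) = ($topic, $score);
        }
    }
    $bestTopicForCluster{$cluster} = { topic => $bestTopic, score => $bestScore };
    print "$cluster => $bestTopic ($bestScore)\n";
}
```

Visiting the topics in sorted order makes the tie-breaking deterministic, which plain hash iteration in Perl would not guarantee.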
        
         
                                         Future enhancement:
                                         4. The above approach can be done in two ways, i.e. the 
                                                direct way as well as the inverse way.
                                         5. In the direct approach, the probability is calculated by 
                                                dividing the similarity score of a topic against a 
                                                cluster by the total similarity score of all the topics 
                                                against that cluster.
                                         6. In the inverse approach, the probability is calculated by 
                                                dividing the similarity score of a topic against a 
                                                cluster by the total similarity score of all the clusters 
                                                against that topic.
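
To make the difference between the direct and inverse approaches concrete, here is a minimal sketch under stated assumptions. The raw scores are invented, the hash is assumed to be keyed as cluster, then topic, and %direct and %inverse are hypothetical names; %clusterTotalSum and %topicTotalSum mirror the totals described for the arguments above.

```perl
use strict;
use warnings;

# Hypothetical raw similarity scores, assumed layout: cluster => { topic => score }.
my %score = (
    Cluster0 => { 'Bill Clinton' => 11, 'Tony Blair' => 15 },
    Cluster1 => { 'Bill Clinton' => 12, 'Tony Blair' => 9 },
);

# Per-cluster and per-topic totals.
my (%clusterTotalSum, %topicTotalSum);
for my $cluster (keys %score) {
    for my $topic (keys %{ $score{$cluster} }) {
        $clusterTotalSum{$cluster} += $score{$cluster}{$topic};
        $topicTotalSum{$topic}     += $score{$cluster}{$topic};
    }
}

my (%direct, %inverse);
for my $cluster (sort keys %score) {
    for my $topic (sort keys %{ $score{$cluster} }) {
        my $raw = $score{$cluster}{$topic};
        # Direct: divide by the total of all topics against this cluster.
        $direct{$cluster}{$topic} = $clusterTotalSum{$cluster}
            ? $raw / $clusterTotalSum{$cluster} : 0;
        # Inverse: divide by the total of all clusters against this topic.
        $inverse{$cluster}{$topic} = $topicTotalSum{$topic}
            ? $raw / $topicTotalSum{$topic} : 0;
        printf "%-9s %-13s direct=%.3f inverse=%.3f\n",
            $cluster, $topic, $direct{$cluster}{$topic}, $inverse{$cluster}{$topic};
    }
}
```

In the direct case the values for one cluster sum to 1, while in the inverse case the values for one topic sum to 1. The raw scores here were chosen for illustration so that the inverse values (0.478 and 0.522 for Bill Clinton, 0.625 and 0.375 for Tony Blair) coincide with the figures in the sample matrix shown earlier; they are not taken from the module.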

AUTHORS

        Ted Pedersen, University of Minnesota, Duluth
        tpederse at d.umn.edu

        Anand Jha, University of Minnesota, Duluth
        jhaxx030 at d.umn.edu

COPYRIGHT AND LICENSE

See http://dev.perl.org/licenses/ for more information.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to:

        The Free Software Foundation, Inc., 59 Temple Place, Suite 330, 
        Boston, MA  02111-1307  USA
