NAME
    UMLS::Similarity CHANGES

  Changes from version 0.69 to 0.71
    1. modified cdist and nam to use findShortestPathLength

  Changes from version 0.67 to 0.69
    1. added a --precision option to spearmans.pl and set the default to
    four.

    2. updated jcn to return a -1 if there is no lcs or the IC of the lcs is
    zero

    3. modified the path based measures to use findShortestPathLength rather
    than findShortestPath

  Changes from version 0.65 to 0.67
    1. umls-similarity outputs the preferred term of a CUI rather than
    randomly picking one of the associated terms

    2. the path measures return 1 if the CUIs are the same prior to going
    out and finding the shortest path

    3. modified umls-similarity.pl to allow the --config option to be used
    with the '--measure random' option.

    4. fixed the dictfile errors for lesk and vector

    5. modified the ic measures to return a -1, if the IC of the CUIs (or
    their LCS) is 0. The idea is that there is not enough information to
    determine their similarity.

    6. modified vector and lesk to return a -1, if the definition or vector
    is empty. Again, the idea is that there is not enough information to
    determine their similarity.
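
    The -1 convention in items 5 and 6 can be illustrated with a minimal
    sketch. The function name lin_similarity and its argument order are
    hypothetical and are not taken from the package; the formula is the
    usual Lin (1998) definition, and the -1 checks are an assumption about
    how "not enough information" might be signalled.

```perl
use strict;
use warnings;

# Hypothetical helper (not the package's code): Lin similarity computed
# from precomputed information content (IC) values.
# Returns -1 when there is no LCS or when any IC is zero, meaning there
# is not enough information to score the pair.
sub lin_similarity {
    my ($ic1, $ic2, $ic_lcs) = @_;
    return -1 if !defined $ic_lcs;
    return -1 if $ic1 == 0 || $ic2 == 0 || $ic_lcs == 0;
    return (2 * $ic_lcs) / ($ic1 + $ic2);
}

printf "%.4f\n", lin_similarity(4.0, 6.0, 2.5);   # prints 0.5000
print lin_similarity(4.0, 6.0, 0), "\n";          # prints -1
```

    Returning a value outside the normal range of the measure lets a caller
    tell "no information" apart from a genuine score of zero.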

  Changes from version 0.63 to 0.65
    1. modified the documentation for the lesk and vector measure

    2. added the ST option to RELDEF

    3. fixed small warnings/errors in the modules

    4. fixed the synopsis code so they should run properly

    6. Added the total number of concepts to the icfrequency and
    icpropagation files

  Changes from version 0.61 to 0.63
    1. Modified the --dictfile option for the lesk and vector measures.

    This is a dictionary file for the lesk and vector measures. It contains
    the 'definitions' of a concept or term which would be used rather than
    the definitions from the UMLS. If you would like to use dictfile as an
    augmentation of the UMLS definitions, then use the --config option in
    conjunction with the --dictfile option.

    The expected format for the --dictfile file is:

        CUI: <definition>
        CUI: <definition>
        TERM: <definition>
        TERM: <definition>

    Keep in mind, when using this file with the --config option, if one of
    the CUIs or terms that you are obtaining the similarity for does not
    exist in the file, the vector or lesk measure will return -1.
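
    As a rough illustration of the format, the sketch below reads entries
    of the form "key: definition", one per line. The function name
    parse_dictfile_lines and the sample CUI C0000001 are made up for this
    example, and splitting on the first colon is an assumption; the
    package's own parser may behave differently.

```perl
use strict;
use warnings;

# Hypothetical parser (not the package's code) for --dictfile style input:
# one entry per line, a CUI or term, a colon, then the definition text.
sub parse_dictfile_lines {
    my (@lines) = @_;
    my %definition_for;
    for my $line (@lines) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;          # trim surrounding whitespace
        next if $line eq '';
        # Split on the first colon only; the definition may contain colons.
        my ($key, $definition) = split /\s*:\s*/, $line, 2;
        next unless defined $definition;  # skip lines without a colon
        $definition_for{$key} = $definition;
    }
    return \%definition_for;
}

my $dict = parse_dictfile_lines(
    "C0000001: a made-up definition for a made-up CUI\n",
    "hand: the terminal part of the arm\n",
);

# A key that is missing from the file is the case in which the measures
# would report -1.
for my $key (qw(C0000001 hand foot)) {
    print "$key => ", (exists $dict->{$key} ? $dict->{$key} : -1), "\n";
}
```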

  Changes from version 0.59 to 0.61
    1. modified create-icfrequency to run faster - the change made in the
    previous version really slowed things down

  Changes from version 0.57 to 0.59
    1. modified create-icfrequency to only tag CUIs to the terms/words
    in the data that exist in the set of sources/relations in the
    configuration file

    2. updated documentation - specifically the configuration options and
    propagation

  Changes from version 0.55 to 0.57
    1. added propagation files to the t/options directory

    2. revised error checking w.r.t. propagation

  Changes from version 0.53 to 0.55
    There was an errorhandler mistake when no config option was defined.
    This has been fixed.

  Changes from version 0.51 to 0.53
    1. add configuration checking for the modules

    2. check the number of tests planned for the long tests. It looks like
    they didn't get updated

    3. increment the modules by 2

    4. fix long tests

        plan skip_all => "Lengthy Tests Disabled - set UMLS_SIMILARITY_RUN_ALL to run this test suite\n"
            unless defined $ENV{UMLS_SIMILARITY_RUN_ALL}
               and $ENV{UMLS_SIMILARITY_RUN_ALL} == 1;

        rather than what I have.

    5. add --smooth option

    6. document how the smoothing is done in the perldoc

    7. create-icfrequency.pl [PLAINTEXT|DIRECTORY] ICFREQUENCY_FILE Options:
    --term --metamap

    8. create-icpropagation.pl ICFREQUENCY_FILE ICPROPAGATION_FILE Options:
    --smooth --config CONFIGURATION_FILE

    9. Note: add time stamp to count.pl output for default output file name
    in create-propagation-count.pl

    10. released the lesk measure!!!

    11. added the --stem option in the vector measure - this option is also
    available in lesk

    12. add regular expression stoplist support to vector - this option is
    also available in lesk

  Changes from version 0.49 to 0.51
     1. modified create-propagation-file.pl to be a bit more
        robust so that it accepts a file name containing
        spaces when using the --icfrequency option.

     2. added removal of the output file in create-propagation-file.t
        prior to actually running the test

     3. added a check in jcn that returns 0 if the distance is equal
        to 0 <- not certain why I didn't have that in there.

     4. add --precision option to create-propagation-file.pl

     5. add check to make certain that the .t programs can only
        be called in the main directory

     6. add a check so that the make test long environment variable
        has to equal 1 - it won't run if it equals 0

     7. remove the output directory prior to running in order to be
        certain it is removed

     8. add --precision option to create-propagation-file.pl

     9. change mkpath and rmtree to make_path() and remove_tree() in the 
        .t files

  Changes from version 0.47 to 0.49
    Note: the make test longs are not working properly on all of the
    systems.

     1. added documentation on how to create the propagation file 
        for the ic measures

     2. add a more complete configuration file in the samples directory

     3. add --stoplist in vector method

     4. modified the umls-similarity package for consistency with
        the new UMLS-Interface. We wanted to remove some of the 
        redundancy in the package which meant modifying this 
        package. The package is no longer backward compatible with 
        older versions of the UMLS-Interface package.

  Changes from version 0.45 to 0.47
    1. updated documentation

    2. added to the --dictfile option the ability to have TERMS not just
    CUIs as input.

  Changes from version 0.43 to 0.45
    1. Fixed some small errors that went out due to lesk

    2. Updated pod documentation

    3. Added README to samples/

    4. Renamed --matrixfile to --vectormatrix, --indexfile to
    --vectorindex, and --propagationfile to --icpropagation

    5. Added the --defraw option for lesk and vector. This will stop any
    cleaning that is done to the definition prior to use.

    6. Created a create-propagation-file.pl program to create a file
    containing the information content of the CUIs in a specified set of
    sources to be used by umls-similarity when using the information content
    semantic similarity measures.

    7. Created testing file vector-input.t, added a bigrams input file
    under t/tests/utils, and added index and matrix files under t/key/static

    8. Added the --defraw option to vector.pm. If this option is set it will
    NOT clean the definition; otherwise the words in the definition are
    cleaned up (i.e. punctuation is removed, text is lower cased, ...)

    9. Created testing file create-propagation-file.t
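
    The effect of --defraw described in items 5 and 8 can be pictured with
    the following sketch. The function clean_definition and its defraw
    argument are hypothetical, and the exact cleaning steps (lower casing,
    replacing punctuation, collapsing whitespace) are assumptions based on
    the description above rather than the code in vector.pm.

```perl
use strict;
use warnings;

# Hypothetical cleaner (not the package's code): returns the definition
# untouched when defraw is set, otherwise a normalized version of it.
sub clean_definition {
    my ($definition, %opt) = @_;
    return $definition if $opt{defraw};

    my $clean = lc $definition;          # lower case
    $clean =~ s/[[:punct:]]+/ /g;        # punctuation becomes a space
    $clean =~ s/\s+/ /g;                 # collapse runs of whitespace
    $clean =~ s/^\s+|\s+$//g;            # trim both ends
    return $clean;
}

my $text = "The Hand, (distal) part.";
print clean_definition($text), "\n";               # prints: the hand distal part
print clean_definition($text, defraw => 1), "\n";  # prints: The Hand, (distal) part.
```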

  Changes from version 0.41 to 0.43
    Modified documentation and cleaned up the package

    Updated vector.pm so that the vector method reads the index file and
    co-occurrence matrix specified on the command line.

    Added the --dictfile option to read the definitions from a text file.
    Each line is a definition:

        def1: the first definition
        def2: the second definition

    Added the --debugfile option. It can print the definition of each
    concept and the vector (words) of every word covered by the
    co-occurrence matrix.

    Added the vector-input.pl. This file generates the index file and
    co-occurrence matrix file from the bigrams list for vector measure use.

    Added test cases for each of the measures in the t/ directory

    Added a samples/ directory which contains examples of the configuration
    file, matrix and index files for the vector measure, and propagation
    file for the information content measures.

  Changes from version 0.39 to 0.41
    Modified documentation and cleaned up the package

  Changes from version 0.37 to 0.39
    Updated the way that the lcs and shortestpath were being computed in
    UMLS-Interface. There were some complications.

    Added the vector.pm module and the def.pm module

    Added the --measure vector option to umls-similarity.

  Changes from version 0.35 to 0.37
    I messed up when modifying the way the lcs and shortestpath information
    was obtained in UMLS-Similarity after I had changed it in
    UMLS-Interface. It should all be fixed now.

    I also removed the vector and lesk measures. We are not quite ready for
    those and are in the process of getting them together for a fresh
    release.

    I changed the Jiang and Conrath measure from jnc to jcn as it is
    supposed to be

  Changes from version 0.33 to 0.35
    1. Modified the way lcs and shortestpath information is returned by the
    UMLS-Interface. Needed to modify the UMLS-Similarity to reflect these
    changes

  Changes from version 0.31 to 0.33
    1. Modified the propagation in UMLS-Interface and needed to reflect this
    in UMLS-Similarity

  Changes from version 0.29 to 0.31
    1. Added the information content measures proposed by Resnik (1995),
    Jiang and Conrath (1997) and Lin (1998), as well as a random measure
    that returns a random number between zero and one as the similarity
    score.

    2. Added a --debug option which prints out UMLS-Interface information for
    debugging purposes
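
    For readers unfamiliar with information content (IC), the sketch below
    shows the usual definition, IC(c) = -log P(c), with P(c) estimated from
    frequency counts, and Resnik's measure as the IC of the least common
    subsumer. The function names are hypothetical, and returning 0 for a
    concept with no counts is an assumption made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper (not the package's code): information content of a
# concept from its (propagated) frequency and the total count.
sub information_content {
    my ($frequency, $total) = @_;
    # Assumption: a concept without counts carries no usable information.
    return 0 if !$total || !$frequency;
    return -log($frequency / $total);
}

# Resnik (1995): the similarity of two concepts is the IC of their LCS.
sub resnik_similarity {
    my ($ic_lcs) = @_;
    return $ic_lcs;
}

# A concept seen once in eight observations has IC = log(8).
my $ic_lcs = information_content(1, 8);
printf "%.4f\n", resnik_similarity($ic_lcs);   # prints 2.0794
```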

  Changes from version 0.21 to 0.29
    I used the following require command:

        require "UMLS::Similarity::vector"

    rather than

        require "UMLS/Similarity/vector.pm";

    Not certain what I was thinking ...
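
    The difference between the two forms can be checked with any installed
    module. The sketch below uses the core module File::Spec purely as a
    stand-in for UMLS::Similarity::vector: when require is given a string,
    Perl treats it as a file name to look up in @INC and does not translate
    "::" into a path.

```perl
use strict;
use warnings;

# String containing a package name: looked up as a file literally named
# "File::Spec", which does not exist, so require dies.
my $string_package = eval { require "File::Spec"; 1 } ? 1 : 0;

# String containing the file path: found in @INC and loaded.
my $string_file = eval { require "File/Spec.pm"; 1 } ? 1 : 0;

# Bareword package name: Perl converts it to File/Spec.pm itself.
my $bareword = eval { require File::Spec; 1 } ? 1 : 0;

print "string with package name: $string_package\n";   # prints 0
print "string with file path:    $string_file\n";      # prints 1
print "bareword package name:    $bareword\n";         # prints 1
```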

  Changes from version 0.19 to 0.21
    I fixed (for certain this time) how the modules are installed in the
    umls-similarity.pl program in the utils/ directory. It should not
    require BerkeleyDB now unless you are running the
    UMLS::Similarity::vector measure.

    For documentation purposes:

            'use' loads the package at compile time

    whereas

            'require' loads the package at run time

    Here is some documentation on it:

    <http://perldoc.perl.org/perlfaq8.html#What's-the-difference-between-require-and-use?>

    So when using 'use UMLS::Similarity::vector' - the program was loading
    vector.pm at compile time which used dbif.pm which requires BerkeleyDB
    to be installed. I switched to 'require "UMLS::Similarity::vector"'
    which now only loads vector.pm at runtime and only when you specify the
    vector measure.
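
    The general pattern of loading a measure module only when it is asked
    for might look like the sketch below. The measure-to-module table uses
    core modules (File::Basename, Text::Wrap) as stand-ins so that the
    example runs anywhere, and load_measure is a hypothetical name; none of
    this is the code in umls-similarity.pl.

```perl
use strict;
use warnings;

# Stand-in table: measure name => module implementing it. Real measure
# modules are replaced by core modules for the sake of a runnable example.
my %module_for = (
    path   => 'File::Basename',
    vector => 'Text::Wrap',
);

sub load_measure {
    my ($measure) = @_;
    my $module = $module_for{$measure}
        or die "unknown measure: $measure\n";

    # Convert Package::Name into Package/Name.pm, the form require
    # expects when it is given a string.
    (my $file = "$module.pm") =~ s{::}{/}g;

    require $file;    # runs only when this sub is called, i.e. at run time
    return $module;
}

my $loaded = load_measure('vector');
print "loaded $loaded\n";
```

    Because the require sits inside the subroutine, the dependencies of a
    measure are only needed when that measure is actually selected.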

  Changes from version 0.17 to 0.19
    The --verbose option originally in the package is changed to a --info
    option. This option prints out a little more information about a concept
    if it doesn't exist in the sources that are being used.

    The reason for this change is that a new --verbose option was added
    to UMLS-Interface and we wanted to keep the options consistent. Since I
    do not think too many people were using the old --verbose option, I
    don't *think* it will cause too many problems.

    The new --verbose option will print out path information to a file
    rather than having this be done automatically. This will reduce the
    amount of storage space required to hold the path information for a
    given set of sources and relations.

    A new --forcerun option was also added. This option will just create
    what we call the index - the path information - required by the program
    without prompting you to continue. So this will assume you know what you
    are doing and will not question you :)

    I also fixed (I hope) how the modules are installed in the
    umls-similarity.pl program in the utils/ directory. It should not
    require BerkeleyDB now unless you are running the
    UMLS::Similarity::vector measure.

  Changes from version 0.15 to 0.17
    Modified the output of umls-similarity to return only the pair of CUIs
    that obtain the highest similarity score when terms that map to multiple
    concepts are being used. I added a --allsenses option if you would like
    the original output that displayed all possible CUIs with their
    similarity score for a given pair of terms.
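
    Selecting the highest scoring pair of senses can be sketched as
    follows. The function best_sense_pair, the CUIs, and the score table
    are all made up for illustration; the similarity function is passed in
    as a code reference so that no particular measure is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not the package's code): given the candidate CUIs
# of two terms and a scoring function, return the best score and pair.
sub best_sense_pair {
    my ($cuis1, $cuis2, $similarity) = @_;
    my ($best_score, @best_pair);
    for my $c1 (@$cuis1) {
        for my $c2 (@$cuis2) {
            my $score = $similarity->($c1, $c2);
            if (!defined $best_score || $score > $best_score) {
                $best_score = $score;
                @best_pair  = ($c1, $c2);
            }
        }
    }
    return ($best_score, @best_pair);
}

# Made-up scores for made-up CUIs.
my %score = (
    'C0000001 C0000003' => 0.25,
    'C0000001 C0000004' => 0.50,
    'C0000002 C0000003' => 0.75,
    'C0000002 C0000004' => 0.10,
);
my $lookup = sub { $score{"$_[0] $_[1]"} };

my ($best, @pair) = best_sense_pair(
    [qw(C0000001 C0000002)], [qw(C0000003 C0000004)], $lookup);
print "$best @pair\n";   # prints: 0.75 C0000002 C0000003
```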

    Added the vector measure. This measure is in 'beta' version so there
    will be some modifications to it in the future.

    I also removed the print statements that were displayed when the Wu and
    Palmer (wup) measure was being used. Sorry about the noise.

  Changes from version 0.13 to 0.15
    Modified the output of umls-similarity so that if a concept doesn't
    exist you can tell which one it is. I also added a --verbose option
    which gives a little more information about the concept that doesn't
    exist.

    Modified the wup.pm module. The Wu and Palmer measure is twice the depth
    of the two concepts' LCS divided by the sum of the depths of the
    individual concepts. I was using the minimum depths of these concepts
    whereas I should have been using the depth of the path that contains
    the LCS itself.
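
    As commonly stated, the Wu and Palmer (1994) score is
    2 * depth(LCS) / (depth(c1) + depth(c2)). The sketch below computes it
    from depths that are assumed to have been measured along the path
    through the LCS. The function name wup_similarity is hypothetical, and
    returning -1 for a zero denominator is an assumption of this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not the code in wup.pm): Wu and Palmer similarity
# from the depths of the two concepts and of their LCS.
sub wup_similarity {
    my ($depth1, $depth2, $depth_lcs) = @_;
    my $denominator = $depth1 + $depth2;
    return -1 if $denominator == 0;   # assumption: no usable depth information
    return (2 * $depth_lcs) / $denominator;
}

# Concepts at depths 4 and 6 whose LCS sits at depth 3.
printf "%.2f\n", wup_similarity(4, 6, 3);   # prints 0.60
```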

  Changes from version 0.11 to 0.13
    Added two new semantic similarity measure modules: i) nam.pm which is an
    implementation of the semantic similarity measure described by Nguyen and
    Al-Mubaid, 2006 and ii) cdist.pm which is the Conceptual Distance
    measure described by Rada et al., 1989.

    Modified the umls-similarity.pl utility program to incorporate the
    nam.pm and cdist.pm similarity modules.

  Changes from version 0.09 to 0.11
    Allow the umls-similarity.pl program to obtain the semantic similarity
    between a term and CUI as well as CUI-CUI and term-term pairs.

    Added some error checking to the umls-similarity.pl program and the
    measure modules.

  Changes from version 0.07 to 0.09
    Removed the queryUMLS.pl module and put in its place umls-similarity.pl.
    This does exactly what the original queryUMLS.pl does but it also now
    accepts files and is much nicer.

  Changes from version 0.05 to 0.07
    Added the similarity measure described by Wu and Palmer (1994)

    Updated the documentation

  Changes from version 0.03 to 0.05
    Modified the Changelog directory

    Modified documentation - tried to get the misspellings and obvious errors
    removed.

    Removed the HTML documentation

  Changes from version 0.01 to 0.03
    Modified documentation

    Modified the utils/ program