NAME
    UMLS::Similarity CHANGES

  Changes from version 1.23 to 1.25
    1. updated documentation

    2. modified umls-similarity.pl to return an error if the icfrequency
    information does not match the configuration file -- similar to what we
    do with the icpropagation option

  Changes from version 1.21 to 1.23
    1. modified query-umls-similarity-webinterface.pl to return -1 when one
    or both input terms/cuis are unknown.

  Changes from version 1.19 to 1.21
    1. added a --loadcache and --getcache option to speed things up if
    desired

  Changes from version 1.17 to 1.19
    1. added a --word option to sim2r.pl in order to use it with files that
    do not contain CUIs as the term pairs.

  Changes from version 1.15 to 1.17
    1. added the measure proposed by Zhong et al. (2002) to the
    UMLS::Similarity package. You can use it with the --measure zhong
    option in the umls-similarity.pl program. I will add this to the web
    interface shortly. If you are in a hurry though, let me know :-)

  Changes from version 1.13 to 1.15
    1. added the --original option which will return the original distance
    score between the two cuis rather than the similarity score (this is for
    nam, cdist and jcn)

    2. fixed cdist and nam to return a -1 for lack of sufficient path
    information

  Changes from version 1.11 to 1.13
    1. modified the query-umls-similarity-webinterface.pl to retry the query
    if an error is thrown due to a busy server rather than just return
    ()<>()
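
    A minimal sketch of the retry idea, not the code in
    query-umls-similarity-webinterface.pl. The names query_with_retry and
    $flaky_query are hypothetical, and the stand-in query simply fails twice
    to imitate a busy server:

```perl
use strict;
use warnings;

# Call $query (a code reference) until it returns a defined value or
# $max_tries attempts have been made. Returns undef if every attempt failed.
sub query_with_retry {
    my ($query, $max_tries, $wait_seconds) = @_;
    for my $attempt (1 .. $max_tries) {
        my $result = eval { $query->() };
        return $result if defined $result;
        # Pause before the next attempt, but not after the last one.
        sleep $wait_seconds if $wait_seconds and $attempt < $max_tries;
    }
    return undef;
}

# Hypothetical stand-in for a web query: dies twice, then answers.
my $calls = 0;
my $flaky_query = sub {
    $calls++;
    die "server busy\n" if $calls < 3;
    return "0.25";
};

my $score = query_with_retry($flaky_query, 5, 0);
print defined $score ? "$score after $calls calls\n" : "no answer\n";
```

    The number of attempts and the wait time are assumptions here; pick
    values that suit the server you are querying.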

  Changes from version 1.09 to 1.11
    1. modified the synopsis code in the measures

    2. modified the web documentation

    3. included the icpropagation and vectorfiles in the distribution for
    the web interface

    4. added a --url option to the query-umls-similarity-webinterface.pl
    program.

  Changes from version 1.07 to 1.09
    1. added the --st option to create-icfrequency and create-icpropagation
    to propagate semantic types and find their probability using the is-a
    hierarchy in the semantic network

    2. added the --st option to umls-similarity.pl for the Resnik (res)
    measure - this will return the information content of the semantic type
    of the two concepts' least common subsumer

  Changes from version 1.05 to 1.07
    1. added the query-umls-similarity-webinterface.pl program to
    automatically query the web interface

    2. added default vectormatrix and vectorindex files if they are not
    specified on the command line

    3. added default icpropagation information to the icpropagation
    similarity measures if the --icpropagation or --icfrequency options are
    not specified on the command line

    4. updated the web interface to include OMIM

  Changes from version 1.03 to 1.05
    1. modified the .pm modules and umls-similarity.pl program to account
    for the changes in the UMLS::Interface API which now passes all array
    information back by reference.

  Changes from version 1.01 to 1.03
    1. Fixed umls-similarity.pl - it was printing out all possible senses of
    a term pair if one of the terms didn't exist. Now it just prints out
    one unless the --allsenses option is used.

  Changes from version 0.99 to 1.01
    1. Added a --word option to spearmans.pl to use for input files that do
    not consist of CUIs and/or are not in the umls-similarity.pl output
    format

  Changes from version 0.97 to 0.99
    1. modified the umls_similarity_server.pl program to account for spaces
    in the preferred term when a CUI is entered

    2. modified the umls_similarity_server.pl program to include showing the
    shortest path between the concepts and their definitions if they exist
    in the view of the UMLS that is specified at the SAB.

  Changes from version 0.95 to 0.97
    1. updated the --debugfile option to print out the concept's definition
    after the --doubledef, --compoundfile, --stoplist, --defraw and --stem
    options have been applied.

    2. fixed the bug that occurred when --dictfile and --config are used
    together: if a word is not in the UMLS, the dictfile is now checked.

  Changes from version 0.93 to 0.95
    1. modified the check to see if a concept exists when using the vector
    or lesk measure - it was checking the existence based on the SAB/REL
    parameters rather than the SABDEF/RELDEF

  Changes from version 0.91 to 0.93
    1. updated the --metamap option in create-icfrequency.pl to require the
    year (version) of metamap that is being used in order to call it on the
    command line.

    2. modified the umls-similarity.pl program to obtain the CUIs of terms
    from a different function when using the lesk or vector measures (this
    is due to the config options).

  Changes from version 0.89 to 0.91
    1. modified documentation

    2. fixed an error in the output of the umls-similarity.pl program

    3. stipulated the --realtime option can only be used with the similarity
    measures and modified the tests accordingly

    4. turned back on the error checking - not certain why I had it off

    5. added error.t test to check the ErrorHandler.pm module

  Changes from version 0.87 to 0.89
    1. Added the web interface scripts

  Changes from version 0.85 to 0.87
    1. Modified the documentation in jcn.pm, vector.pm and lesk.pm

    2. update vector-input.pl

  Changes from version 0.83 to 0.85
    1. Add --doubledef documentation in vector.pm and lesk.pm

  Changes from version 0.81 to 0.83
    1. Add --doubledef in vector.pm and lesk.pm

  Changes from version 0.79 to 0.81
    1. added the --compoundfile option for vector

    2. added the --dictfile option for vector

  Changes from version 0.77 to 0.79
    1. I am testing a new measure and forgot to remove it from
    umls-similarity

  Changes from version 0.75 to 0.77
    1. there was an error in the umls-similarity.pl program. I was messing
    around with a new measure and forgot to remove its reference

    2. updated documentation

  Changes from version 0.73 to 0.75
    1. modified umls-similarity with the --compoundfile option for lesk and
    vector

    2. modified lesk and vector to accept compound words

    3. fixed a small output error in umls-similarity when no similarity is
    found between two concepts

    4. added the --compoundify option to icfrequency

  Changes from version 0.71 to 0.73
    1. added the undirected option to umls-similarity for the path measure

  Changes from version 0.69 to 0.71
    1. modified cdist and nam to use findShortestPathLength

    2. added developers-check to t/ directory

  Changes from version 0.67 to 0.69
    1. added a --precision option to spearmans.pl and set the default to
    four.

    2. updated jcn to return a -1 if there is no lcs or the IC of the lcs is
    zero

    3. modified the path based measures to use findShortestPathLength rather
    than findShortestPath

  Changes from version 0.65 to 0.67
    1. umls-similarity outputs the preferred term of a CUI rather than
    randomly picking one of the associated terms

    2. the path measures return 1 if the CUIs are the same prior to going
    out and finding the shortest path

    3. modified umls-similarity.pl to allow the --config option to be used
    with the '--measure random' option.

    4. fixed the dictfile errors for lesk and vector

    5. modified the ic measures to return a -1, if the IC of the CUIs (or
    their LCS) is 0. The idea is that there is not enough information to
    determine their similarity.

    6. modified vector and lesk to return a -1, if the definition or vector
    is empty. Again, the idea is that there is not enough information to
    determine their similarity.
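
    To illustrate the -1 convention for the ic measures, here is a minimal
    sketch using Lin's measure, 2 * IC(lcs) / (IC(c1) + IC(c2)). The function
    name lin_or_unknown is hypothetical and the exact checks in the modules
    may differ; the IC values are made up:

```perl
use strict;
use warnings;

# Lin similarity from precomputed information content (IC) values.
# Returns -1 when any IC is zero, meaning there is too little information
# to say anything about the pair (an assumption based on the note above).
sub lin_or_unknown {
    my ($ic1, $ic2, $ic_lcs) = @_;
    return -1 if $ic1 == 0 or $ic2 == 0 or $ic_lcs == 0;
    return 2 * $ic_lcs / ($ic1 + $ic2);
}

printf "%.4f\n", lin_or_unknown(4.0, 6.0, 2.5);   # prints 0.5000
print lin_or_unknown(4.0, 6.0, 0), "\n";          # prints -1
```

    A caller can then treat -1 as "unknown" rather than as "not similar".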

  Changes from version 0.63 to 0.65
    1. modified the documentation for the lesk and vector measure

    2. added the ST option to RELDEF

    3. fixed small warnings/errors in the modules

    4. fixed the synopsis code so they should run properly

    5. Added the total number of concepts to the icfrequency and
    icpropagation files

  Changes from version 0.61 to 0.63
    1. Modified the --dictfile option for the lesk and vector measures.

    This is a dictionary file for the vector measure. It contains the
    'definitions' of a concept or term which would be used rather than the
    definitions from the UMLS. If you would like to use dictfile as a
    augmentation of the UMLS definitions, then use the --config option in
    conjunction with the --dictfile option.

    The expected format for the --dictfile file is:

        CUI: <definition>
        CUI: <definition>
        TERM: <definition>
        TERM: <definition>

    Keep in mind, when using this file with the --config option, if one of
    the CUIs or terms that you are obtaining the similarity for does not
    exist in the file, the vector or lesk measure will return -1.
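
    A minimal sketch of reading such a file, assuming one "KEY: definition"
    entry per line. The function name parse_dict_lines, the CUI and the
    definitions are made up for illustration, and this is not the parser
    used by the measures:

```perl
use strict;
use warnings;

# Parse lines of the assumed form "KEY: definition" into a hash reference.
# KEY is either a CUI or a term; lines without a colon are skipped.
sub parse_dict_lines {
    my (@lines) = @_;
    my %definition_for;
    for my $line (@lines) {
        next unless $line =~ /^\s*([^:]+?)\s*:\s*(.*?)\s*$/;
        $definition_for{$1} = $2;
    }
    return \%definition_for;
}

my $dict = parse_dict_lines(
    "C0000001: an example definition for a made-up CUI",
    "example term: an example definition for a term",
);

for my $key ("C0000001", "example term", "missing term") {
    my $status = exists $dict->{$key} ? $dict->{$key} : "(not in dictfile)";
    print "$key => $status\n";
}
```

    An entry that is not in the hash corresponds to the case where the
    measure has nothing to work with and reports -1.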

  Changes from version 0.59 to 0.61
    1. modified create-icfrequency to run faster - the new change really
    slowed things down

  Changes from version 0.57 to 0.59
    1. modified the create-icfrequency to only tag CUIs to the terms/words
    in the data that exist in the set of sources/ relations in the
    configuration file

    2. updated documentation - specifically the configuration options and
    propagation

  Changes from version 0.55 to 0.57
    1. added propagation files to the t/options directory

    2. revised errorchecking w.r.t. propagation

  Changes from version 0.53 to 0.55
    There was an errorhandler mistake when no config option was defined.
    This has been fixed.

  Changes from version 0.51 to 0.53
    1. add configuration checking for the modules

    2. check the number of tests planned for the long tests. It looks like
    they didn't get updated

    3. increment the modules by 2

    4. fix long tests

        plan skip_all => "Lengthy Tests Disabled - set UMLS_SIMILARITY_RUN_ALL to run this test suite\n"
            unless defined $ENV{UMLS_SIMILARITY_RUN_ALL}
               and $ENV{UMLS_SIMILARITY_RUN_ALL} == 1;

        rather than what I have.

    5. add --smooth option

    6. document how the smoothing is done in the perldoc

    7. create-icfrequency.pl [PLAINTEXT|DIRECTORY] ICFREQUENCY_FILE Options:
    --term --metamap

    8. create-icpropagation.pl ICFREQUENCY_FILE ICPROPAGATION_FILE Options:
    --smooth --config CONFIGURATION_FILE

    9. Note: add time stamp to count.pl output for default output file name
    in create-propagation-count.pl

    10. released the lesk measure!!!

    11. added the --stem option in the vector measure - this option is also
    available in lesk

    12. add regular expression stoplist support to vector - this option is
    also available in lesk

  Changes from version 0.49 to 0.51
     1. modified the create-propagation-file.pl to be a bit
        more robust in order for it to accept a file with
        spaces when using the --icfrequency option.

     2. added removal of the output file in create-propagation-file.t
        prior to actually running the test

     3. added a check in jcn that returns 0 if the distance is equal
        to 0 <- not certain why I didn't have that in there.

     4. add --precision option to create-propagation-file.pl

     5. add check to make certain that the .t programs can only
        be called in the main directory

     6. add a check so that the make test long environment variable
        has to equal 1 - it won't run if it equals 0

     7. remove the output directory prior to running in order to be
        certain it is removed

     8. add --precision option to create-propagation-file.pl

     9. change mkpath and rmtree to make_path() and remove_tree() in the 
        .t files
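
    For reference, a small sketch of the make_path() and remove_tree()
    calls from File::Path mentioned in item 9, together with the "remove the
    output directory before running" idea from item 7. The directory names
    are made up and everything happens inside a temporary directory:

```perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use File::Spec;
use File::Temp qw(tempdir);

# Work inside a throwaway directory so nothing outside it is touched.
my $base   = tempdir(CLEANUP => 1);
my $outdir = File::Spec->catdir($base, 'output', 'nested');

# Remove any stale output first, then recreate it for the test run.
remove_tree($outdir) if -d $outdir;
make_path($outdir);                 # also creates the intermediate 'output'
my $created = -d $outdir ? 1 : 0;

# Clean up again once the test is done.
remove_tree($outdir);
my $removed = -d $outdir ? 0 : 1;

print "created=$created removed=$removed\n";
```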

  Changes from version 0.47 to 0.49
    Note: the make test longs are not working properly on all of the
    systems.

     1. added documentation on how to create the propagation file
        for the ic measures

     2. add a more complete configuration file in the samples directory

     3. add --stoplist in vector method

     4. modified the umls-similarity package for consistency with
        the new UMLS-Interface. We wanted to remove some of the
        redundancy in the package which meant modifying this
        package. The package is no longer backward compatible with
        older versions of the UMLS-Interface package.

  Changes from version 0.45 to 0.47
    1. updated documentation

    2. added to the --dictfile option the ability to have TERMS not just
    CUIs as input.

  Changes from version 0.43 to 0.45
    1. Fixed some small errors that went out due to lesk

    2. Updated pod documentation

    3. Added README to samples/

    4. Renamed --matrixfile to --vectormatrix, --indexfile to --vectorindex
    and --propagationfile to --icpropagation

    5. Added the --defraw option for lesk and vector. This will stop any
    cleaning that is done to the definition prior to use.

    6. Created a create-propagation-file.pl program to create a file
    containing the information content of the CUIs in a specified set of
    sources to be used by umls-similarity when using the information content
    semantic similarity measures.

    7. Created testing file vector-input.t; added bigrams input file under
    t/tests/utils; added index and matrix files under t/key/static

    8. Added the --defraw option to vector.pm. If this option is set it will
    NOT clean the definition otherwise the words in the definition are
    cleaned up (ie remove punctuation, lower case, ...)

    9. Created testing file create-propagation-file.t
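
    A rough sketch of what --defraw (items 5 and 8) switches off. The real
    cleaning steps live in the measure modules and are not reproduced here;
    the function name prepare_definition is hypothetical and it only covers
    the two examples named above (punctuation and lower casing):

```perl
use strict;
use warnings;

# Return the definition untouched when $defraw is true; otherwise lower-case
# it, replace punctuation with spaces and tidy up the whitespace.
sub prepare_definition {
    my ($definition, $defraw) = @_;
    return $definition if $defraw;
    my $clean = lc $definition;
    $clean =~ s/[[:punct:]]+/ /g;   # punctuation becomes a space
    $clean =~ s/\s+/ /g;            # collapse runs of whitespace
    $clean =~ s/^\s+|\s+$//g;       # trim both ends
    return $clean;
}

print prepare_definition("A disorder, of the Heart.", 0), "\n";
print prepare_definition("A disorder, of the Heart.", 1), "\n";
```

    The first call prints "a disorder of the heart" and the second prints the
    definition exactly as it was given.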

  Changes from version 0.41 to 0.43
    Modified documentation and cleaned up the package

    Updated vector.pm so that the vector method reads the index file and
    co-occurrence matrix specified on the command line.

    Added the --dictfile option to read the definitions from a text file.
    Each line is a definition:

        def1: the first definition
        def2: the second definition

    Added the --debugfile option. It can print the definition of each
    concept and the vector (words) of every word covered by the
    co-occurrence matrix.

    Added the vector-input.pl. This file generates the index file and
    co-occurrence matrix file from the bigrams list for vector measure use.

    Added test cases for each of the measures in the t/ directory

    Added a samples/ directory which contains examples of the configuration
    file, matrix and index files for the vector measure, and propagation
    file for the information content measures.

  Changes from version 0.39 to 0.41
    Modified documentation and cleaned up the package

  Changes from version 0.37 to 0.39
    Updated the way that the lcs and shortestpath was being done in
    UMLS-Interface. There were some complications.

    Added the vector.pm module and the def.pm module

    Added the --measure vector option to umls-similarity.

  Changes from version 0.35 to 0.37
    I messed up when modifying the way the lcs and shortestpath information
    was obtained in UMLS-Similarity after I had changed it in
    UMLS-Interface. It should all be fixed now.

    I also removed the vector and lesk measures. We are not quite ready for
    those and are in the process of getting them together for a fresh
    release.

    I changed the Jiang and Conrath measure from jnc to jcn like it is
    supposed to be

  Changes from version 0.33 to 0.35
    1. Modified the way lcs and shortestpath information is returned by the
    UMLS-Interface. Needed to modify the UMLS-Similarity to reflect these
    changes

  Changes from version 0.31 to 0.33
    1. Modified the propagation in UMLS-Interface and needed to reflect this
    in UMLS-Similarity

  Changes from version 0.29 to 0.31
    1. Added the information content measures proposed by Resnik (1995),
    Jiang and Conrath (1997) and Lin (1998), as well as a random measure
    that returns a random number between one and zero as the similarity
    score.

    2. Added a --debug option which prints out UMLS-Interface information
    for debugging purposes

  Changes from version 0.21 to 0.29
    I used the following require command:

        require "UMLS::Similarity::vector"

    rather than

        require "UMLS/Similarity/vector.pm";

    Not certain what I was thinking ...
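
    For anyone who runs into the same thing: when require is given a quoted
    string, Perl treats it as a file name, so the package name has to be
    turned into a path first. A small sketch, where module_to_path is a
    hypothetical helper and File::Spec (a core module) stands in for the
    measure module so that this runs anywhere:

```perl
use strict;
use warnings;

# Turn a package name into the relative file name a quoted require expects.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

print module_to_path('UMLS::Similarity::vector'), "\n";

# Quoting the package name makes require look for a file called
# "File::Spec", which does not exist, so the eval fails.
my $quoted_name_ok = eval { require "File::Spec"; 1 } ? 1 : 0;

# Passing the file path works.
my $file    = module_to_path('File::Spec');
my $path_ok = eval { require $file; 1 } ? 1 : 0;

print "quoted package name: $quoted_name_ok, file path: $path_ok\n";
```

    The first line of output is UMLS/Similarity/vector.pm. A bareword,
    as in require UMLS::Similarity::vector; (no quotes), would also work
    because Perl does the same conversion itself.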

  Changes from version 0.19 to 0.21
    I fixed (for certain this time) how the modules are installed in the
    umls-similarity.pl program in the utils/ directory. It should not
    require BerkeleyDB now unless you are running the
    UMLS::Similarity::vector measure.

    For documentation purposes:

            'use' loads the package at compile time

    whereas

            'require' loads the package at run time

    Here is some documentation on it:

    <http://perldoc.perl.org/perlfaq8.html#What's-the-difference-between-require-and-use?>

    So when using 'use UMLS::Similarity::vector' - the program was loading
    vector.pm at compile time which used dbif.pm which requires BerkeleyDB
    to be installed. I switched to 'require "UMLS::Similarity::vector"'
    which now only loads vector.pm at runtime and only when you specify the
    vector measure.
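
    A sketch of the general pattern of loading a measure module only when
    that measure is requested. This is not the code in umls-similarity.pl:
    load_measure and the lookup table are hypothetical, and two core modules
    stand in for the real measure modules so that it runs without
    UMLS::Similarity or BerkeleyDB installed:

```perl
use strict;
use warnings;

# Map measure names to module files. List::Util and Text::Wrap are only
# stand-ins for a cheap measure module and the vector module.
my %module_file_for = (
    path   => 'List/Util.pm',
    vector => 'Text/Wrap.pm',
);

# Load only the module for the requested measure, at run time.
sub load_measure {
    my ($measure) = @_;
    my $file = $module_file_for{$measure}
        or die "unknown measure: $measure\n";
    require $file;
    return $file;
}

# %INC records every file that has been loaded so far.
my $before      = exists $INC{'Text/Wrap.pm'} ? 1 : 0;
load_measure('path');
my $after_path  = exists $INC{'Text/Wrap.pm'} ? 1 : 0;
load_measure('vector');
my $after_vector = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

print "vector stand-in loaded: before=$before ",
      "after path=$after_path after vector=$after_vector\n";
```

    Asking for the path measure leaves the vector stand-in untouched; it
    is only pulled in once the vector measure itself is requested. With
    'use', both would have been loaded before the program started running.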

  Changes from version 0.17 to 0.19
    The --verbose option originally in the package is changed to a --info
    option. This option prints out a little more information about a concept
    if it doesn't exist in the sources that are being used.

    The reason for this change is because a new --verbose option was added
    to UMLS-Interface and we wanted to keep the options consistent. Since I
    do not think too many people were using the old --verbose option, I
    don't *think* it will cause too many problems.

    The new --verbose option will print out path information to a file
    rather than having this be done automatically. This will reduce the
    amount of storage space required to hold the path information for a
    given set of sources and relations.

    A new --forcerun option was also added. This option will just create
    what we call the index - the path information - required by the program
    without prompting you to continue. So this will assume you know what you
    are doing and will not question you :)

    I also fixed (I hope) how the modules are installed in the
    umls-similarity.pl program in the utils/ directory. It should not
    require BerkeleyDB now unless you are running the
    UMLS::Similarity::vector measure.

  Changes from version 0.15 to 0.17
    Modified the output of umls-similarity to return only the pair of CUIs
    that obtain the highest similarity score when terms that map to multiple
    concepts are being used. I added a --allsenses option if you would like
    the original output that displayed all possible CUIs with their
    similarity score for a given pair of terms.
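
    The default behaviour amounts to taking the maximum over all candidate
    CUI pairs. A minimal sketch with made-up CUIs and scores; the name
    best_sense_pair is hypothetical and this is not the code in
    umls-similarity:

```perl
use strict;
use warnings;

# Each candidate is [cui1, cui2, score]; return the highest-scoring one.
# With several equal top scores the first one seen is kept.
sub best_sense_pair {
    my (@candidates) = @_;
    my $best;
    for my $candidate (@candidates) {
        $best = $candidate
            if !defined $best or $candidate->[2] > $best->[2];
    }
    return $best;
}

# Made-up CUIs and scores for two terms that each map to two concepts.
my @candidates = (
    [ 'C0000001', 'C0000003', 0.20 ],
    [ 'C0000002', 'C0000003', 0.50 ],
    [ 'C0000001', 'C0000004', 0.33 ],
);

my $best = best_sense_pair(@candidates);
print "$best->[0] $best->[1] $best->[2]\n";
```

    With --allsenses every entry of @candidates would be reported instead of
    only the best one.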

    Added the vector measure. This measure is in 'beta' version so there
    will be some modifications to it in the future.

    I also removed the print statements that were displayed when the Wu and
    Palmer (wup) measure was being used. Sorry about the noise.

  Changes from version 0.13 to 0.15
    Modified the output of umls-similarity so that if a concept doesn't
    exist you can tell which one it is. I also added a --verbose option
    which gives a little more information about the concept that doesn't
    exist.

    Modified the wup.pm module. The Wu and Palmer measure is twice the depth
    of the two concepts' LCS divided by the sum of the depths of the
    individual concepts. I was using the minimum depths of these concepts
    whereas I should have been using the depth of the path that contained
    the LCS itself.
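
    As a worked example, the Wu and Palmer measure is usually written as
    2 * depth(lcs) / (depth(c1) + depth(c2)), with the concept depths
    measured along the path that passes through the LCS. The function name
    wup_score and the -1 guard are assumptions for this sketch, not a copy
    of wup.pm:

```perl
use strict;
use warnings;

# Wu and Palmer similarity from three depths. The concept depths are
# expected to be measured along the path that contains the LCS.
sub wup_score {
    my ($depth_lcs, $depth1, $depth2) = @_;
    return -1 if $depth1 + $depth2 == 0;   # assumed guard against 0/0
    return 2 * $depth_lcs / ($depth1 + $depth2);
}

# LCS at depth 3, concepts at depths 4 and 6 below the root.
printf "%.4f\n", wup_score(3, 4, 6);   # prints 0.6000
```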

  Changes from version 0.11 to 0.13
    Added two new semantic similarity measure modules: i) nam.pm which is an
    implementation of the semantic similarity measure described by Nguyen
    and Al-Mubaid, 2006 and ii) cdist.pm which is the Conceptual Distance
    measure described by Rada et al., 1989.

    Modified the umls-similarity.pl utility program to incorporate the
    nam.pm and cdist.pm similarity modules.

  Changes from version 0.09 to 0.11
    Allow the umls-similarity.pl program to obtain the semantic similarity
    between a term and CUI as well as CUI-CUI and term-term pairs.

    Added some error checking to the umls-similarity.pl program and the
    measure modules.

  Changes from version 0.07 to 0.09
    Removed the queryUMLS.pl module and put in its place umls-similarity.pl.
    This does exactly what the original queryUMLS.pl does but it also now
    accepts files and is much nicer.

  Changes from version 0.05 to 0.07
    Added the similarity measure described by Wu and Palmer (1994)

    Updated the documentation

  Changes from version 0.03 to 0.05
    Modified the Changelog directory

    Modified documentation - tried to get the misspellings and obvious
    errors removed.

    Removed the HTML documentation

  Changes from version 0.01 to 0.03
    Modified documentation

    Modified the utils/ program