getIC.pl - This program returns the information content of a concept or a term.
This program takes in a CUI or a term and returns its information content.
Usage: getIC.pl [OPTION] [CUI|TERM]
Concept Unique Identifier (CUI) or a term from the Unified Medical Language System (UMLS)
Uses the intrinsic information content of the CUIs as defined by Sanchez and Batet 2011 or Seco, et al. 2004.
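As a rough illustration of what "intrinsic" means here, the sketch below computes both measures from taxonomy structure alone, with no corpus counts. It uses the formulas as they are commonly stated in the literature: Seco et al. as 1 - log(hypo(c) + 1) / log(max_nodes), and Sanchez and Batet as -log((|leaves(c)| / |subsumers(c)| + 1) / (max_leaves + 1)). The toy taxonomy, the node names, and the function names are all hypothetical, and this is not taken from the getIC.pl source, which may differ in detail.

```python
import math

# Hypothetical toy is-a taxonomy: parent -> list of children.
children = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": [],
    "a1": [],
    "a2": [],
}

# Invert the child map so ancestors can be walked.
parents = {}
for parent, kids in children.items():
    for kid in kids:
        parents.setdefault(kid, []).append(parent)


def descendants(node):
    """All concepts below node (node itself excluded)."""
    seen = set()
    stack = list(children[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children[current])
    return seen


def subsumers(node):
    """All concepts above node, plus node itself."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def seco_ic(node):
    """Seco et al. style: fewer descendants means higher IC."""
    max_nodes = len(children)
    return 1.0 - math.log(len(descendants(node)) + 1) / math.log(max_nodes)


def sanchez_ic(node):
    """Sanchez and Batet style: uses leaf descendants and subsumers."""
    all_leaves = {n for n, kids in children.items() if not kids}
    leaves = descendants(node) & all_leaves
    ratio = len(leaves) / len(subsumers(node))
    return -math.log((ratio + 1) / (len(all_leaves) + 1))
```

In this toy taxonomy the root gets an information content of zero under both measures, while the leaves get the highest values, which is the behaviour both definitions are designed to produce.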
Calculate information content using the frequency information in FILE. The file must be in the following format:
See the example files called icfrequency in the samples/ directory.
Calculate information content using the probability information in FILE. The file must be in the following format:
See the example files called icpropagation in the samples/ directory.
This is the configuration file. The format of the configuration file is as follows:
SAB :: <include|exclude> <source1, source2, ... sourceN>
REL :: <include|exclude> <relation1, relation2, ... relationN>
RELA :: <include|exclude> <rela1, rela2, ... relaN> (optional)
For example, if we wanted to use the MSH vocabulary with only the RB/RN relations, the configuration file would be:
SAB :: include MSH
REL :: include RB, RN
RELA :: include inverse_isa, isa

SAB :: include MSH
REL :: exclude PAR, CHD
If you go to the configuration file directory, there will be example configuration files for the different runs that you have performed.
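To make the line format concrete, here is a minimal sketch of how such a configuration could be read into a dictionary. The function name parse_config and the returned structure are assumptions made for illustration; this is not the parser used by getIC.pl.

```python
def parse_config(text):
    """Parse 'KEY :: include|exclude item1, item2' lines into a dict.

    Returns {KEY: (mode, [items])}. Lines without '::' are skipped.
    """
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "::" not in line:
            continue  # blank or unrecognized line
        key, _, rest = line.partition("::")
        parts = rest.split(None, 1)  # mode, then the comma-separated list
        if not parts:
            continue
        mode = parts[0]
        if mode not in ("include", "exclude"):
            raise ValueError("expected include or exclude, got: " + mode)
        items = parts[1].split(",") if len(parts) > 1 else []
        config[key.strip()] = (mode, [i.strip() for i in items if i.strip()])
    return config


example = """SAB :: include MSH
REL :: include RB, RN
RELA :: include inverse_isa, isa"""

cfg = parse_config(example)
```

With the example text above, cfg maps SAB, REL and RELA to an include mode and the listed sources or relations.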
Incorporate Laplace smoothing, where the frequency count of each concept in the taxonomy is incremented by one. The advantage of doing this is that it avoids having a concept with a probability of zero. The disadvantage is that it can shift the overall probability mass of the concepts away from what is actually seen in the corpus.
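The trade-off can be seen in a small sketch that turns frequency counts into information content as -log(p), with and without add-one smoothing. The concept labels and counts are made up, the natural logarithm is an assumption, and the sketch ignores any propagation of counts through the taxonomy, so it illustrates the smoothing idea only and not the program's actual computation.

```python
import math


def information_content(freq, smooth=False):
    """Return {concept: -log(p)}, where p is the relative frequency.

    With smooth=True every count is incremented by one (Laplace smoothing)
    before the probabilities are computed.
    """
    add = 1 if smooth else 0
    total = sum(count + add for count in freq.values())
    ic = {}
    for concept, count in freq.items():
        p = (count + add) / total if total else 0.0
        # A zero probability has no finite information content;
        # returning infinity is a choice made by this sketch.
        ic[concept] = -math.log(p) if p > 0 else float("inf")
    return ic


# Hypothetical counts; the labels are placeholders, not real CUIs.
counts = {"concept_a": 3, "concept_b": 1, "concept_c": 0}

raw = information_content(counts)
smoothed = information_content(counts, smooth=True)
```

Without smoothing, concept_c has probability zero and no finite information content. With smoothing it gets a finite value, but the probability of concept_a drops from 3/4 to 4/7, which is the shift in probability mass described above.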
This option will not create a database of the information content for all of the concepts in the set of sources and relations specified in the config file.
Takes a file of CUIs (one per line) and returns their information content.
Sets the debug flag for testing.
Username is required to access the UMLS database on MySQL unless it was specified in the my.cnf file at installation.
Password is required to access the UMLS database on MySQL unless it was specified in the my.cnf file at installation.
Hostname where MySQL is located. DEFAULT: localhost
The socket your MySQL is using. DEFAULT: /tmp/mysql.sock
Database containing the UMLS. DEFAULT: umls
Displays the quick summary of program options.
Displays the version information.
List of CUIs that are associated with the input term.
Bridget T. McInnes, University of Minnesota
Copyright (c) 2007-2009,
Bridget T. McInnes, University of Minnesota bthomson at cs.umn.edu
Ted Pedersen, University of Minnesota Duluth tpederse at d.umn.edu
Siddharth Patwardhan, University of Utah, Salt Lake City email@example.com
Serguei Pakhomov, University of Minnesota Twin Cities firstname.lastname@example.org
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to:
The Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.