mm-xml2aw-xml.pl - This program converts MetaMap xml (mm-xml) formatted text into the all words xml (aw-xml) format.
perl mm-xml2aw-xml.pl SOURCE DESTINATION
Directory to contain temporary and log files. DEFAULT: log
Displays a quick summary of program options.
Displays the version information.
The all-words xml format is similar to the format used in the SemEval all-words disambiguation task. In this format, each term assigned one or more concepts in the MetaMap xml file is output as follows:
    <?xml version="1.0"?>
    <!DOCTYPE corpus SYSTEM "all-words.dtd">
    <corpus lang="en">
    <text id="001">
    <head id="d001.s001.t001" candidates="C1280500,C2348382">effect</head>
    of the
    <head id="d001.s001.t004" candidates="C0449238">duration</head>
    </text>
    </corpus>
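As an illustration of how one such head element could be assembled, here is a minimal Perl sketch. The helper names xml_escape and make_head are hypothetical and are not taken from mm-xml2aw-xml.pl; the sketch only assumes that a head element carries an id attribute, a comma-separated candidates attribute, and the term as its text, as in the sample above.

```perl
use strict;
use warnings;

# Hypothetical helper: escape characters that are special in XML
# text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build one <head> element from an instance id,
# the term itself, and the list of candidate concepts.
sub make_head {
    my ($id, $term, @candidates) = @_;
    return sprintf('<head id="%s" candidates="%s">%s</head>',
        xml_escape($id),
        xml_escape(join(',', @candidates)),
        xml_escape($term));
}

# Values copied from the sample above.
print make_head('d001.s001.t001', 'effect', 'C1280500', 'C2348382'), "\n";
```

Joining the candidates with a comma and no surrounding whitespace mirrors the sample; whether the real program escapes terms in the same way is not stated in this documentation.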
This format makes one addition to the regular SemEval format: the candidates attribute of each head tag lists every possible sense of the term assigned by MetaMap. The umls-allwords-senserelate.pl program uses these as the possible senses when it is run with the --candidate option. Otherwise, the senses come from a dictionary lookup in the MRCONSO table of the UMLS.
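To make the role of the candidates attribute concrete, the following Perl sketch pulls the id, term, and candidate senses out of each head element. It is an illustration only, not the parsing code used by umls-allwords-senserelate.pl. The function name extract_heads is hypothetical, and the regular expression assumes the attributes appear in the order shown in the sample (id, then candidates); a full XML parser would be more robust.

```perl
use strict;
use warnings;

# Hypothetical helper: return one [id, term, [candidates]] entry
# for each <head> element found in the given aw-xml text.
sub extract_heads {
    my ($xml) = @_;
    my @heads;
    while ($xml =~ m{<head\s+id="([^"]*)"\s+candidates="([^"]*)"\s*>(.*?)</head>}gs) {
        my ($id, $cands, $term) = ($1, $2, $3);
        # The candidates attribute is a comma-separated list of senses.
        push @heads, [ $id, $term, [ split /,/, $cands ] ];
    }
    return @heads;
}

# Sample text taken from the format description.
my $sample = '<text id="001"> '
    . '<head id="d001.s001.t001" candidates="C1280500,C2348382">effect</head>'
    . ' of the '
    . '<head id="d001.s001.t004" candidates="C0449238">duration</head>'
    . ' </text>';

for my $head (extract_heads($sample)) {
    my ($id, $term, $cands) = @$head;
    print "$id\t$term\t@$cands\n";
}
```

Each printed line holds the instance id, the term, and its space-separated candidate senses, separated by tabs.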
Bridget T. McInnes, University of Minnesota, Twin Cities
Copyright (c) 2007-2008,

 Bridget T. McInnes, University of Minnesota, Twin Cities
 bthomson at cs.umn.edu

 Ted Pedersen, University of Minnesota Duluth
 tpederse at d.umn.edu
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to
The Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.