Ted Pedersen > Text-SenseClusters-1.05 > sval2plain.pl




sval2plain.pl - Convert a Senseval-2 data file into plain text format


 sval2plain.pl [OPTIONS] SVAL2

Note that there are 255 instances (contexts) in the Senseval-2 formatted input file.

 frequency.pl begin.v-test.xml


 <sense id="begin%2:30:00::" percent="64.31"/>
 <sense id="begin%2:30:01::" percent="14.51"/>
 <sense id="begin%2:42:04::" percent="21.18"/>
 Total Instances = 255
 Total Distinct Senses=3
 % of Majority Sense = 64.31

After converting to plain text, note that there are 255 lines in that file, one per context.

 sval2plain.pl begin.v-test.xml > begin.v-test.txt

 wc begin.v-test.txt


 255   15049   92598 begin.v-test.txt

You can find begin.v-test.xml in samples/Data

Type sval2plain.pl --help for a quick summary of the options.


Converts a file from Senseval-2 format into plain text format. Each line of the plain text file contains a single context. This is useful when you have Senseval-2 data that you would like to use as feature extraction (training) data, which must be in plain text format.
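As a rough illustration of what the conversion amounts to, the awk sketch below prints the text found between <context> and </context> lines, joining each context onto one output line. It is not the code of sval2plain.pl. The two-instance sample, its tag layout, and the file names sample.xml and sample.txt are assumptions made for the example, and any markup inside a context (such as <head>) is left untouched.

```shell
# Hypothetical two-instance sample in a Senseval-2-like layout
# (assumed for illustration, not taken from samples/Data).
cat > sample.xml <<'EOF'
<corpus lang="english">
<lexelt item="begin.v">
<instance id="begin.v.1">
<answer instance="begin.v.1" senseid="begin%2:30:00::"/>
<context>
the talks will <head>begin</head> on monday
</context>
</instance>
<instance id="begin.v.2">
<answer instance="begin.v.2" senseid="begin%2:42:04::"/>
<context>
the path <head>begins</head>
at the old gate
</context>
</instance>
</lexelt>
</corpus>
EOF

# Collect the lines between <context> and </context> and print them
# as a single line, so that each instance yields exactly one line.
awk '
  /<\/context>/ { print buf; inside = 0; next }
  /<context>/   { inside = 1; buf = ""; next }
  inside        { buf = (buf == "" ? $0 : buf " " $0) }
' sample.xml > sample.txt

cat sample.txt
```

The second sample context spans two input lines on purpose, to show that the output still has one line per context.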


Required Arguments:


SVAL2

Input file in Senseval-2 format that is to be converted into plain text format.

Optional Arguments:


--help

Displays a summary of the command line options.


--version

Displays the version information.


sval2plain.pl writes the given SVAL2 file to STDOUT in plain text format, with the contextual data of each instance on a separate line. Specifically, the i-th line of output is the context of the i-th instance in the given SVAL2 file.
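The line-per-instance correspondence can be checked by comparing the number of opening instance tags in the input with the number of lines in the output. The sketch below assumes that each instance opens with a tag beginning "<instance " on its own line. The files in.xml and out.txt are small hypothetical stand-ins, created only so that the check can be run in isolation.

```shell
# Hypothetical stand-in files: a 2-instance input and its 2-line plain text output.
printf '%s\n' '<instance id="w.1">' '<context>' 'first context' '</context>' '</instance>' \
              '<instance id="w.2">' '<context>' 'second context' '</context>' '</instance>' > in.xml
printf '%s\n' 'first context' 'second context' > out.txt

instances=$(grep -c '<instance ' in.xml)   # lines containing an opening instance tag
lines=$(wc -l < out.txt | tr -d ' ')       # number of lines in the plain text file

if [ "$instances" -eq "$lines" ]; then
  echo "OK: $instances instances, $lines lines"
else
  echo "MISMATCH: $instances instances, $lines lines"
fi
```

With real data, substitute the Senseval-2 input and the converted file for in.xml and out.txt. For begin.v-test.xml and begin.v-test.txt, both counts should be 255, matching the frequency.pl and wc figures shown earlier.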


 Ted Pedersen, University of Minnesota, Duluth
 tpederse at d.umn.edu

 Amruta Purandare, University of Pittsburgh


Copyright (c) 2002-2008, Ted Pedersen and Amruta Purandare

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to

 The Free Software Foundation, Inc.,
 59 Temple Place - Suite 330,
 Boston, MA  02111-1307, USA.