Siddharth Patwardhan > WordNet-Similarity-1.04 > treebankFreq.pl

NAME

treebankFreq.pl - Perl program for finding the frequencies of words in the Treebank corpus

SYNOPSIS

treebankFreq.pl [--compfile=COMPFILE --outfile=OUTFILE [--stopfile=STOPFILE] [--wnpath=WNPATH] [--resnik] [--smooth=SCHEME] PATH | --help | --version]

OPTIONS

--compfile=filename

    The name of a file containing the compound words (collocations) in
    WordNet

--outfile=filename

    The name of a file to which output should be written

--stopfile=filename

    A file containing a list of stop words that are excluded from the
    frequency counts.  A sample file can be downloaded from
    http://www.d.umn.edu/~tpederse/Group01/WordNet/words.txt

--wnpath=path

    Location of the WordNet data files (e.g.,
    /usr/local/WordNet-2.1/dict)

--resnik

    Use Resnik (1995) frequency counting
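
    This page does not spell out what Resnik counting changes.  The
    Python sketch below assumes the commonly described behaviour: by
    default each occurrence of a word adds 1 to every sense of that
    word, whereas Resnik counting splits the single occurrence equally
    among the senses.  That reading is an assumption, and the function
    name and sense labels are hypothetical, not taken from the script.

```python
from collections import defaultdict


def count_occurrence(freq, senses, resnik=False):
    """Record one occurrence of a word whose possible senses are `senses`.

    Assumption: plain counting adds 1 to every sense; Resnik counting
    divides the single occurrence equally among the senses.
    """
    if not senses:
        return
    increment = 1.0 / len(senses) if resnik else 1.0
    for sense in senses:
        freq[sense] += increment


# Hypothetical sense labels for a word with two noun senses.
senses = ["bank#n#1", "bank#n#2"]

plain = defaultdict(float)
resnik = defaultdict(float)
count_occurrence(plain, senses)
count_occurrence(resnik, senses, resnik=True)
```

    Under this assumption the Resnik totals for a word sum to the
    number of times it occurred, while the plain totals grow with the
    number of senses the word has.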

--smooth=SCHEME

    Specifies that smoothing should be used on the probabilities
    computed.  SCHEME can only be ADD1 at this time.
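
    As an illustration of what add-one smoothing generally means, the
    Python sketch below adds 1 to every count before normalizing, so
    that items never seen in the corpus still receive a non-zero
    probability.  It is a generic sketch of the technique, not the
    code used by treebankFreq.pl; the function name and the toy
    counts are hypothetical.

```python
from collections import Counter


def add_one_probabilities(counts, vocabulary):
    """Return add-one (Laplace) smoothed probabilities over `vocabulary`."""
    total = sum(counts.get(item, 0) for item in vocabulary)
    # Each of the len(vocabulary) items gains one extra pseudo-count.
    denominator = total + len(vocabulary)
    return {
        item: (counts.get(item, 0) + 1) / denominator
        for item in vocabulary
    }


# Toy counts: "bond" is in the vocabulary but was never observed.
counts = Counter({"bank": 3, "stock": 1})
vocabulary = ["bank", "stock", "bond"]
probs = add_one_probabilities(counts, vocabulary)
```

    Here the unseen item "bond" ends up with probability 1/7 instead
    of 0, and the smoothed probabilities still sum to 1.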

--help

    Show a help message

--version

    Display version information

PATH

    Path to the raw Wall Street Journal portion of the Treebank corpus.
    This is usually in the /raw/wsj subdirectory of the Treebank
    installation.  Thus, you might run this program as

        treebankFreq.pl [OPTIONS] /home/sid/treebank/raw/wsj