Jeremy Kahn > Lingua-Treebank-0.16 > list-rewrites

NAME

  list-rewrites - reads penn treebanks, prints out all rewrites found

SYNOPSIS

  list-rewrites [options] [file ...]

  Options:
    --help        brief help message
    --man         full documentation
    --verbose     more verbose to STDERR
    --directinput allow TTY to STDIN

    --format FORMAT provide a different output format

    --terminal    include terminal expansions (default)
    --noterminal  exclude terminal expansions

Sample output

  $ echo "(S (NP (DET the) (NN dog)) (VP ran))" | ./list-rewrites
  S => NP VP
  NP => DET NN
  DET => the
  NN => dog
  VP => ran

OPTIONS

--help
-?

Show this help message.

--man

Show the manual page for this script.

--directinput

By default, if STDIN is an interactive terminal (TTY), this script prints a usage message and exits, so that users who run list-rewrites with no input see the usage message instead of a silent wait. If you really want to type trees by hand on STDIN, add the --directinput flag.
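
The check described above can be sketched in a few lines of Python. This is only an illustration, not the script's own code: the name should_show_usage and the FakeStream stand-in are hypothetical, and treating "no file arguments" as part of the condition is an assumption that the text does not state.

```python
def should_show_usage(stdin, file_args, direct_input=False):
    """Decide whether to print usage and exit instead of reading trees.

    Assumption (not confirmed by the documentation): the check only
    applies when no input files were named on the command line.
    """
    return not file_args and stdin.isatty() and not direct_input


class FakeStream:
    """Hypothetical stand-in for sys.stdin so the sketch runs anywhere."""

    def __init__(self, tty):
        self._tty = tty

    def isatty(self):
        return self._tty


# A terminal on STDIN with no files: show usage.
print(should_show_usage(FakeStream(tty=True), []))
# The same situation with --directinput: read trees from the terminal.
print(should_show_usage(FakeStream(tty=True), [], direct_input=True))
# A pipe on STDIN: read trees as usual.
print(should_show_usage(FakeStream(tty=False), []))
```

In a real program, sys.stdin would take the place of FakeStream.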

--verbose

Repeatable option. Each repetition reports more of what the script is doing, on STDERR.

--format FORMAT

Provide an alternative output format. The default is "%s => %s\n", which creates output like the example in "Sample output".
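
As a rough illustration of what a format string with two %s slots does, the Python sketch below fills the first slot with the parent label and the second with the space-joined children. It assumes the default is "%s => %s\n" (which is what the sample output implies) and that the format is applied printf-style; render and DEFAULT_FORMAT are hypothetical names, not part of list-rewrites.

```python
# Assumed default, reconstructed from the sample output "S => NP VP".
DEFAULT_FORMAT = "%s => %s\n"


def render(lhs, rhs, fmt=DEFAULT_FORMAT):
    """Fill the two %s slots: parent label, then space-joined children."""
    return fmt % (lhs, " ".join(rhs))


# Default format.
print(render("S", ["NP", "VP"]), end="")
# A tab-separated alternative, convenient for cut(1) or a spreadsheet.
print(render("S", ["NP", "VP"], fmt="%s\t%s\n"), end="")
```

How the real script treats escape sequences such as \n typed on a shell command line is not documented here, so test a custom format on a small input first.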

DESCRIPTION

This program lists every rewrite (a node's label together with the sequence of its children) in every tree it reads, whether from the files named on the command line or from STDIN.
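
To make the idea concrete, here is a minimal Python sketch that extracts rewrites from one bracketed tree. It is not the script's own implementation: parse and rewrites are hypothetical names, the parser handles only simple trees of the form (LABEL child ...) with no empty outer wrapper, and reading --noterminal as "skip rewrites whose right-hand side consists of words" is an assumption.

```python
import re


def parse(text):
    """Parse one bracketed tree into (label, children); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def rewrites(tree, terminal=True):
    """Yield (parent, child labels) for each node, parents before children."""
    label, children = tree
    is_terminal = all(isinstance(c, str) for c in children)
    # Assumed meaning of --noterminal: drop rewrites that expand to words.
    if terminal or not is_terminal:
        yield label, [c if isinstance(c, str) else c[0] for c in children]
    for child in children:
        if not isinstance(child, str):
            yield from rewrites(child, terminal)


tree = parse("(S (NP (DET the) (NN dog)) (VP ran))")
for lhs, rhs in rewrites(tree):
    print("%s => %s" % (lhs, " ".join(rhs)))
```

Run on the tree from "Sample output", the loop prints the same five lines shown there; with terminal=False only the S and NP rewrites remain.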

CAVEATS

The trees must be in Penn treebank format.

The rewrites will not necessarily be unique; if you want them to be unique, you will have to pipe the output of this program into (e.g.) sort | uniq, or into sort | uniq -c to get a count for each rewrite. This is deliberate, so that you can get counts from the output of this program as well as a survey of the rewrites in a corpus.
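
If the output is post-processed in a program rather than in the shell, the same tally can be sketched in Python. The function name count_rewrites and the sample lines are hypothetical; the input is assumed to be the script's output, one rewrite per line.

```python
from collections import Counter


def count_rewrites(lines):
    """Tally identical rewrite lines, most frequent first."""
    cleaned = (line.rstrip("\n") for line in lines if line.strip())
    return Counter(cleaned).most_common()


# Hypothetical output lines, as they might arrive from a pipe or a file.
sample = [
    "NP => DET NN\n",
    "S => NP VP\n",
    "NP => DET NN\n",
]
for rewrite, count in count_rewrites(sample):
    print("%d\t%s" % (count, rewrite))
```

This does roughly what sort | uniq -c | sort -rn does, except that rewrites with equal counts keep the order in which they were first seen.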

TO DO

None that I know of.

AUTHOR

Jeremy G. Kahn <jgk@ssli.ee.washington.edu>
