README.Toolkit - SenseClusters Toolkit directory structure with links to all program documentation


This document briefly describes the structure of the Toolkit directory and summarizes what each program does. Directory names end with a / (for example, preprocess/), while program names end with the .pl suffix. All of these are contained in the Toolkit/ directory. Note that the directories are listed roughly in the order in which SenseClusters uses them.

Please review the flowcharts found in doc/Flowcharts for additional information.

preprocess/ (Text preprocessing programs)

count/ (Modify output from Text-NSP)

matrix/ (Similarity matrix constructors)

vector/ (Represent contexts as vectors to be clustered)

svd/ (SVDPACKC interface)

clusterstopping/ (Cluster Stopping program)

evaluate/ (Evaluate the results of SenseClusters by comparing to gold standard data)

clusterlabel/ (Cluster Labeling programs)


 Ted Pedersen, University of Minnesota, Duluth
 tpederse at


Copyright (c) 2003-2008, Ted Pedersen

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.

Note: a copy of the GNU Free Documentation License is available on the web at and is included in this distribution as FDL.txt.

