*******************************************************************************
README.txt FOR Testing wordvec.pl
Version 0.3
Copyright (C) 2002-2004
Ted Pedersen, tpederse@umn.edu
Amruta Purandare amruta@cs.pitt.edu
University of Minnesota, Duluth
*******************************************************************************
Testing for wordvec.pl
------------------------
AMRUTA PURANDARE
amruta@cs.pitt.edu
05/31/2004
1. Introduction:
----------------
This program is a component of the SenseClusters package that constructs
word vectors. The scripts and files provided here could be used to test
the correct behaviour of the program and backward compatibility.
2. Tests:
----------
2.1 Normal conditions:
----------------------
Tests written in testA*.sh test wordvec.pl under normal conditions.
Tests A1-A10 test wordvec when the feature file does not exist and is to be
automatically created by wordvec, while tests A11-20 run the same tests as
A1-A10 but when the features file is provided by the user.
Test A1 :
Test A11 : Tests wordvec when input is created by combig
Test A2 :
Test A12 : Tests wordvec when input is created by count
Test A3 :
Test A13 : Tests wordvec when input is created by statistic
Test A4 :
Test A14 : Tests wordvec when bigrams include punctuations like
period, comma
Test A5 :
Test A15 : Tests wordvec on Hindi transliterated data
Test A6 :
Test A16 : Tests wordvec when each token in a bigram is a word pair
Test A7 :
Test A17 : Tests wordvec on data containing phone nos and email ids
Test A8 :
Test A18 : Tests wordvec's --binary option
Test A9 :
Test A19 : Tests --extarget option in wordvec
Test A10 :
Test A20 : Simple test added after adding sparse support. Uses sample
bigrams from Serve data
Each of the above tests actually runs several tests that test options
--wordorder and --dense internally within the test. Expected test results
that end with
1. test-A*a*.reqd - run wordvec with --wordorder = follow
2. test-A*b*.reqd - run wordvec with --wordorder = precede
3. test-A*c*.reqd - run wordvec with --wordorder = nocare
4. test-A*1.reqd - run wordvec with --dense
5. test-A*2.reqd - run wordvec without --dense
2.2 Error conditions:
----------------------
Tests written in testB*.sh test wordvec.pl under error conditions.
Test B1: Tests wordvec under the floating point over/under flow errors.
3. Conclusions:
---------------
We have tested program wordvec.pl enough to conclude that it runs correctly.
We have also provided the test scripts so that future versions of
wordvec.pl can be compared to the current version against these scripts.