The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
THESE TEST CASES ORIGINALLY BELONGS TO THE SenseTools PACKAGE 
(http://www.d.umn.edu/~tpederse/sensetools.html) DEVELOPED 
BY SATANJEEV BANERJEE AND DR. TED PEDERSEN. IT HAS BEEN 
INCLUDED IN SenseClusters DISTRIBUTION FOR CONVENIENCE REASONS.

Testing for nsp2regex.pl:
-------------------------

Satanjeev Banerjee
bane0025@d.umn.edu

2001-10-28

Ted Pedersen
tpederse@umn.edu 

2003-05-10             (changed name from bsp2regex to nsp2regex)

1. Introduction: 
----------------

We have tested nsp2regex.pl, a component of SenseTools version
0.1. Following is a description of the aspects of nsp2regex.pl
that we have tested. Also provided below is an inventory of the
various files in this directory (SenseTools-0.1/Testing/nsp2regex),
and the role of each file. We provide the scripts and files used for
testing so that later versions of preprocess.pl can be tested for
backward compatibility. 


2. Phases of Testing: 
---------------------

We have divided the testing into two main phases of testing: 

Phase 1: Testing of commandline options
Phase 2: Evaluation of execution time on big files.


2.1 Phase 1 of Testing: Testing of Commandline Options:
-------------------------------------------------------

Following are the various subtests involved in this test:

Subtest  1: Testing nsp2regex.pl without a token file on bigrams with 
	    default window size.
Subtest  2: Testing nsp2regex.pl with a token file on bigrams with
	    default window size.
Subtest  3: Testing nsp2regex.pl without a token file on bigrams with 
	    window size = 3.
Subtest  4: Testing nsp2regex.pl with a token file on bigrams with
	    window size = 3.
Subtest  5: Testing nsp2regex.pl without a token file on bigrams with 
	    window size = 10.
Subtest  6: Testing nsp2regex.pl with a token file on bigrams with
	    window size = 10.
Subtest  7: Testing nsp2regex.pl without a token file on trigrams with 
	    default window size.
Subtest  8: Testing nsp2regex.pl with a token file on trigrams with
	    default window size.
Subtest  9: Testing nsp2regex.pl without a token file on trigrams with 
	    window size = 4.
Subtest 10: Testing nsp2regex.pl with a token file on trigrams with
	    window size = 4.
Subtest 11: Testing nsp2regex.pl without a token file on trigrams with 
	    window size = 10.
Subtest 12: Testing nsp2regex.pl with a token file on trigrams with
	    window size = 10.
Subtest 13: Testing nsp2regex.pl without a token file on bigrams that
	    contain characters that need to be escaped. 
Subtest 14: Testing nsp2regex.pl with a token file on bigrams that
	    contain characters that need to be escaped.

2.1.1. Files involved for these sub-tests: 
------------------------------------------

The source file for Subtest 2i-1: sub-i.source
The required output file for Subtest 2i-1: sub-i.source.reqd

The source file for Subtest 2i: sub-i.source
The required output file for Subtest 2i: sub-i.token.reqd

1 <= i <= 7

2.2 Details of Phase 2: Evaluation of Execution Time on Big Files:
------------------------------------------------------------------

Run on following architecture: Sun Ultra 5 running SunOS 5.8. 

time nsp2regex.pl big.bigram > big.regex
13.620u 0.440s 0:15.93 88.2%	0+0k 0+0io 316pf+0w

wc output on big.bigram:

137376  412122 3154593

3. Conclusion:
--------------

The major features of nsp2regex.pl have been tested. Several types of
inputs have been used for this testing, including some "borderline"
cases. This is version 0.1... these tests can be used to check for
backward compatibility of future versions of nsp2regex.pl