Benchmark::Featureset::StopwordLists - Compare various stopword list modules
#!/usr/bin/env perl

use Benchmark::Featureset::StopwordLists;

Benchmark::Featureset::StopwordLists -> new -> run;
See scripts/stopwordlists.report.pl. This outputs HTML to STDOUT.
Hint: Redirect the output of that script to your $doc_root/stopwordlists.report.html.
A copy of the report ships in html/stopwordlists.report.html.
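The redirect hint above can also be done from Perl. The following is a minimal sketch, not part of the distribution: the helper name save_report and the example target path are hypothetical, and it assumes only that the script prints its HTML to STDOUT as described above.

```perl
use strict;
use warnings;

# Run a Perl script with the current interpreter and save its STDOUT to a file.
# save_report() is a hypothetical helper, not something shipped with the module.
sub save_report {
    my ($script, $target) = @_;

    # The list form of a pipe open avoids shell quoting problems in the path.
    open(my $in, '-|', $^X, $script) or die "Cannot run $script: $!";
    my $html = do { local $/; <$in> };
    $html = '' unless defined $html;
    close($in) or die "$script exited with status $?";

    open(my $out, '>', $target) or die "Cannot write $target: $!";
    print {$out} $html;
    close($out) or die "Cannot close $target: $!";

    # Return the number of characters written, as a quick sanity check.
    return length $html;
}

# Example call; the document root below is an assumption, so substitute your own:
# save_report('scripts/stopwordlists.report.pl', '/var/www/stopwordlists.report.html');
```

A plain shell redirect does the same job; the Perl form is only useful when the report is generated from another program.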
Benchmark::Featureset::StopwordLists compares various stopword list modules.
The list of modules processed is shipped in data/module.list.ini, and can easily be edited before re-running:
shell> scripts/copy.config.pl
shell> scripts/stopwordlists.report.pl
The config stuff is explained below.
This module is available as a Unix-style distro (*.tgz).
See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.
Install Benchmark::Featureset::StopwordLists as you would for any Perl module:
sudo cpan Benchmark::Featureset::StopwordLists
or unpack the distro, and then either:
perl Build.PL
./Build
./Build test
sudo ./Build install

or:

perl Makefile.PL
make (or dmake or nmake)
make test
make install
All that remains is to tell Benchmark::Featureset::StopwordLists your values for some options.
For that, see config/.htbenchmark.featureset.stopwordlists.conf.
If you are using Build.PL, running Build (without parameters) will run scripts/copy.config.pl, as explained next.
If you are using Makefile.PL, running make (without parameters) will also run scripts/copy.config.pl.
Either way, before editing the config file, ensure you run scripts/copy.config.pl. It will copy the config file using File::HomeDir, to a directory where the run-time code in Benchmark::Featureset::StopwordLists will look for it.
shell> cd Benchmark-Featureset-StopwordLists-1.00
shell> perl scripts/copy.config.pl
Under Debian, this directory will be $HOME/.perl/Benchmark-Featureset-StopwordLists/. When you run copy.config.pl, it will report where it has copied the config file to.
Check the docs for File::HomeDir to see what your operating system returns for a call to my_dist_config().
The point of this is that after the module is installed, the config file will be easily accessible and editable without needing permission to write to the directory structure in which modules are stored.
Although this is a good mechanism for modules which ship with their own config files, be advised that some CPAN tester machines run tests as users who don't have home directories, resulting in test failures.
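To see where the config directory lives on a given machine, the my_dist_config() lookup mentioned above can be tried directly. This is a minimal sketch under stated assumptions: the helper name dist_config_dir is hypothetical, and the distribution name is taken from the Debian path quoted above. Both the module load and the call are wrapped in eval, so the helper returns nothing rather than dying when File::HomeDir is missing or the user has no home directory.

```perl
use strict;
use warnings;

# Return the per-user config directory for a distribution. Returns nothing
# when File::HomeDir is not installed, the lookup fails, or the directory
# has not been created yet.
# dist_config_dir() is a hypothetical helper, not part of the module.
sub dist_config_dir {
    my ($dist) = @_;

    # File::HomeDir is not a core module, so load it defensively.
    return unless eval { require File::HomeDir; 1 };

    # Without the create option, my_dist_config() returns undef if the
    # directory does not exist; eval guards against users with no home.
    my $dir = eval { File::HomeDir->my_dist_config($dist) };
    return $dir;
}

my $dir = dist_config_dir('Benchmark-Featureset-StopwordLists');
print defined $dir
    ? "Config directory: $dir\n"
    : "No config directory found\n";
```

If no directory is reported, running scripts/copy.config.pl first should create it and print the location it used.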
new() is called as
my($builder) = Benchmark::Featureset::StopwordLists -> new(k1 => v1, k2 => v2, ...).
It returns a new object of type Benchmark::Featureset::StopwordLists.
Key-value pairs accepted in the parameter list (see corresponding methods for details):
For use by subclasses.
Does the real work.
See scripts/stopwordlists.report.pl and its output html/stopwordlists.report.html.
Hint: Redirect the output of that script to $doc_root/stopwordlists.report.html.
Templates ship in htdocs/assets/templates/benchmark/featureset/stopwordlists/.
See also htdocs/assets/css/benchmark/featureset/stopwordlists/.
The modules were found by searching MetaCPAN.org for phrases like 'stopword' and 'stop word'.
One set of module comparison reviews, by Neil Bowers, is here.
And another set of module comparison reviews, by Ron Savage, is here.
The file Changes was converted into Changelog.ini by Module::Metadata::Changes.
Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.
Email the author, or log a bug on RT:
Benchmark::Featureset::StopwordLists was written by Ron Savage <email@example.com> in 2012.
Home page: http://savage.net.au/index.html.
Australian copyright (c) 2012, Ron Savage.
All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html