SWISH::Filters::xls2txt - convert Excel docs to text using xls2csv
This is a plug-in module that uses the xls2csv program to convert MS Excel documents to text for indexing by Swish-e.
xls2csv is part of the catdoc package and can be downloaded from:
xls2csv must be installed and in your PATH.
This filter does not specify input or output character encodings.
A minor optimization during spidering (i.e., when documents are held in memory instead of on disk) would be to use an open2() call so that catdoc reads from stdin instead of from a file.
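The open2() idea might look roughly like the sketch below. It is not part of this filter. The helper name filter_via_stdin is hypothetical, and a perl one-liner that upper-cases its input stands in for the converter so that the example runs without catdoc installed. Whether a given converter actually accepts its document on stdin is an assumption to check against that program's own documentation.

```perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical helper: pipe an in-memory document through an external
# program that reads stdin and writes stdout, and return its output.
sub filter_via_stdin {
    my ( $cmd, $data ) = @_;    # $cmd is an array ref: program plus arguments

    # open2() returns the child's pid; the first handle reads the child's
    # stdout, the second writes to the child's stdin.
    my $pid = open2( my $from_child, my $to_child, @$cmd );

    binmode $to_child;          # documents such as .xls files are binary
    print {$to_child} $data;
    close $to_child;            # signal EOF so the child can finish

    local $/;                   # slurp mode: read all output at once
    my $text = <$from_child>;
    close $from_child;
    waitpid $pid, 0;            # reap the child process
    return $text;
}

# Stand-in converter (assumption): perl upper-casing whatever it reads.
my @stand_in = ( $^X, '-pe', '$_ = uc' );
print filter_via_stdin( \@stand_in, "hello from memory\n" );
```

Writing the whole document before reading any output is fine for small inputs, but it can deadlock if the child fills its output pipe while the parent is still writing. For large documents, a temporary file or a select()-based loop is the safer choice.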
Peter Karman email@example.com