
NAME

pdfgetext - get text from a PDF, resorting to OCR as needed

DESCRIPTION

Get all text out of a PDF, including text that appears only in images.

This is basically a command-line interface to PDF::OCR::Thorough.

OPTION FLAGS

        -f force image extraction and OCR even if pdftotext finds content
        -d turn debug output on
        -o output text file (absolute path), used instead of STDOUT

EXAMPLE USAGE

Standard usage:

        pdfgetext /home/myself/brochure.pdf

If you want to save to a text file:

        pdfgetext -o /home/myself/brochure.txt /home/myself/brochure.pdf

If you want to see extra debug info:

        pdfgetext -d /home/myself/brochure.pdf

Another way to save to a text file:

        pdfgetext /home/myself/brochure.pdf > /home/myself/output
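To process a whole directory, the -o form above can be wrapped in a small shell loop. This is only a sketch: the function name pdf_dir_to_text and the PDFGETEXT override variable are hypothetical and not part of pdfgetext itself. It assumes pdfgetext is on your PATH and relies only on the -o flag documented above.

```shell
# Hypothetical helper, not shipped with pdfgetext.
# PDFGETEXT can be overridden to point at another executable (assumption).
PDFGETEXT="${PDFGETEXT:-pdfgetext}"

pdf_dir_to_text() {
    # -o expects an absolute path, so resolve the directory first.
    dir=$(cd "$1" && pwd) || return 1
    for pdf in "$dir"/*.pdf; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$pdf" ] || continue
        # Write brochure.txt next to brochure.pdf; report failures and go on.
        "$PDFGETEXT" -o "${pdf%.pdf}.txt" "$pdf" || echo "failed: $pdf" >&2
    done
}
```

For example, pdf_dir_to_text /home/myself/brochures would write one .txt file beside each .pdf in that directory.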

SEE ALSO

PDF::OCR, PDF::OCR::Thorough, PDF::API2

AUTHOR

Leo Charre leocharre at cpan dot org