Search results for "dist:Text-PDF2XML TIEDEMANN"

Text::PDF2XML - extracts text from PDF files and wraps it in XML (river stage zero, no dependents)

Extract text from PDF using external tools and some post-processing heuristics. Here is an example with and without post-processing:
raw: <p>PRESENTATION ET R A P P E L DES PRINCIPAUX RESULTATS 9</p>
clean: <p>PRESENTATION ET RAPPEL DES PRINCIPAUX RE...

TIEDEMANN/Text-PDF2XML-0.3.3 - 11 Feb 2019 14:54:41 UTC
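
The raw/clean pair in the Text::PDF2XML description shows letter-spaced text ("R A P P E L") being merged back into a word. The following is a minimal Python sketch of one such heuristic, not the module's actual implementation: the function name `collapse_spaced_letters` and the threshold of three letters are assumptions made for illustration.

```python
import re

# A run of three or more single letters separated by single spaces,
# e.g. "R A P P E L". The minimum of three is an arbitrary, hypothetical
# threshold chosen to leave ordinary two-word pairs like "a b" alone.
_SPACED_RUN = re.compile(r"\b(?:[^\W\d_] ){2,}[^\W\d_]\b")


def collapse_spaced_letters(text: str) -> str:
    """Join letter-spaced runs into single words; leave other text as is."""
    return _SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)


raw = "PRESENTATION ET R A P P E L DES PRINCIPAUX RESULTATS 9"
print(collapse_spaced_letters(raw))
# prints: PRESENTATION ET RAPPEL DES PRINCIPAUX RESULTATS 9
```

A purely pattern-based rule like this one will also swallow genuine one-letter words that happen to sit next to a spaced run, so a real tool would likely need extra evidence (for example a word list) before merging.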

pdf2xml - extracts text from PDF files and wraps it in XML (river stage zero, no dependents)

pdf2xml tries to combine the output of several conversion tools in order to improve the extraction of text from PDF documents. Currently, it uses pdftotext, Apache Tika and pdfxtk. In the default mode, it calls all tools to extract text and pdfxtk is...

TIEDEMANN/Text-PDF2XML-0.3.3 - 11 Feb 2019 14:54:41 UTC
2 results (0.022 seconds)
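
Both entries above describe wrapping extracted text in XML, and the example in the first entry uses `<p>` elements. As a rough illustration of that step only, here is a Python sketch that turns already-extracted plain text into escaped `<p>` elements. The function name `wrap_paragraphs`, the `<document>` root element, and the blank-line paragraph rule are assumptions for this sketch and are not taken from the distribution.

```python
import re
from xml.sax.saxutils import escape


def wrap_paragraphs(text: str) -> str:
    """Wrap blank-line-separated paragraphs of plain text in <p> elements."""
    # Split on blank lines, then normalise whitespace inside each paragraph.
    blocks = re.split(r"\n\s*\n", text)
    paragraphs = [" ".join(block.split()) for block in blocks]
    # escape() replaces &, < and > so the output stays well-formed.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs if p)
    # <document> is a hypothetical root element chosen for this sketch.
    return f"<document>\n{body}\n</document>"


print(wrap_paragraphs("A & B\n\nsecond\nline"))
```

The text passed in would come from whichever extraction tool is used; how pdf2xml itself structures its output is not shown in the listing above.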