Search results for "dist:Text-PDF2XML TIEDEMANN"
Text::PDF2XML - extract text from PDF files and wrap it in XML
Extract text from PDF using external tools and some post-processing heuristics. Here is an example with and without post-processing: raw: <p>PRESENTATION ET R A P P E L DES PRINCIPAUX RESULTATS 9</p> clean: <p>PRESENTATION ET RAPPEL DES PRINCIPAUX RE...
TIEDEMANN/Text-PDF2XML-0.3.3 - 11 Feb 2019 14:54:41 UTC
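The raw/clean pair in the entry above suggests one kind of clean-up: rejoining words whose letters were spaced out in the PDF layout. The listing does not say how Text::PDF2XML does this, so the following is only a minimal Python sketch of one possible heuristic. The function name `merge_spaced_letters` and the threshold of three letters are assumptions made for illustration, not part of the module.

```python
import re

# Hypothetical helper, NOT part of Text::PDF2XML.
# Matches a whitespace-delimited run of three or more single letters
# separated by single spaces, e.g. "R A P P E L".
_SPACED_RUN = re.compile(r"(?<!\S)(?:[^\W\d_] ){2,}[^\W\d_](?!\S)")


def merge_spaced_letters(text: str) -> str:
    """Join letter-spaced words by removing the spaces inside each run."""
    return _SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)


if __name__ == "__main__":
    raw = "PRESENTATION ET R A P P E L DES PRINCIPAUX RESULTATS 9"
    print(merge_spaced_letters(raw))
```

A purely pattern-based rule like this would also merge genuine sequences of single-letter tokens, so a real tool would likely need additional evidence (for example a word list) before joining.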
pdf2xml - extract text from PDF files and wrap it in XML
pdf2xml tries to combine the output of several conversion tools in order to improve the extraction of text from PDF documents. Currently, it uses pdftotext, Apache Tika and pdfxtk. In the default mode, it calls all tools to extract text and pdfxtk is...
TIEDEMANN/Text-PDF2XML-0.3.3 - 11 Feb 2019 14:54:41 UTC
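As a rough illustration of the "extract, then wrap in XML" idea described for pdf2xml, here is a minimal Python sketch that calls only pdftotext (one of the three tools named above) and wraps each paragraph in a `<p>` element. It does not reproduce how pdf2xml combines several tools. The function names and the `<document>` root element are assumptions, and `extract_with_pdftotext` requires the `pdftotext` command to be installed.

```python
import re
import subprocess
from xml.sax.saxutils import escape


def extract_with_pdftotext(pdf_path: str) -> str:
    """Run the pdftotext CLI and return its plain-text output.

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wrap_paragraphs(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> elements.

    The <document> root element is an assumption made for this sketch.
    """
    lines = ["<document>"]
    for block in re.split(r"\n\s*\n", text):
        paragraph = " ".join(block.split())  # collapse internal whitespace
        if paragraph:
            lines.append(f"<p>{escape(paragraph)}</p>")
    lines.append("</document>")
    return "\n".join(lines)
```

With pdftotext available, `wrap_paragraphs(extract_with_pdftotext("input.pdf"))` would produce a small XML document, where `input.pdf` is a placeholder path.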