Someone wants to convert PDF to TXT as explained in C# GhostScript - Not able to successfully convert from PDF to TXT file.
I am not sure whether GhostScript can do that. Is it possible?
You can use the command line tool pdftotext, which is part of the open source Xpdf project:
NAME
pdftotext - Portable Document Format (PDF) to text converter (version 3.00)
SYNOPSIS
pdftotext [options] [PDF-file [text-file]]
DESCRIPTION
Pdftotext converts Portable Document Format (PDF) files to plain text.
Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext converts
file.pdf to file.txt. If text-file is '-', the text is sent to stdout.
Further details are in the pdftotext man page.
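For anyone who would rather call pdftotext from a program than from the shell, here is a minimal sketch in Python that follows the synopsis above. It assumes pdftotext is installed and on the PATH; the function names build_pdftotext_command and pdf_to_text_bytes are my own, not part of Xpdf. The -layout switch is an existing pdftotext option that tries to keep the original physical layout.

```python
import subprocess
from typing import List, Optional


def build_pdftotext_command(pdf_file: str,
                            text_file: Optional[str] = None,
                            layout: bool = False) -> List[str]:
    """Build the argument list: pdftotext [options] [PDF-file [text-file]]."""
    command = ["pdftotext"]  # assumption: the executable is on the PATH
    if layout:
        command.append("-layout")  # keep the physical layout where possible
    command.append(pdf_file)
    if text_file is not None:
        # Without a text-file argument, pdftotext writes file.txt next to file.pdf.
        command.append(text_file)
    return command


def pdf_to_text_bytes(pdf_file: str, layout: bool = False) -> bytes:
    """Run pdftotext with '-' as text-file so the text is sent to stdout."""
    command = build_pdftotext_command(pdf_file, "-", layout)
    result = subprocess.run(command, check=True, capture_output=True)
    # Raw bytes are returned because the output encoding depends on the
    # pdftotext build and its configuration.
    return result.stdout
```

Calling pdf_to_text_bytes("file.pdf") should return the extracted text, and subprocess raises CalledProcessError if pdftotext exits with an error.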
Yes, it is absolutely possible with GhostScript, using the following batch file:
rem batch.bat
rem %1 represents input file name without extension.
@echo off
gswin32c -q -dNODISPLAY -dSAFER -dDELAYBIND -dWRITESYSTEMDICT -dSIMPLE -c save -f ps2ascii.ps %1.pdf -c quit >%1.txt
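If you need to drive this from a program instead of a batch file, the same invocation can be sketched in Python. This only mirrors the command line above: it assumes gswin32c is on the PATH and that GhostScript can find ps2ascii.ps in its library path. The function names are hypothetical.

```python
import subprocess
from typing import List


def build_ps2ascii_command(base_name: str,
                           gs_executable: str = "gswin32c") -> List[str]:
    """Build the same GhostScript call as the batch file, without the redirect."""
    return [
        gs_executable, "-q", "-dNODISPLAY", "-dSAFER", "-dDELAYBIND",
        "-dWRITESYSTEMDICT", "-dSIMPLE",
        "-c", "save",
        "-f", "ps2ascii.ps", base_name + ".pdf",
        "-c", "quit",
    ]


def convert_pdf_to_txt(base_name: str, gs_executable: str = "gswin32c") -> None:
    """Write GhostScript's stdout to base_name.txt, like '>%1.txt' does."""
    with open(base_name + ".txt", "wb") as text_file:
        subprocess.run(build_ps2ascii_command(base_name, gs_executable),
                       stdout=text_file, check=True)
```

As with the batch file, base_name is the input file name without its extension, so convert_pdf_to_txt("report") reads report.pdf and writes report.txt.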
-layout option. You might want to disable page numbers etc. (i.e. \pagestyle{empty}). – Martin Scharrer Jul 15 '11 at 16:50