How can I generate both a PDF and a plain-text version from a single .tex file? The requirements for the text version are documented below.
Input
For example, given this .tex file (texlive.net PDF generator online):
\documentclass[11pt,a4paper]{article}
% I doubt this will affect the PDF to text solution,
% but I've included it to make it as similar as possible to my real document.
\usepackage[
paperheight=11.00in,
paperwidth=8.50in,
margin=1.00in,
top=1.00in,
left=1.00in,
bottom=1.00in
]{geometry}
\usepackage[hidelinks]{hyperref}
% This part is \input from another file.
% Included inline for your convenience.
\hypersetup{
pdfinfo={
Author={tfstwbbnb},
}
}
\newcommand{\authorName}{tfstwbbnb}
% xelatex required, pdflatex does not work
% \setmainfont{Ubuntu Light}[
% ItalicFont=Ubuntu Light Italic,
% BoldFont=Ubuntu,
% BoldItalicFont=Ubuntu Italic,
% ]
\setlength\parindent{0pt}
\pagenumbering{gobble}
\usepackage{xcolor}
\newcommand{\gray}[1]{\textcolor{gray}{#1}}
\usepackage{setspace}
\setstretch{1.10}
% https://tex.stackexchange.com/a/50510
\newcommand{\fitline}[1]{\makebox[\linewidth][s]{#1}}
\newcommand{\myInnerSpacing}{0.40\baselineskip}
\hypersetup{
pdfinfo={
Title={tfstwbbnb demo},
}
}
\newcommand{\optionalOne}{optionalOne}
\newcommand{\optionalTwo}{optionalTwo}
% Links should appear as link text ("requiredOne") in text version.
\newcommand{\requiredOne}{\href{mailto:invalid@example.com}{requiredOne}}
\newcommand{\requiredTwo}{requiredTwo}
\begin{document}
% Alignment in text version does not matter to me. Can be left-justified or centered.
\begin{center}
\LARGE{\textbf{Title}}
\end{center}
\vspace{\myInnerSpacing}
\optionalOne \\
% Optionals might be commented out like so:
% \optionalTwo \\
\requiredOne \\
\gray{\requiredTwo} \\
\vspace{\myInnerSpacing}
% Text formatting should be stripped in text version.
\textbf{Lorem ipsum dolor sit amet}, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. \\
Purus semper eget duis at tellus at. Tellus cras adipiscing enim eu turpis egestas pretium aenean. \\
Felis donec, \\
tfstwbbnb
\end{document}
Expected
How can I have it output (as plain text):
Title
optionalOne
requiredOne
requiredTwo
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Purus semper eget duis at tellus at. Tellus cras adipiscing enim eu turpis egestas pretium aenean.
Felis donec,
tfstwbbnb
Copy and Paste
Opening the PDF in a viewer and copying and pasting the text gives:
TitleoptionalOnerequiredOnerequiredTwoLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt utlabore et dolore magna aliqua.Purus semper eget duis at tellus at. Tellus cras adipiscing enim eu turpis egestas pretium aenean.Felis donec,tfstwbbnb
pdftotext
pdftotext gives better results, but still not what I want (missing newlines, too many newlines, and an extra 0x0c character at the end):
Title
optionalOne
requiredOne
requiredTwo
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut
labore et dolore magna aliqua.
Purus semper eget duis at tellus at. Tellus cras adipiscing enim eu turpis egestas pretium aenean.
Felis donec,
tfstwbbnb
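The unwanted break after "ut" comes from the paragraph being wrapped to the page width before pdftotext ever sees it. One workaround raised in the comment thread is a very wide page for the text build. The following is only a minimal sketch of that idea: the \iftextbuild switch and the 100in width are my own assumptions for illustration, not part of the original file, and I have not checked how every PDF viewer copes with such a page.

```latex
% Sketch only: \iftextbuild is a hypothetical switch, not part of the original file.
\newif\iftextbuild
\textbuildtrue % comment this line out for the normal PDF build

\documentclass[11pt]{article}
\iftextbuild
  % Very wide page so that each paragraph fits on a single line.
  % 100in is an arbitrary value; it stays well under TeX's
  % maximum dimension of roughly 226in.
  \usepackage[paperwidth=100in,paperheight=11in,margin=1in]{geometry}
\else
  % Same page size as in the question.
  \usepackage[paperwidth=8.5in,paperheight=11in,margin=1in]{geometry}
\fi
\setlength\parindent{0pt}
\pagenumbering{gobble}

\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Purus semper eget duis at tellus at. Tellus cras adipiscing enim eu turpis egestas pretium aenean.
\end{document}
```

With the switch enabled, pdftotext should no longer split the first paragraph, although blank lines between paragraphs may still need cleaning up. The trailing 0x0c is the form feed that pdftotext writes at the end of each page; its -nopgbrk option suppresses it.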
PDF to text summary
Essentially, the plaintext output should be:
- All coloring (\textcolor{...}) ignored
- All font sizing (\LARGE, \small) ignored
- All links (requiredOne) displayed as text
- All explicit newlines kept (for example, between the optionalOne and requiredOne)
- All paragraphs kept (for example, between the Lorem ipsum ... and Purus semper ...)

AddToHook{something}, but I don't know where to look for that "something". – tfstwbbnb Nov 04 '22 at 20:37
texdoc ltpara-doc – David Carlisle Nov 04 '22 at 21:22
\\ to be say EOL\newline although it's harder than it should be due to the mis-used \\ in the source which are an error Underfull \hbox (badness 10000) in paragraph at lines 76--77 – David Carlisle Nov 04 '22 at 21:29
\\ to EOL\newline would help with the "ut" in the middle of a paragraph being split with a newline? Anyway, this particular problem can be solved with paperwidth=1000in. I will accept your solution shortly. – tfstwbbnb Nov 04 '22 at 22:09