I find myself in the awkward, although apparently not uncommon, position of being ordered to write, or at least end up with, documentation in the Word format.

I don't have anything fancy in my LaTeX source file save the title page, which I don't particularly care about. I do, however, make extensive use of the logical markup capabilities of LaTeX, using things like \servername#1 and \ipaddress#1, and acronyms via the acro package.

I have no figures that I'm not willing to export with standalone and no fancy minipage action going on --- just straight LaTeX. Pandoc seemed to be my best bet, but it simply skipped/stripped my custom commands (\servername, things from acro) from the document completely. It would be a decent solution if this were not the case.

Does there exist a preprocessor that will attempt to expand macro definitions into their appropriate 'base calls'? (I'm talking about things like \def\servername#1{\texttt{#1}}). For example,

\documentclass{article}
\newcommand{\servername}[1]{\texttt{#1}}
\begin{document}
Hi!  My server is \servername{localhost}.
\end{document}

is converted to

\documentclass{article}
\begin{document}
Hi!  My server is \texttt{localhost}.
\end{document}
Sean Allred

1 Answer

I have a "proof of concept" for you. It would need a fair bit of work to be usable.

Background: I use LaTeX to write things like nLab posts. The mathematical handling there is a simplified LaTeX syntax that doesn't allow for \newcommand (or similar). Whilst this is eminently sensible, after writing \mathbb{R} twenty times you start longing for the ability to write \newcommand\R{\mathbb{R}}. So I've been working on a LaTeX package which converts a purpose-written LaTeX document into something that the software underlying the nLab can understand.

Now the important part for your situation is the fact that the mathematical handling is a simplified LaTeX syntax. This means that if I write \mathbb{R} then I want that to go through exactly as is. But if I put \newcommand\R{\mathbb{R}} then I want \R to expand to \mathbb{R} and then for that to go through. So I have a general system whereby one can declare which macros "go through".

Adapting this to your situation would simply mean supplying the list of macros (and their arguments, also environments work) that pandoc understands.
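
To make the "go through" idea concrete, here is a minimal Python sketch of the same behaviour done as a text preprocessor. It is not the package described here and not anything pandoc provides; the function names (expand_macros and friends) are made up for illustration. It assumes macros are defined with plain \newcommand{\name}[n]{body} (no optional arguments, no \def, no escaped braces) and that every argument is written in braces. Any control sequence without a collected definition is left untouched, which plays the role of the list of macros that pandoc understands.

```python
import re

# Matches "\newcommand{\name}[n]{" or "\newcommand\name{" up to the body's opening brace.
NEWCOMMAND = re.compile(
    r"\\newcommand\*?\s*\{?\\([A-Za-z]+)\}?\s*(?:\[(\d)\])?\s*\{"
)
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)")


def read_group(text, start):
    """Return (contents, index after '}') for the brace group opening at text[start]."""
    if start >= len(text) or text[start] != "{":
        raise ValueError("expected a braced argument")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced braces")


def extract_definitions(source):
    """Collect \\newcommand definitions and return (macros, source without them)."""
    macros, kept, pos = {}, [], 0
    while True:
        m = NEWCOMMAND.search(source, pos)
        if m is None:
            kept.append(source[pos:])
            return macros, "".join(kept)
        body, end = read_group(source, m.end() - 1)
        macros[m.group(1)] = (int(m.group(2) or 0), body)
        kept.append(source[pos:m.start()])
        # Also drop the newline that ends the definition line, if there is one.
        pos = end + 1 if source[end:end + 1] == "\n" else end


def expand_once(text, macros):
    """Expand each known macro use once; unknown control words go through as-is."""
    out, pos, changed = [], 0, False
    while True:
        m = CONTROL_WORD.search(text, pos)
        if m is None:
            out.append(text[pos:])
            return "".join(out), changed
        out.append(text[pos:m.start()])
        pos = m.end()
        if m.group(1) not in macros:
            out.append(m.group(0))
            continue
        nargs, body = macros[m.group(1)]
        for i in range(1, nargs + 1):
            while pos < len(text) and text[pos].isspace():
                pos += 1
            arg, pos = read_group(text, pos)
            body = body.replace(f"#{i}", arg)
        out.append(body)
        changed = True


def expand_macros(source, max_passes=20):
    """Strip the definitions, then expand repeatedly so nested macros resolve."""
    macros, text = extract_definitions(source)
    for _ in range(max_passes):
        text, changed = expand_once(text, macros)
        if not changed:
            break
    return text
```

Calling expand_macros on the sample from the question removes the \newcommand line and turns \servername{localhost} into \texttt{localhost}; \documentclass, \begin and \texttt go through because nothing redefines them. A real tool would still need to handle \def, optional arguments and packages such as acro.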

The code is currently on GitHub, and I've just added the proof-of-concept pandoc module and your sample as pandoc_test.tex. Not very surprisingly, pdflatex pandoc_test.tex produces a PDF containing exactly the desired output (which can then be converted to text via pdftotext).

It would, however, take a bit of work to add all the macros that pandoc understands to the list.

Andrew Stacey