My Situation:

I am using the glossaries package to manage my acronyms. They are all neatly defined in one file, and the list of acronyms is generated automatically. But I have to confess: I was very inconsistent about using \gls{} in the main document.

  • Now, I ended up with a structured acronyms.tex:
\newacronym{rx}{RX}{receive}
[...]
  • But a chaotic main.tex (example made up):
The \gls{rx} channel is the receiving channel, 
but only if Rx-buffer has RX capacity. 
Also receive channels have \emph{rx}-flags [...]
  • That is, an acronym may appear correctly wrapped in \gls{}, as the bare short form in any capitalization, or written out in full. I want every occurrence to be wrapped in \gls{}.
  • Additionally, the letters of an acronym may occur inside other words. Those should stay unchanged.
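
To illustrate what "structured" means here: the definitions in acronyms.tex are regular enough to be read by a script. The following is only a minimal sketch, assuming every entry is a one-line \newacronym{label}{short}{long} without an optional argument or nested braces; the function name parse_acronyms is made up for this example.

```python
import re

# Assumption: entries look exactly like \newacronym{label}{short}{long},
# with no optional [..] argument and no nested braces in any argument.
NEWACRONYM = re.compile(r"\\newacronym\{([^{}]+)\}\{([^{}]+)\}\{([^{}]+)\}")


def parse_acronyms(tex_source):
    """Return {label: (short, long)} for every matching definition."""
    return {
        m.group(1): (m.group(2), m.group(3))
        for m in NEWACRONYM.finditer(tex_source)
    }


print(parse_acronyms(r"\newacronym{rx}{RX}{receive}"))
# {'rx': ('RX', 'receive')}
```

Entries that use optional arguments or braces inside the long form would need a more careful parser than this regex.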

Question:

Is there a quick solution to clean up the messed up main.tex?

Preferably automatically, using the structured data in acronyms.tex and with plain pdflatex / latexmk.

Some Ideas:

I got pretty far using find/replace with regex, but it is very tedious and I often miss edge cases.

Fully automating the process seems hard; see "Typesetting acronyms without explicitly marking them". But merely detecting occurrences that are not wrapped in \gls{}, without too many false positives, would already do the job for me.
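
The kind of detection I have in mind could look roughly like the sketch below. It only reports candidates and does not edit the file. Everything in it is an assumption for illustration: the function name find_unmarked, the shape of the acronyms dictionary (label mapped to short and long form), and the list of commands that count as "already marked". It does not handle TeX comments, verbatim material, or matches that span a line break.

```python
import re

# Occurrences directly preceded by one of these commands count as already marked.
ALREADY_MARKED = re.compile(
    r"\\(?:gls|Gls|GLS|glspl|Glspl|acrshort|acrlong|acrfull)\{$"
)


def find_unmarked(text, acronyms):
    """Return sorted (line_no, column, matched_text, label) candidates.

    acronyms maps label -> (short, long), e.g. {"rx": ("RX", "receive")}.
    """
    hits = []
    for label, (short, long_form) in acronyms.items():
        # Case-insensitive match of the short or long form. The lookarounds
        # reject a letter or digit on either side, so "rx" inside another
        # word is skipped, while hyphens, colons and braces are allowed.
        pattern = re.compile(
            rf"(?<![A-Za-z0-9])(?:{re.escape(short)}|{re.escape(long_form)})"
            r"(?![A-Za-z0-9])",
            re.IGNORECASE,
        )
        for line_no, line in enumerate(text.splitlines(), start=1):
            for m in pattern.finditer(line):
                if ALREADY_MARKED.search(line[: m.start()]):
                    continue  # e.g. the "rx" inside \gls{rx}
                hits.append((line_no, m.start() + 1, m.group(0), label))
    return sorted(hits)


sample = r"""The \gls{rx} channel is the receiving channel,
but only if Rx-buffer has RX capacity.
Also receive channels have \emph{rx}-flags"""

for hit in find_unmarked(sample, {"rx": ("RX", "receive")}):
    print(hit)
```

On the made-up example, this skips \gls{rx} and "receiving" but flags Rx, RX, receive, and the rx inside \emph{}. Each hit would still need a manual decision, since the script cannot tell whether a flagged word is really meant as the acronym.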

Skimming the acro and glossaries manuals, I only found ways to automate the list of abbreviations and the index (at 700 pages, I could very well have missed something).

There is a very similar answer, and there are others that use LuaLaTeX and XeLaTeX. So it is probably possible to hack something together in Lua/regex. Unfortunately, I only know Python...

So before I reinvent the wheel, I would like to know whether anyone has already faced this problem.

  • pdftex has essentially no access to the text, so if using pdftex a regex edit to fix your source is the only practical option. If using luatex you could use the input buffer callback to do lua string replace on the fly, but I think fixing the source would be preferable – David Carlisle Apr 07 '23 at 09:37
  • Imho searching and replacing in the source in a semi-automatic way (with regex) is the only sensible way. – Ulrike Fischer Apr 07 '23 at 09:40
  • Yes, fixing the source is much preferred, because others may need to build the project in the future. I am not sure about switching engines at the moment, so another way would be better. @UlrikeFischer I do that, but I tend to miss many edge cases, like start of line, hyphens, colons, or \emph{} around the word. Maybe I will try to write a Python script to help with that. – Paul Smith Apr 07 '23 at 09:56
  • Well, I would use some grep to get a list of "rx/RX/Rx" and go through it to handle the cases. – Ulrike Fischer Apr 07 '23 at 10:06
