
If you need to share a PDF document containing non-disclosable images, you can use the draft option:

\documentclass[draft]{book}
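
As a minimal sketch of that image case: it assumes the graphicx package and the placeholder file example-image (shipped with the mwe package), neither of which is named above. In draft mode graphicx typesets a frame of the same size in place of the picture and prints the file name inside it, so the file name itself remains visible.

```latex
\documentclass[draft]{book}
\usepackage{graphicx} % picks up the global draft option
\begin{document}
% Assumption: example-image is only a stand-in for a real figure.
% With draft active, a framed box of the same size is printed instead.
\includegraphics[width=.5\linewidth]{example-image}
\end{document}
```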

Is it possible to do something similar for text? For debugging purposes I have to share a PDF containing non-disclosable text. Any method would be fine (scrambling letters, covering letters with boxes, provided that the underlying text cannot be copied, etc.), as long as the layout is (almost) identical to the regular PDF.

I'm looking for a solution working with XeLaTeX documents.

mmj
  • Covering with boxes would not be enough: it would still be trivial to extract the text from the PDF. You need to redact the text, replacing it by boxes. This would be much easier in LuaTeX than in XeTeX; do you really have a requirement to use XeTeX? – David Carlisle Jan 08 '24 at 21:01
  • @DavidCarlisle The PDF I need to obfuscate is produced from a XeLaTeX source, so I would say that XeLaTeX is a requirement, at least unless I convert it to LuaTeX (which I would not know how to do). – mmj Jan 08 '24 at 21:06
  • most xelatex documents work unchanged in lualatex, unless you are using xetex character classes – David Carlisle Jan 08 '24 at 21:11
  • The safest thing to do is just make the PDF as usual, then redact the text using a PDF editor such as Acrobat Pro. – David Carlisle Jan 08 '24 at 21:12
  • "as long as the layout is identical to the regular PDF". This is the difficult part. Any obfuscation will have a hard time hyphenating at line ends in a manner identical to the original text. The censor package can do some of what you ask, but not this central point you require. – Steven B. Segletes Jan 09 '24 at 01:01
  • @StevenB.Segletes By identical I don't mean at the hyphenation level; it is sufficient that differences in the text do not badly affect the positioning of images and sectioning. I tried the censor package by wrapping a chapter between \blackoutenv and \endblackoutenv, but I get the error pbox.sty not found. – mmj Jan 09 '24 at 08:37
  • @mmj https://ctan.org/pkg/pbox – Steven B. Segletes Jan 09 '24 at 15:16
  • Take @DavidCarlisle's advice. Your best bet is to process normally with XeLaTeX, then load the PDF into Adobe Acrobat Pro and redact it. The typography will remain the same, even though it is now "produced" by AAP. – rallg Jan 09 '24 at 16:16
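
The censor approach mentioned in the comments can be sketched as follows. This is a rough illustration, not a tested solution for a whole book: it assumes censor and its dependency pbox are installed, uses the macros \censor, \blackoutenv and \endblackoutenv from that package, and the sample sentences are made up. As noted above, line breaks and hyphenation will not necessarily match the unredacted document.

```latex
\documentclass{article}
\usepackage{censor} % needs pbox.sty to be installed as well
\begin{document}
% Short fragment: \censor prints a rule in place of its argument.
The drop point is \censor{Harbour 7}.

% Longer passage: everything between the two macros is blacked out.
\blackoutenv
This sentence is hypothetical sample text that should not be readable.
Neither should this one.
\endblackoutenv
\end{document}
```

According to the package documentation, \StopCensoring switches the redaction off, so the same source can produce both versions.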

1 Answer


If you are willing to put a lot of effort into it, this brute-force method might help. Note that it does not understand line breaks (hyphenation), but you could compile, note the breaks, and substitute the fragmented text. Also, for whole lines you could simply use a rule of width \textwidth. This is compiler-agnostic.

\documentclass{article}
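\newif\ifredacted
\redactedtrue % use \redactedfalse to show the original text
\newsavebox\hideme
% Print a rule as wide as the saved box when redacting, else the box itself.
% The trailing \obeyspaces keeps the space after \blackout from being skipped;
% it is not grouped, so spaces stay active for the rest of the enclosing group.
\def\blackout{\ifredacted\rule{\wd\hideme}{.5em}\else\usebox\hideme\fi\obeyspaces}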
\begin{document}
\sbox\hideme{Battalion 43}
We will be sending \blackout to the island.\par
\sbox\hideme{04h36m tomorrow}
They arrive \blackout.\par
\end{document}
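
For the whole-line case mentioned above, a minimal sketch could look like this. The macro name \blackoutline is hypothetical (it is not part of any package), and \linewidth is used in place of \textwidth so that it also fits inside lists or minipages.

```latex
\documentclass{article}
\newif\ifredacted
\redactedtrue % use \redactedfalse to show the original text
% Hypothetical helper: typeset its argument as one unindented line,
% or a full-width rule of the same nominal height when redacting.
\newcommand\blackoutline[1]{%
  \noindent
  \ifredacted\rule{\linewidth}{.5em}\else#1\fi
  \par}
\begin{document}
Orders follow.\par
\blackoutline{The convoy leaves from the northern harbour at dawn.}
\end{document}
```

Each call covers exactly one line, so text that wraps would have to be split by hand after a first compile, just as with the boxed fragments.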
rallg
  • Your solution is nice and usable if only a few scattered passages need hiding, but I need to hide almost all the text of an entire (and long) book with images, tables, minipages, and verbatim blocks. I would need some command like \startblackout, robust enough to be added more or less at the beginning of the document, and another command \endblackout to be added close to the end of the document. – mmj Jan 09 '24 at 22:26
  • @mmj In that case, the best solution is to redact using Adobe Acrobat Pro. If you are at a university, its publications office might have it there, even if it is not available elsewhere on campus. If someone's job is to create glossy ads for the university, that's whom you should ask. Or, I believe it is possible to "rent" the program for a month, from Adobe. Not sure, though. You would need the correct platform. – rallg Jan 10 '24 at 02:27