
I use l e t t e r s p a c i n g provided by soul/soulutf8 to emphasize words. But

  • If I try to copy a spaced word from the PDF file, I get "w o r d" instead of "word".

  • Also, searching for "word" in the PDF file does not find the spaced word.

Is there a trick to fix this problem?

  • with soul imho the only thing that could work is to add an /ActualText with accsupp. With lualatex, microtype and fontspec it can work out of the box if the pdf is tagged and real space chars are used, but not in all pdf viewers. – Ulrike Fischer Jan 20 '22 at 21:33
  • @UlrikeFischer Thank you, I am staring at https://ctan.math.washington.edu/tex-archive/macros/latex/contrib/axessibility/axessibility.pdf but it does not help so far. Could you give me a more precise hint? – Anton Petrunin Jan 20 '22 at 23:06
  • No, that won't work. The only one that works is tagpdf. But see https://tex.stackexchange.com/a/605142/2388 – Ulrike Fischer Jan 20 '22 at 23:16
  • @UlrikeFischer, sorry, still could you say what has to be written instead of \so{word}? – Anton Petrunin Jan 21 '22 at 00:51
  • Search this site for uses of accsupp, such as https://tex.stackexchange.com/questions/233390/in-which-way-have-fake-spaces-made-it-to-actual-use/233397#233397 and https://tex.stackexchange.com/questions/198516/is-there-such-thing-as-visual-only-whitespace – Steven B. Segletes Jan 21 '22 at 03:13
  • @StevenB.Segletes Thank you very much, I will use it to answer my question, so it will be removed from unanswered. – Anton Petrunin Jan 21 '22 at 04:46
  • @AntonPetrunin: To mark the question as solved, you may mark the answer as accepted (green checkmark), including your own answer. Also, please see my suggestion concerning the letterspace package. – marquinho Jan 21 '22 at 23:35

2 Answers


The problem is solved, thanks to Steven B. Segletes:

\documentclass{article}
\usepackage{accsupp}
\usepackage{soulutf8}

\newcommand\an[1]{%
  \BeginAccSupp{method=escape,ActualText={\detokenize{#1}}}%
  \so{#1}%
  \EndAccSupp{}%
}

\def\emph{\an}

\begin{document}
Try to copy this \emph{word} --- you will see ``word'', but not ``\emph{word}''.
\end{document}


You've already found a satisfactory solution by (1) using accsupp and also (2) reducing the tracking (=intraword spacing) from a whopping 0.25em (\textso's default) to 0.1em. (In my experience, no. (2) by itself can solve the search and copy issue with most PDF readers.)
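For reference, the tracking reduction mentioned in (2) could look roughly like the sketch below. It relies on soul's \sodef, whose arguments are the font, the inter-letter space, the inner (interword) space and the outer space. The command name \tightso and the inner and outer space values are only assumptions chosen for illustration; the 0.1em inter-letter space is the value discussed above.

```latex
\documentclass{article}
\usepackage{soulutf8}

% \sodef<cmd>{<font>}{<inter-letter space>}{<inner space>}{<outer space>}
% \tightso is a hypothetical name; the inner/outer values are illustrative guesses.
\sodef\tightso{}{.1em}{.4em plus .1em minus .1em}{.4em plus .1em minus .1em}

\begin{document}
Default tracking: \so{word}.

Reduced tracking: \tightso{word}.
\end{document}
```

Whether the reduced tracking alone makes copy and search work depends on the PDF viewer, so it is worth testing with the readers you care about.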

In addition, I would like to suggest abandoning soul(utf8) in favor of the superior letterspacing tool from the microtype package (available separately as the letterspace package), particularly if you have to deal with long portions of text, indexing commands, or the like.

soulutf8, despite the recent updates and the Unicode compatibility, still suffers from some limitations of the old soul package. \textso interacts poorly with some other macros. Of the more common ones, using \index or \cite in the argument aborts compilation, while using \mbox or math mode breaks the letterspacing functionality (examples below). There are workarounds, but they can get pretty ugly: in the last example, you'd need to "close" and "reopen" \textso every time, which in turn makes for too short interword spaces.

letterspace handles these cases easily, besides being more flexible in choosing the amount of letterspacing (case by case or globally). And of course, it can be combined with accsupp as in Steven Segletes' and your approach.
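As a rough sketch of that combination, the accsupp wrapper from the other answer can be kept and \so swapped for \textls. The command name \spaced is hypothetical, and the package option letterspace=100 (tracking in thousandths of an em, so 100 is 0.1em) is, as far as I know, how the global amount is set; the optional argument of \textls overrides it case by case. This assumes pdflatex or lualatex, since letterspace does not support XeTeX.

```latex
\documentclass{article}
\usepackage{accsupp}
\usepackage[letterspace=100]{letterspace} % global default: 100/1000 em

% \spaced is a hypothetical name: letterspaced text whose copied/searched
% form is the unspaced argument, supplied through /ActualText.
\newcommand\spaced[1]{%
  \BeginAccSupp{method=escape,ActualText={\detokenize{#1}}}%
  \textls{#1}%
  \EndAccSupp{}%
}

\begin{document}
Try to copy this \spaced{word}.

Case-by-case amount: \textls[250]{word}.
\end{document}
```

With moderate tracking, many viewers already copy \textls text correctly without accsupp, so the wrapper mainly matters for large amounts of letterspacing.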

letterspacing examples

\documentclass{article}

\usepackage{letterspace} % or microtype
\usepackage{soulutf8}

\begin{document}

\textso{triangle $ABC$} (\verb|\textso|)

\textls[250]{triangle $ABC$} (\verb|\textls|)

\bigskip

\textls{foo \mbox{bar} \index{baz}baz (\cite{mybook})} (\verb|\textls|, mbox index and cite succeed)

\textls[250]{foo bar baz} (\verb|\textls|)

%\textso{foo bar \index{baz}baz} (\verb|\textso|) % doesn't compile!
%\textso{foo bar baz \cite{mybook}} (\verb|\textso|) % doesn't compile!

\textso{foo \mbox{bar} \protect\index{baz}baz (\protect\cite{mybook})} (\verb|\textso|, mbox index and cite prevent letterspacing)

\textso{foo} \mbox{\textso{bar}} \index{baz}\textso{baz} (\cite{mybook}) (\verb|\textso|, requires some contortions, messes up interword spaces)

\end{document}

marquinho