
A follow-up to this question.

For this and quite a few other fonts, the small-cap glyphs copy as uppercase, as in lines 1 and 4 of the output below. I am trying to override each individual Unicode value so that the copied text matches the input, and to do the same for commands, expanded as far as possible. Maybe I am misunderstanding how expansion works.

  1. Is \@tfor the right tool for this? Can it be made to respect spaces while still affecting each character individually?

  2. accsupp seems to have more levels of expansion. Can this be extended? What is the proper use of expansion commands? Is there a command to expand until there are no more commands to expand?

Output

MULTIPLE WORDS OCTOBER 11, 2018
MultipleWords\today 
MultipleWords\today 
MULTIPLE WORDS OCTOBER 11, 2018
MultipleWordsOctober 11, 2018
MultipleWordsOctober 11, 2018

MWE

\documentclass{report}

\usepackage{fontspec}
\setmainfont{SourceSerifPro-Regular.otf}

\usepackage{tagpdf}
\tagpdfsetup{uncompress,activate-all}

\usepackage{accsupp}

\makeatletter

\DeclareRobustCommand*{\actualtext}[1]{{%
    \@tfor\next@letter:=#1\do{%
        \tagmcbegin{tag=Span,actualtext-o=\next@letter}%
        \next@letter%
        \tagmcend%
    }%
}}%

\DeclareRobustCommand{\PDFreplace}[1]{{%
    \@tfor\next@letter:=#1\do{%
        \BeginAccSupp{method=escape,ActualText=\next@letter}%
        \next@letter%
        \EndAccSupp{}%
    }%
}}%

\makeatother

\begin{document}

{\scshape{Multiple Words \today}}

{\scshape\actualtext{Multiple Words \today}}

{\scshape\expandafter\actualtext\expandafter{Multiple Words \today}}

{\scshape{Multiple Words \today}}

{\scshape\PDFreplace{Multiple Words \today}}

{\scshape\expandafter\PDFreplace\expandafter{Multiple Words \today}}

\end{document}
gnucchi
  • I don't understand what you want. You seem to set the letter itself as ActualText. What is the purpose? What do you mean by level of expansion? Your \expandafters are doing nothing; they only "expand" the "M" of Multiple. – Ulrike Fischer Oct 11 '18 at 20:53
  • Besides this: be careful with activate-all. You are not adding structures. – Ulrike Fischer Oct 11 '18 at 20:55
  • In this case, and for other fonts, the small-cap glyphs copy as uppercase, as in lines 1 and 4 of the output above. So you are right that it seems redundant, but it is not here. I am trying to override each individual Unicode value so that the copied text matches the input, and to do the same for commands, expanded as far as possible. (I really can't wrap my head around how expansion works with the docs I have found.) – gnucchi Oct 11 '18 at 21:25
  • You should describe better what you really want to do. I suspect that ActualText is not the answer but that you want to adjust the tounicode values. – Ulrike Fischer Oct 11 '18 at 21:33

1 Answer


I assume that your real goal is for lower-case small caps to copy as lower case. While this can be done for small pieces of text with ActualText, it won't work with arbitrary text that may not be expandable.

With LuaTeX a better solution is to change the tounicode values:

\documentclass{report}

\usepackage{fontspec}
\setmainfont{SourceSerifPro-Regular.otf}
\usepackage{luacode}
\begin{luacode}

local scnames = { -- needs extension, can be done with a lua loop.
    ["A.sc"] = "0061",
    ["B.sc"] = "0062",
    ["C.sc"] = "0063",
    ["D.sc"] = "0064",
    ["E.sc"] = "0065",
    ["F.sc"] = "0066",
    ["G.sc"] = "0067",
    ["I.sc"] = "0069",
    ["L.sc"] = "006C",
    ["O.sc"] = "006F",
    ["P.sc"] = "0070",
    ["R.sc"] = "0072",
    ["S.sc"] = "0073",
    -- ....
    ["T.sc"] = "0074",
    ["U.sc"] = "0075",
    }


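-- For every glyph whose name is listed in scnames, overwrite its tounicode
-- value so that the small cap copies as the matching lower-case letter.
local patch_sccopy = function (fontdata)
    if fontdata.resources.unicodes then
        -- k is the glyph name, v is the slot the glyph occupies in the font
        for k, v in pairs(fontdata.resources.unicodes) do
            if scnames[k] and fontdata.characters[v] then
                fontdata.characters[v]["tounicode"] = scnames[k]
            end
        end
    end
end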

luatexbase.add_to_callback
 (
  "luaotfload.patch_font",
   patch_sccopy,
  "change_copy_sclowercase"
 )
\end{luacode}
\begin{document}

{\scshape Multiple Words \today }

Copies as: Multiple Words October 12, 2018

\end{document}
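
The comment on the scnames table notes that it can be filled with a Lua loop. A minimal sketch of such a loop follows. It assumes that the font names its small-cap glyphs A.sc through Z.sc, as the table above does, and it covers only the basic Latin letters. The helper name build_scnames is hypothetical.

```lua
-- Build the table "small-cap glyph name -> tounicode value" for A..Z.
-- Assumption: the font names its small caps "A.sc" ... "Z.sc".
local function build_scnames()
    local names = {}
    for code = string.byte("A"), string.byte("Z") do
        -- The lower-case letter sits 0x20 above the upper-case one.
        names[string.char(code) .. ".sc"] = string.format("%04X", code + 0x20)
    end
    return names
end

local scnames = build_scnames()
```

Used in the document above, this would replace the hand-written table. The luacode* environment passes its body to Lua verbatim, which keeps the % in the format string safe.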

I didn't check whether other fonts use A.sc as the name for the upper-case small-cap A, in which case it would now wrongly copy as a. If that happens, the patch would have to be restricted to specific fonts.
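
One way such a restriction might look is sketched below. It is not tested against real fonts and rests on two assumptions: that the font table passed to the callback carries the PostScript name in a psname field, and that this name is SourceSerifPro-Regular for the font used here. The names sc_fonts and patch_sccopy_restricted are hypothetical, and the scnames table is shortened for the sketch.

```lua
-- Hypothetical allow-list: fonts whose "X.sc" glyphs are lower-case small caps.
local sc_fonts = { ["SourceSerifPro-Regular"] = true }

-- Shortened mapping, for illustration only.
local scnames = { ["A.sc"] = "0061", ["B.sc"] = "0062" }

local function patch_sccopy_restricted(fontdata)
    -- Assumption: fontdata.psname holds the PostScript name of the font.
    if not sc_fonts[fontdata.psname or ""] then
        return -- leave every font outside the allow-list untouched
    end
    local unicodes = fontdata.resources and fontdata.resources.unicodes
    if not unicodes then
        return
    end
    for name, slot in pairs(unicodes) do
        local char = fontdata.characters[slot]
        if scnames[name] and char then
            char.tounicode = scnames[name]
        end
    end
end
```

It would be registered with the same luatexbase.add_to_callback call as patch_sccopy above, in place of that function.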

Ulrike Fischer