
I added \cprotect around my section titles (I need to do this because some of them may contain math). Now, when I open the PDF file, all bookmarks are lost: in their place I see something like cpt on each one, and the section and subsection names do not show up. I was previously advised to use \usepackage[bookmarks=false]{hyperref}, but at the time I did not know what this meant. I have since noticed that it causes the PDF bookmarks to be lost, so I can't use that solution.

MWE

\documentclass[12pt]{book}
\usepackage{cprotect}
\usepackage{hyperref}

\begin{document}

\chapter{A}
\cprotect\section{B}
\cprotect\subsection{C}
stuff

\cprotect\subsection{D}
stuff

\end{document}

Compiled using lualatex

.....
Chapter 1.
(./foo3-1.cpt)

Package hyperref Warning: Token not allowed in a PDF string (Unicode):
(hyperref)                removing `\@ifnextchar' on input line 8.

(./foo3-2.cpt)

Package hyperref Warning: Token not allowed in a PDF string (Unicode):
(hyperref)                removing `\@ifnextchar' on input line 9.

(./foo3-3.cpt)

Package hyperref Warning: Token not allowed in a PDF string (Unicode):
(hyperref)                removing `\@ifnextchar' on input line 12.

And the PDF file bookmarks look like this (Adobe PDF reader)

Enter image description here

After removing cprotect, it works:

\documentclass[12pt]{book}
\usepackage{hyperref}
\begin{document}

\chapter{A}
\section{B}
\subsection{C}
stuff

\subsection{D}
stuff

\end{document}

gives

Enter image description here

I want to use cprotect, but not lose PDF bookmarks. Is there a way to do this?

TL 2020 on Linux

Update to answer comment

Please edit your posting to give an actual example of a \section command that's causing grief.

Here is an example that fails with LuaLaTeX unless I use \cprotect; and when I use \cprotect, the bookmarks are lost. This happens because I am using \usepackage{Baskervaldx}, a font I like.

\documentclass[12pt]{book}

\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\setmathfont{Asana Math}
\usepackage{Baskervaldx}

\usepackage{amsmath}
\usepackage{hyperref}

\begin{document}
\tableofcontents

\chapter{A}
\section{$\cos\left(  A+B\right)  $ and $\sin\left(  A+B\right)  $}%

\subsection{C}
stuff
\subsection{D}
stuff

\end{document}

Compile using LuaLaTeX gives

Package hyperref Warning: Token not allowed in a PDF string (Unicode):
(hyperref)                removing `math shift' on input line 15.

! Improper alphabetic constant.
<to be read again>
\math@bgroup
l.15 \section{$\cos\left(  A+B\right)  $ and $\sin\left(  A+B\right)  $}
                                                                      %
?

But if I use cprotect, it compiles with no error, although there are now no bookmarks:

\documentclass[12pt]{book}

\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\setmathfont{Asana Math}
\usepackage{Baskervaldx}

\usepackage{amsmath}
\usepackage{hyperref}

\usepackage{cprotect}
\begin{document}
\tableofcontents

\chapter{A}
\cprotect\section{$\cos\left(  A+B\right)  $ and $\sin\left(  A+B\right)  $}%

\subsection{C}
stuff
\subsection{D}
stuff

\end{document}

gives

Enter image description here

I have many many such examples. Here is another

 \section{ this is $\zeta$ }%

gives

Package hyperref Warning: Token not allowed in a PDF string (Unicode):
(hyperref)                removing `math shift' on input line 15.

! Improper alphabetic constant.
<to be read again>
\mitzeta
l.15 \section{ this is $\zeta$ }
                              %
?

Please note that these all fail, because I am using the font

\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\setmathfont{Asana Math}
\usepackage{Baskervaldx}

I could of course not use the above font, and then it will compile OK and the bookmarks will remain (but without the math, which is OK with me). So maybe I have to do this and give up the above font, which I like; having the bookmarks is more important. This is an option if there is no other solution.

Is it possible to tell hyperref that, if it finds something it cannot put in the bookmark, it is OK to replace it with .cpt for that section only, but not for everything?

The problem is that I pre-process the whole LaTeX file, and add \cprotect around each section and subsection just in case they have math in them. So now all bookmarks are lost.

I can not do this case by case, since I have 10's of thousands of such entries.

Note on error found and solution

This is too small to write in comment, so I am adding it here.

An error is generated due to wrong order of packages. It has nothing to do with luacode.

This fails

% !TEX TS-program = lualatex
\documentclass{book}

\usepackage{amsmath,mleftright}
\usepackage{unicode-math}
\usepackage{Baskervaldx}
\setmathfont{Asana Math}[Scale=MatchLowercase]
\usepackage{xcolor}
\usepackage[colorlinks,allcolors=blue,linktocpage]{hyperref}

\begin{document}

\section{Solve numerically the ODE $u''''+u=f$ using point collocation method}

test

\end{document}

Compiled using LuaLaTeX gives

t) (./foo3.out)
! Undefined control sequence.
\g__um_prime_font_cmd_tl ->\l__um_font

l.14 \section{Solve numerically the ODE $u''''+u=f$ using point collocation method}

?

The fix is to put \usepackage{Baskervaldx} after \setmathfont{Asana Math}, so the order becomes

\usepackage{amsmath,mleftright}
\usepackage{unicode-math}
\setmathfont{Asana Math}[Scale=MatchLowercase]
\usepackage{Baskervaldx}
\usepackage{xcolor}
\usepackage[colorlinks,allcolors=blue,linktocpage]{hyperref}

And now it compiles OK. It has nothing to do with math in section. Here is an example:

% !TEX TS-program = lualatex
\documentclass{book}
\usepackage{amsmath,mleftright}
\usepackage{unicode-math}
\usepackage{Baskervaldx}
\setmathfont{Asana Math}[Scale=MatchLowercase]

\usepackage{xcolor}
\usepackage[colorlinks,allcolors=blue,linktocpage]{hyperref}

\begin{document}

\section{test}

Solve $y''(x)-3 y(x) = -x^2$ over $x=0\ldots1$ with boundary conditions
$x(0)=0$ and $x(1)=0$ using piecewise linear trial functions.
\end{document}

It compiles with an error:

! Undefined control sequence.
\g__um_prime_font_cmd_tl ->\l__um_font

l.17 Solve $y''(
              x)-3 y(x) = -x^2$ over $x=0\ldots1$ with boundary conditions
?

Again, changing the order of packages, the error is gone. This is why I was getting some error testing Mico's nice code.

Nasser
  • Please explain why you're using \cprotect. Having math material in the argument of \section is not a (valid) reason for needing to use \cprotect. – Mico Jun 07 '20 at 05:34
  • @Mico But if I have math in section title and do not use \cprotect I get errors on some math constructs. When I add \cprotect the error goes away, but I lose the bookmarks. I also do not want to use something like \texorpdfstring to rewrite each section title. So now I automatically add \cprotect around each section and subsection title for everything, just in case there is some math in them., – Nasser Jun 07 '20 at 05:37
  • If you're getting error messages (as opposed to warning messages) due to the presence of math material in the argument of \section, it must be because there are syntax errors in the math part. The purpose of \cprotect is to deal with verbatim material in "moving arguments" (sorry for the LaTeX jargon) of LaTeX commands, such as \section. Dealing with verbatim material is a topic that's entirely separate from dealing with math material. Please edit your posting to give an actual example of a \section command that's causing grief. – Mico Jun 07 '20 at 05:43
  • @Mico fyi, added an example. I asked about this before. I am just asking here of a way not to lose all the bookmarks. – Nasser Jun 07 '20 at 05:59
  • I suspect that using \cprotect in the current setting is simply an abuse of its purpose. Since you use LuaLaTeX, why don't you write a preprocessor routine that converts math parts to their plain text representations if they occur in the scope of \section, \subsection, etc? Incidentally, I sure hope that your claim that you have "10's of thousands" of such entries is intended as hyperbole; I shudder to think what a document with several thousand sectioning commands might look like... – Mico Jun 07 '20 at 06:22
  • @Mico it is not one document, there are 100's of documents, and each has thousands of pages and so on. But the core of the problem is using the font I like as I show above. If I do not use the font, the error goes away as I mentioned above and bookmarks remain there. – Nasser Jun 07 '20 at 06:24
  • @Mico the errors are because of the math. Hyperref doesn't like some setup done by unicode-math. – Ulrike Fischer Jun 07 '20 at 06:33
  • You shouldn't use \usepackage{Baskervaldx} with lualatex. That is a pdftex font package; it uses an unsuitable font encoding and it will break all sorts of things. Add e.g. Grüße to see one of the problems. – Ulrike Fischer Jun 07 '20 at 15:22
  • @UlrikeFischer You shouldn't use \usepackage{Baskervaldx} with lualatex. That is a pdftex font package I did not know this, thanks for the info. Do you happen to know similar font that will work with lualatex? I liked this font. But it seems to cause more trouble than it worth it to use with lualatex. All this trouble is due to this font. I am trying to find another font to use, so I do not need to do all this just to compile my files. – Nasser Jun 08 '20 at 09:05
  • There is an OpenType variant: https://ctan.org/tex-archive/fonts/baskervaldx/opentype. You can use it with \setmainfont{...}. – Ulrike Fischer Jun 08 '20 at 09:14

3 Answers


I think that using \cprotect in the current context constitutes a pretty severe abuse of the macro. Moreover, as you've discovered, it doesn't work properly since the bookmarks for the pdf viewer program are no longer being generated correctly.

Since you're using LuaLaTeX, I would like to suggest a different approach, viz., setting up a Lua function which operates at a very early stage, i.e., before TeX starts its usual processing routines. Once assigned to LuaTeX's process_input_buffer callback, the Lua function sweeps over all instances of \section, \subsection, and \subsubsection, automatically identifies all instances of inline math material, and places them in \texorpdfstring directives, in essence "sanitizing" the math expressions for use by hyperref's bookmarking routines. For instance,

\subsection{$x^2+y^2=z^2$}

will be replaced "on the fly" with

\subsection{\texorpdfstring{$x^2+y^2=z^2$}{x2+y2=z2}}

and

\section{$\cos\left(A+B\right)$ \textcolor{red}{and} $\sin\left(A+B\right)$}

will be replaced on the fly with

\section{\texorpdfstring{$\cos\left(A+B\right)$}{cos(A+B)} 
         \textcolor{red}{and}   
         \texorpdfstring{$\sin\left(A+B\right)$}{sin(A+B)}}

The code below provides two LaTeX utility macros and two Lua functions. The LaTeX macros are called \texorpdfOn and \texorpdfOff; they serve to activate and deactivate a Lua function called fix_headers. Upon activation, i.e., upon assignment to LuaTeX's process_input_buffer callback, fix_headers checks all input lines; each time it comes across an instance of \section, \subsection, or \subsubsection or its "starred" variants, the Lua function next checks if the argument of that command contains inline math material by searching for pairs of the character $. If a match occurs, a subsidiary Lua function called strip_math is called to generate one or more instances of

\texorpdfstring{$<unmodified math>$}{<sanitized math>}

inside the arguments of \section, \subsection, etc.

The input requirements are as follows:

  • Every sectioning command and its argument must be on the same input line. This is definitely the most stringent requirement.

  • In any given line of input, there is at most one instance of \section, \subsection, or \subsubsection or of one of the starred variants of these commands. (This is probably more a general input sanity check. However, I thought I should mention it anyway.)

  • There are no instances of verbatim material which contain sectioning instructions which, in turn, contain inline-math material. E.g., no instances of \verb+\subsection{$1+1=2$}+. (This could be relaxed by excluding all inline-verbatim material and the contents of environments such as verbatim, Verbatim, and comment from further processing; please pose a new question if this is a concern in practice. Alternatively, run \texorpdfOff just before reaching the verbatim material. Later, upon exiting the verbatim material, you may run \texorpdfOn again.)

  • There are no commands named \Xsection, \xyzsection etc. in the document (This requirement is imposed mostly for programming convenience. If need be, this requirement could be relaxed without too much extra work.)

  • The arguments of \chapter and \chapter* do not contain inline math material. (This requirement could also be relaxed without too much extra work.)

  • The $ character is used to delimit inline math material in the sectioning headers. (Instances of \$, which are used to typeset the $ symbol itself, are permitted.)

  • There is no display-math material in the arguments of \section, \subsection, etc. In particular, there are no instances of $$ in the arguments of \section, \subsection, etc.

  • Nested \frac expressions are not allowed. Non-nested \frac expressions are ok, though. Non-nested expressions of the form \frac{<numer>}{<denom>} are displayed in the bookmark as (<numer>)/(<denom>).

I will keep my fingers crossed that these input requirements aren't too burdensome.
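
As a small illustration of the verbatim-related requirement, here is a sketch of how the two utility macros might be used to pause and resume the preprocessing around a verbatim block. It assumes that \texorpdfOn has already been issued earlier in the document (removing a callback that was never registered raises an error).

\texorpdfOff % pause the 'fix_headers' callback
\begin{verbatim}
\subsection{$1+1=2$}
\end{verbatim}
\texorpdfOn  % resume processing of sectioning commands

Because process_input_buffer acts on each input line as it is read, the lines between the two macros should reach TeX unmodified.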


enter image description here

% !TEX TS-program = lualatex
%% (compile twice to update the ToC and bookmarks)
\documentclass{book} % or some other suitable document class
\usepackage{luacode} % for 'luacode*' environment
\begin{luacode*}
function strip_math ( u ) 
  -- Drop the '$' delimiters:
  v = u:sub  ( 2 , -2 ) 
  -- Three types of math directives that need to be modified:
      -- directives that need to be removed, e.g, \left and \biggr
      -- directives that need to be modified, e.g., \mid and \prime
      -- all others: just remove the leading backslash (\cos,\int,\log, ...)
  -- Remove all fence-sizing instructions from the input stream:
  v = v:gsub ("\\m?left" , "" ) 
  v = v:gsub ("\\m?right", "" )
  v = v:gsub ("\\[bB]igg?[lrm]?" , "" )
  -- Replace "\frac{...}{...}" with inline-fraction notation:
  v = v:gsub ("\\frac%s-(%b{})%s-(%b{})" , "(%1)/(%2)" ) 
  -- Delete '_' and '^' characters from input stream:
  v = v:gsub ("[%_%^]" , "" )   
  -- Change '\mid' to '|'
  v = v:gsub ("\\mid" , "|" )
  -- Change \prime to '
  v = v:gsub ("\\prime" , "'" )
  -- Finally, change '\int' to 'int', '\sum` to 'sum', '\det' to 'det', etc.
  v = v:gsub ("\\(%a+)", "%1" ) 
  -- Return a "\texorpdfstring" directive:
  return "\\texorpdfstring{"..u.."}{"..v.."}"
end

function fix_headers ( s )
  s = s:gsub ( "(\\%l-section[%*]?)%s-(%b{})" , function ( x , y )
      -- Set aside all instances of "\$" (if any):
      y = y:gsub ( "\\%$" , "@@@@@@@@" )
      -- Apply 'strip_math' function if inline-math found:
      y = y:gsub ( "%b$$" , strip_math )
      -- Restore instances of "\$":
      y = y:gsub ( "@@@@@@@@" , "\\$" )
      return x..y
    end )
  return s
end

\end{luacode*}
%% Define a couple of utility LaTeX macros:
\newcommand\texorpdfOn{\directlua{luatexbase.add_to_callback(
  "process_input_buffer", fix_headers , "fix_headers" )}}
\newcommand\texorpdfOff{\directlua{luatexbase.remove_from_callback(
  "process_input_buffer", "fix_headers" )}}  

\usepackage{amsmath,mleftright}
\usepackage{unicode-math}
\setmainfont{Baskerville 10 Pro} % pick a suitable text font
\setmathfont{Asana Math}[Scale=MatchLowercase] % pick a suitable math font

\usepackage{xcolor}
\usepackage[colorlinks,allcolors=blue,linktocpage]{hyperref}

\begin{document}
\texorpdfOn % Activate the Lua function 'fix_headers'

\setcounter{secnumdepth}{3} % just for this example
\setcounter{tocdepth}{3}

\tableofcontents

\chapter{AAA}
\section{$\cos\left(  A+B\right)  $ \textcolor{red}{and} $\sin\left(  A+B\right)  $}
\subsection{$\det\bigl(A\bigr)$}
\subsubsection{$\ln \mleft[x\mright]$}
\subsubsection{$x^2+y^2=z^2$}
\subsection{$\int f(x)\,dx$}
\section{\textcolor{violet}{Hello World}}
\section{$\frac{a+b}{c+d}$ or $\frac{u}{v}$}
\subsection{$1+1+1=3$, and \$1+\$1+\$1=\textdollar3}
\subsection{Solve numerically the ODE $u''''+u=f$ using\dots}
\end{document}
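
One of the input requirements listed earlier is that the arguments of \chapter and \chapter* contain no inline math. A possible way to relax it is sketched below. This is an untested variant, not part of the code above: the names sanitize_arg and fix_headers_with_chapters are hypothetical, and the block assumes that strip_math has already been defined in an earlier luacode* environment.

\begin{luacode*}
-- Hypothetical helper: same per-argument logic as in 'fix_headers'
local function sanitize_arg ( x , y )
  -- Set aside all instances of "\$" (if any):
  y = y:gsub ( "\\%$" , "@@@@@@@@" )
  -- Apply 'strip_math' to each inline-math group:
  y = y:gsub ( "%b$$" , strip_math )
  -- Restore instances of "\$":
  y = y:gsub ( "@@@@@@@@" , "\\$" )
  return x..y
end

-- Hypothetical variant of 'fix_headers' that also handles \chapter
function fix_headers_with_chapters ( s )
  s = s:gsub ( "(\\%l-section[%*]?)%s-(%b{})" , sanitize_arg )
  s = s:gsub ( "(\\chapter[%*]?)%s-(%b{})" , sanitize_arg )
  return s
end
\end{luacode*}

To try it, one would register fix_headers_with_chapters instead of fix_headers in the definition of \texorpdfOn. The other input requirements (one sectioning command per line, argument on the same line) would still apply.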
Mico
  • Thanks, I am trying your code. But I get error font is not there. `luaotfload | db : Reload initiated (formats: otf,ttf,ttc); reason: "Font Baskerville10Pro not found.". luaotfload | resolve : sequence of 3 lookups yielded nothing appropriate.

    ! Package fontspec Error: The font "Baskerville10Pro" cannot be found.

    For immediate help type H . ->\tex_errmessage:D Package fontspec Error: The font "Baskerville10Pro" cannot be found.` I can use what I had there before for font, but wanted to ask if this font is something I should have? I am using TL 2020 on Linux.

    – Nasser Jun 07 '20 at 10:03
  • @Nasser - Please just replace \setmainfont{Baskerville 10 Pro} with \usepackage{Baskervaldx} -- or whichever other text font package you may be interested in. – Mico Jun 07 '20 at 10:04
  • fyi, I am testing your code on large latex file, and for some reason it fails on section at line around 5,000 (it is very large latex file). Your code works on small test. I am trying to find out why it fails on that section header when it is in the large file, but not when it is in small file. Is there limit on how many sections your code can handle? I am thinking if there many sections and subsections in one file, some limit is reached? – Nasser Jun 07 '20 at 10:32
  • fyi `Package hyperref Warning: Token not allowed in a PDF string (Unicode): (hyperref) removing math shift on input line 4626. ! Improper alphabetic constant. \mitzeta l.4626 \section{Compare the effect on the step response of a standard second order system as $\zeta$ changes} ?` But this section causes no issue when I put it in in small Latex file using your code. – Nasser Jun 07 '20 at 10:34
  • @Nasser - I've tweaked/edited the Lua code shown above slightly. Please check if \section{Compare the effect on the step response of a standard second order system as $\zeta$ changes} still causes problems for you. (It doesn't for me.) – Mico Jun 07 '20 at 10:39
  • @Nasser - LuaTeX should have no built-in limit of the type you appear to be experiencing. Which version of LuaTeX do you employ, how much RAM is installed on your system, and when did you last perform cold reboot? Older versions of LuaTeX were known have some memory leakage issues, which could be "fixed" by rebooting the system. – Mico Jun 07 '20 at 10:48
  • I am still testing your code; I want to make sure I am using it correctly. I will update you once I am sure everything is set up OK and it is not a mistake on my end somehow. Thanks. (Using the latest TL 2020, lots of RAM, 64 GB, etc.) – Nasser Jun 07 '20 at 10:50
  • FYI, I found one case where it fails. Please try this section \section{Solve numerically the ODE $u^{''''}+u=f$ using point collocation method} it gives error `(./foo3.out) ! Undefined control sequence. \g__um_prime_font_cmd_tl ->\l__um_font

    l.55 \section{Solve numerically the ODE \texorpdfstring{$u^{''''}+u=f$}{u{''''}+u=f} using point collocation method} ?` I will work around this one for now. But getting close to end of file now which is good.

    – Nasser Jun 07 '20 at 11:49
  • I think I found why it failed. Making one extra verification on it now..... – Nasser Jun 07 '20 at 12:05
  • @Nasser - Are you working with the latest version of my code? (I updated my answer most recently ca 45 minutes ago.) Also, do note that $u^{''''}$ is wrong: It should be $u''''$. (Or, for the OCD crowd: $u^{\prime\prime\prime\prime}$.) For sure, though, \section{Solve numerically the ODE $u''''+u=f$ using point collocation method} doesn't cause any problems on my end. – Mico Jun 07 '20 at 12:13
  • Yes, I am using your latest code. I also found why it fails, but I am doing a few extra checks now. I will let you know shortly. Thank you. By the way, $u^{''''}$ might be wrong, but it should not cause a compile error. FYI, the build finished and the file compiled OK, which is good! – Nasser Jun 07 '20 at 12:15
  • @Nasser - For more information on u' vs. u^{\prime}, see also difference between \prime and ' in math mode. For sure, u^{'} is simply not idiomatic. – Mico Jun 07 '20 at 12:19
  • All is OK now. I added a note to my question on the cause of the errors. Thank you. – Nasser Jun 07 '20 at 12:31
  • @Nasser - Glad it's all working now. Many thanks for the "checkmark"! – Mico Jun 07 '20 at 12:42

The issue doesn't depend on the particular fonts, but on unicode-math.

Using \cprotect is not the solution: you have nothing verbatim in the titles.

You can incrementally collect the “problematic” commands:

\documentclass[12pt]{book}

\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
%\setmathfont{Asana Math}
%\usepackage{Baskervaldx}

\usepackage{amsmath}
\usepackage{hyperref}

\pdfstringdefDisableCommands{%
  \def\sin{sin}\def\cos{cos}% <-- add here
  \let\left\relax
  \let\right\relax
}

\begin{document}
\tableofcontents

\chapter{A}
\section{$\cos\left(  A+B\right)  $ and $\sin\left(  A+B\right)  $}%

\subsection{C}
stuff
\subsection{D}
stuff

\end{document}

enter image description here
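
As a sketch of what the list might look like after a few more commands have been collected: the additional entries below (\tan, \ln, \det, \bigl, \bigr) are illustrative assumptions, not commands confirmed to be problematic in the question; they simply follow the same pattern as the entries above.

\pdfstringdefDisableCommands{%
  \def\sin{sin}\def\cos{cos}%
  \def\tan{tan}\def\ln{ln}\def\det{det}% assumed additions (operator names)
  \let\left\relax
  \let\right\relax
  \let\bigl\relax % assumed additions (fence sizing)
  \let\bigr\relax
}

Each entry only affects the text written to the bookmarks; the typeset section titles are unchanged.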

egreg
  • Thank you for the solution, but I wanted to keep same font I was using, and not do special custom handling for each case by case. But I appreciate your solution. – Nasser Jun 07 '20 at 10:05
  • @Nasser If you comment out the two lines related to the font, you get no issue. – egreg Jun 07 '20 at 10:11
  • @Nasser egreg's solution to use \pdfstringdefDisableCommands is what I would recommend too. It is a bit work until you identified all problematic cases but when you have done it, it will be stable. And you can decide for each case which output in the bookmarks you want. – Ulrike Fischer Jun 07 '20 at 10:27
  • The problem I can't select all math cases, I have literally 100's of thousands of such sections. Generated by programs from CAS. So I can't sit and find each math entry in each section. So I am afraid this will not work for me. But thanks again. – Nasser Jun 07 '20 at 10:31
  • @Nasser the question is not how many sections you have but how many different math commands occur in these sections. Also, with Mico's code you can't be sure that it caught every case correctly, so you will have to check that his replacement gives the right output in the bookmarks. – Ulrike Fischer Jun 07 '20 at 13:06
  • @UlrikeFischer - To help me make sure I understand your comment, would you mind clarifying the meaning of "with Mico's code you can't be sure that it [caught] every case correctly". AFAICT, there are three distinct types of substitution rules: some commands need to be dropped entirely from the bookmark (e.g, \left, \Biggr); some commands need a change in appearance (e.g., \prime -> ' and \mid -> |); and all other math macros can be rendered as the names themselves minus the leading backslash (\int -> int, \sum -> sum, etc.) Did I miss more (types of) commands? – Mico Jun 07 '20 at 18:56
  • @Mico no idea, that's why I said it must be checked. I mean I know neither which math commands Nasser is using nor which are actually problematic in bookmarks/with hyperref. – Ulrike Fischer Jun 07 '20 at 20:03

The problem with \cos and \sin can be resolved by making \operator@font robust. I opened an issue for it at the unicode-math github https://github.com/wspr/unicode-math/issues/550

This resolves one problem; it does not mean that every math expression will work without error.

\documentclass[12pt]{book}

\usepackage{unicode-math}
\setmathfont{Asana Math}
\usepackage{hyperref}
\makeatletter
\ExplSyntaxOn
\cs_set_protected:Npn \operator@font
  {
    \__um_switch_to:n {literal}
    \__um_fontswitch:n { \g__um_operator_mathfont_tl }
  }
\ExplSyntaxOff
\makeatother
\begin{document}
\tableofcontents

\chapter{A}
\section{$\cos\left(  A+B\right)  $ and $\sin\left(  A+B\right)  $}%


\end{document}

Problems with \zeta and similar commands can be avoided by loading hyperref with the psdextra option:

\documentclass{article}
\usepackage{unicode-math}
\usepackage[psdextra]{hyperref}

\begin{document}

\section{$\zeta$}

\end{document}
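
For an individual title that still causes trouble after these changes, hyperref's \texorpdfstring can supply the bookmark text by hand. The following is only a minimal sketch; the replacement text "zeta" is an arbitrary choice made for illustration.

\documentclass{article}
\usepackage{unicode-math}
\usepackage{hyperref}

\begin{document}

% First argument: typeset title; second argument: text used in the bookmark
\section{This is \texorpdfstring{$\zeta$}{zeta}}

\end{document}

This is the same mechanism that an automated preprocessing approach would insert; doing it by hand is practical only when few titles are affected.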
Ulrike Fischer