Bug in xelatex + unicode-math + vphantom

Question

This question is a simplified version of a slightly more complicated problem with XITS Math: Consider the following code:

\documentclass{article}
\usepackage{unicode-math}
\begin{document}
\(|^a\) \(\vphantom{|}^a\)

\(|^a\) \(\mathord{\vphantom{|}}^a\)

\(|^a\) \(\mbox{\vphantom{$|$}}^a\)
\end{document}

In the first line, the two occurrences of 'a' sit at the same height when compiling with lualatex (as expected) but not when compiling with xelatex (which I consider a bug). The second and third lines are unsuccessful attempts at a workaround. Any hints?

A box can never be the nucleus for a math atom, so it cannot receive superscripts. You have to brace the \vphantom. — egreg, Feb 26 '16 at 21:08
Additionally, there are fontdimen scaling problems in XeTeX, see https://tex.stackexchange.com/questions/281549/why-is-the-fraction-off-the-math-axis-in-xetex — Henri Menke, Feb 26 '16 at 21:19
@egreg: \mbox{{\vphantom{$|$}}}^a gives the same result as \mbox{\vphantom{$|$}}^a for me. — Frank Feuerstern, Feb 26 '16 at 21:21
@FrankFeuerstern It should be {\mbox{\vphantom{$|$}}}^a or just {\vphantom{$|$}}^a. — Henri Menke, Feb 26 '16 at 21:26
@Henri Menke: {\mbox{\vphantom{$|$}}}^a still gives the same result for me. {\vphantom{$|$}}^a doesn't compile. — Frank Feuerstern, Feb 26 '16 at 21:28
@egreg: “A box can never be the nucleus for a math atom”??? — GuM, Feb 26 '16 at 21:55
@GustavoMezzetti Right, a \phantom can't. Try $\displaystyle\vphantom{\sum}^x\sum\nolimits^x$\bye — egreg, Feb 26 '16 at 22:01
@egreg: So you actually meant: A \mathchoice can never be the nucleus of a math atom, so it cannot receive superscripts. This is true. — GuM, Feb 26 '16 at 22:05
@GustavoMezzetti Yes, that's the thing! Never do things when you are also coping with more important ones. ;-) I can confirm my opinion that \mathchoice is the most unfortunate aspect of TeX. — egreg, Feb 26 '16 at 22:09
With amsmath instead of unicode-math, the first line works great. — Frank Feuerstern, Feb 26 '16 at 22:49
https://sourceforge.net/p/xetex/bugs/87/ https://github.com/wspr/unicode-math/issues/267 — Frank Feuerstern, Feb 26 '16 at 23:15
The two math lists produced by $\vphantom{|}^a$ and by $\mathchoice{\setbox2=\null \ht2=7.5pt \dp2=2.5pt \box2}{\setbox2=\null \ht2=7.5pt \dp2=2.5pt \box2}{\setbox2=\null \ht2=5.24998pt \dp2=1.75pt \box2}{\setbox2=\null \ht2=3.75pt \dp2=1.25pt \box2}^a$ seem identical, yet their translation to a horizontal list is different (but only with unicode-math): quite puzzling… :-/ — GuM, Feb 26 '16 at 23:40
@Gustavo Mezzetti: As of today, I'm afraid I'm done: unfortunately, I do not know enough of TeX to understand or debug your code. — Frank Feuerstern, Feb 27 '16 at 00:02
@GustavoMezzetti I think the difference is that in the first case a math symbol is typeset (for building \box0 that's later used for setting the dimensions of \box2). — egreg, Feb 27 '16 at 00:43
@egreg: But it is typeset while constructing the math list: how can this influence the conversion from math list to horizontal list? In any case, also adding \setbox0=\hbox{${|}$} to all four branches of my hand-crafted \mathchoice doesn’t change the output. But I think it’s enough for now: let us think again about this mystery tomorrow. EDIT: Just found it: add \setbox0=\hbox{$\scriptscriptstyle{|}$} to the last branch and you’ll see! But how can this happen? — GuM, Feb 27 '16 at 01:34
I'm voting to close this question as off-topic because the bug is fixed in TL2017. — Henri Menke, Jun 11 '17 at 01:21
@HenriMenke it looks as if the problem is not completely resolved. Superscripts work, but subscripts can still be shifted slightly differently. See https://tex.stackexchange.com/a/374902/2388 or try in Gustavo Mezzetti's answer _t instead of ^a. — Ulrike Fischer, Jun 14 '17 at 16:00
@UlrikeFischer Indeed, that is kind of disappointing. Well, the future belongs to LuaTeX then I guess. I would leave this question closed because it is explicitly about superscripts and these seem to be fixed. — Henri Menke, Jun 14 '17 at 21:25

GuM · Answer 1 · 2016-02-27T12:59:01.747

I am making this an answer only because it doesn’t fit in a comment: It looks like XeTeX “remembers” the math style that was in force when it last typeset a symbol inside a formula, and continues to apply it for choosing certain \fontdimen parameters in subsequent formulas as long as they begin with an atom whose nucleus is empty (or contains an empty math list, which has the same effect). Consider

\documentclass[a4paper]{article}
\usepackage{unicode-math}


\begin{document}
Just try:
$^a$ $\scriptscriptstyle x$ $^a$ $\mathchoice{}{}{}{}^a$ $x^a$ $^a$
\end{document}

whose output is (image not preserved)

However, this only happens when the unicode-math package is loaded.

This error seems to occur during the process of transforming a math list into its horizontal equivalent, as if XeTeX failed to initialize the “current style” in some situations, or more precisely some information pertaining to the choice of the \fontdimen parameters involved in typesetting the superscript. But this is only a guess (after all, the question asks for “any hints”).
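One way to probe this guess without comparing glyph positions by eye is to log the height of a box containing $^a$ right after formulas typeset in different styles. The following is only a sketch: \probebox and \probeafter are hypothetical helper names introduced here, and whether the logged heights differ depends on the XeTeX version in use.

```latex
\documentclass{article}
\usepackage{unicode-math}
\newsavebox\probebox % hypothetical scratch box for the measurement
% #1: a formula to typeset first; afterwards the height of a box
% holding $^a$ is written to the terminal and the log file.
\newcommand\probeafter[1]{%
  #1 \sbox\probebox{$^a$}%
  \typeout{after \detokenize{#1}: superscript box height = \the\ht\probebox}}
\begin{document}
\probeafter{$\textstyle x$}
\probeafter{$\scriptscriptstyle x$}
\probeafter{$x^a$}
\end{document}
```

If the hypothesis holds, the value logged after the \scriptscriptstyle formula should differ from the other two under an affected xelatex, while lualatex should report the same height in all three cases.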

Where does the behavior described in the question come from, then? Well, \vphantom{|}, when used in math mode, invokes a primitive command of TeX called \mathchoice, which (to put it simply) typesets the “|” character in all four math styles (\displaystyle, \textstyle, \scriptstyle, and \scriptscriptstyle) in order to figure out how tall the phantom should be in each of them. Thus, the last thing \vphantom{|} does is to typeset $\scriptscriptstyle |$, and when the closing $ is sensed, this formula is translated into a horizontal list (and saved into a temporary box, but this doesn’t matter here); if our guess is correct, this leaves somewhere the information that the \fontdimen parameters should be chosen for a “current style” equal to \scriptscriptstyle. Then (Xe)TeX returns to the (still empty) math list that it was constructing when it saw the \vphantom{|}, and appends a “four-way choice” node to it, containing the four versions of the phantom, one for each of the possible styles.
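The mechanism just described can be sketched in a few lines. This is a simplified imitation of what the LaTeX kernel does for \vphantom in math mode, not its exact source; \myvphantom and \my@vph are hypothetical names introduced only for illustration.

```latex
\documentclass{article}
\makeatletter
% \mathpalette expands to \mathchoice and calls \my@vph once per style,
% passing the style command as #1 and the material as #2.
\newcommand*\myvphantom[1]{\mathpalette\my@vph{#1}}
\newcommand*\my@vph[2]{%
  \setbox\z@\hbox{$\m@th#1{#2}$}% typeset the material in style #1
  \setbox\tw@\null              % start from an empty box
  \ht\tw@\ht\z@ \dp\tw@\dp\z@   % copy height and depth, keep width 0pt
  \box\tw@}                     % put the empty box into the math list
\makeatother
\begin{document}
\(\vphantom{|}^a\) \(\myvphantom{|}^a\)
\end{document}
```

Note that the inner formula for the \scriptscriptstyle branch is the last one to be typeset, which is the step the hypothesis above relies on.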

Next comes the ^a. Now, a fact that seems to be easily forgotten is that a math list can contain several kinds of nodes (The TeXbook lists them all at the bottom of page 157), but that only atoms can carry sub/superscripts; a “four-way choice” node can’t, so (page 291, description of the “superscript” command) an Ord atom with an empty nucleus is appended to the list, and the “a” becomes its superscript. At this point the math list looks like this:
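A listing of this kind can be obtained with the \showlists primitive. The following is a minimal sketch; the exact log output may differ between engines and versions.

```latex
\documentclass{article}
\usepackage{unicode-math}
\begin{document}
% LaTeX keeps box displays very terse by default, so raise the limits
% and echo the diagnostic output to the terminal as well as the log.
\showboxdepth=100 \showboxbreadth=100 \tracingonline=1
% \showlists prints the lists under construction, including the current
% math list, and then pauses like \show (press Return to continue).
$\vphantom{|}^a \showlists$
\end{document}
```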

\mathchoice
D\mathord
D.\hbox(7.5+2.5)x0.0
T\mathord
T.\hbox(7.5+2.5)x0.0
S\mathord
S.\hbox(5.24998+1.75)x0.0
s\mathord
s.\hbox(3.75+1.25)x0.0
\mathord
^\fam0

This list is then converted to its horizontal equivalent. According to our hypothesis, XeTeX still has some “dirty” variable saying that the \fontdimen parameters should be those of \scriptscriptstyle; at this point, everything goes as in the case of $\mathchoice{}{}{}{}^a$ in our code above: for some reason, the “four-way choice” node containing just empty boxes does not reset the \fontdimen information, and, according to Rule 4 of Appendix G, leaves just an Ord atom without sub/superscript, which is processed by Rules 14, 17, and 18, yielding simply an empty box of some (unimportant) height. Then comes the Ord atom with empty nucleus: Rule 14 just passes it over to Rule 17, which in turn, the nucleus being empty, goes on to Rule 18, the rule that takes care of positioning the sub/superscript. Thus we see that, under our hypothesis, Rule 18 is applied while the \fontdimen information is still “dirty”.

(Deep breath.)

This is essentially the same bug as this: https://tex.stackexchange.com/questions/281549/why-is-the-fraction-off-the-math-axis-in-xetex — Henri Menke, Feb 27 '16 at 08:48
Alas, I understand zero from your remarks---which is completely due to my ignorance. I guess, if you could push xetex developers in https://sourceforge.net/p/xetex/bugs/87 a bit, one might have a slightly higher chance of repairing. But so far, I apologize a lot for not being able to provide any useful information. — Frank Feuerstern, Feb 27 '16 at 14:37
