
I am trying to deal with an issue with rendering unicode characters (including combining characters) inside a verbatim environment. I am using xetex as my compilation engine. (For more context see this github issue).

I have found that the default lmtt does not seem to support β, â, or β̂ if they are entered literally in a verbatim block.

I have discovered that Source Code Pro supports these characters, but only when the font is loaded in certain ways.

In particular, if I try to use

\usepackage{sourcecodepro}

it successfully displays the characters, though the combining circumflex above the β fails to align properly.

Because I'm already using fontspec to get access to other unicode supporting fonts elsewhere, I'd prefer to use fontspec's utilities directly. But I find that when instead I use:

\usepackage{fontspec}
\setmonofont{Source Code Pro}

β's do not display (though the combining circumflexes do display).

MWEs:

SourcecodePro

\documentclass[11pt]{article}
\usepackage{sourcecodepro}
\begin{document}
\begin{verbatim}
β 
â 
β̂
\end{verbatim}
\end{document}

fontspec + setmonofont

\documentclass[11pt]{article}
\usepackage{fontspec}
\setmonofont{Source Code Pro}
\begin{document}
\begin{verbatim}
β 
â 
β̂
\end{verbatim}
\end{document}

Any insight as to why this occurs would be greatly appreciated.

Image for illustration purposes: Different rendering behaviour depending on which is loaded

Per request: List files from testme1.tex (on the left)

article.cls 2014/09/29 v1.4h Standard LaTeX document class
size11.clo 2014/09/29 v1.4h Standard LaTeX file (size option)
fontspec.sty 2016/02/01 v2.5a Font selection for XeLaTeX and LuaLaTeX
expl3.sty 2016/05/18 v6512 L3 programming layer (loader)
expl3-code.tex 2016/05/18 v6512 L3 programming layer
l3xdvipdfmx.def
xparse.sty 2016/05/18 v6512 L3 Experimental document command parser
fontspec-xetex.sty 2016/02/01 v2.5a Font selection for XeLaTeX and LuaLaTeX
fontenc.sty
eu1enc.def 2010/05/27 v0.1h Experimental Unicode font encodings
eu1lmr.fd 2009/10/30 v1.6 Font defs for Latin Modern
xunicode.sty 2011/09/09 v0.981 provides access to latin accents and many other characters in Unicode lower plane
eu1lmss.fd 2009/10/30 v1.6 Font defs for Latin Modern
graphicx.sty 2014/10/28 v1.0g Enhanced LaTeX Graphics (DPC,SPQR)
keyval.sty 2014/10/28 v1.15 key=value parser (DPC)
graphics.sty 2016/05/09 v1.0r Standard LaTeX Graphics (DPC,SPQR)
trig.sty 2016/01/03 v1.10 sin cos tan (DPC)
graphics.cfg 2016/01/02 v1.10 sample graphics configuration
xetex.def 2016/04/06 v4.08 LaTeX color/graphics driver for XeTeX (TeX Live/RRM/JK)
infwarerr.sty 2016/05/16 v1.4 Providing info/warning/error messages (HO)
ltxcmds.sty 2016/05/16 v1.23 LaTeX kernel commands for general use (HO)
fontspec.cfg
t3cmr.fd 2001/12/31 TIPA font definitions

List files from testme2.tex (on the right)

article.cls 2014/09/29 v1.4h Standard LaTeX document class
size11.clo 2014/09/29 v1.4h Standard LaTeX file (size option)
sourcecodepro.sty 2015/10/09 v2.6 Adobe's Source Code Pro typeface
ifxetex.sty 2010/09/12 v0.6 Provides ifxetex conditional
ifluatex.sty 2016/05/16 v1.4 Provides the ifluatex switch (HO)
xkeyval.sty 2014/12/03 v2.7a package option processing (HA)
xkeyval.tex 2014/12/03 v2.7a key=value parser (HA)
fontspec.sty 2016/02/01 v2.5a Font selection for XeLaTeX and LuaLaTeX
expl3.sty 2016/05/18 v6512 L3 programming layer (loader)
expl3-code.tex 2016/05/18 v6512 L3 programming layer
l3xdvipdfmx.def
xparse.sty 2016/05/18 v6512 L3 Experimental document command parser
fontspec-xetex.sty 2016/02/01 v2.5a Font selection for XeLaTeX and LuaLaTeX
fontenc.sty
eu1enc.def 2010/05/27 v0.1h Experimental Unicode font encodings
eu1lmr.fd 2009/10/30 v1.6 Font defs for Latin Modern
xunicode.sty 2011/09/09 v0.981 provides access to latin accents and many other characters in Unicode lower plane
eu1lmss.fd 2009/10/30 v1.6 Font defs for Latin Modern
graphicx.sty 2014/10/28 v1.0g Enhanced LaTeX Graphics (DPC,SPQR)
graphics.sty 2016/05/09 v1.0r Standard LaTeX Graphics (DPC,SPQR)
trig.sty 2016/01/03 v1.10 sin cos tan (DPC)
graphics.cfg 2016/01/02 v1.10 sample graphics configuration
xetex.def 2016/04/06 v4.08 LaTeX color/graphics driver for XeTeX (TeX Live/RRM/JK)
infwarerr.sty 2016/05/16 v1.4 Providing info/warning/error messages (HO)
ltxcmds.sty 2016/05/16 v1.23 LaTeX kernel commands for general use (HO)
fontspec.cfg
t3cmr.fd 2001/12/31 TIPA font definitions

mpacer
  • You're using fontspec either way. Just you get a tailored configuration with the package. – cfr Feb 22 '17 at 04:03
  • I get exactly the same result with either code: β is OK, but there's no accented glyph and composition doesn't work. (But I'm also not sure how to type this correctly, so I don't know whether it is the input that's wrong.) – cfr Feb 22 '17 at 04:09
  • What is the tailored configuration that is included with the package? What OS are you using (I'm on OS X). I can link to images of the rendered pdf, but I assure you it behaves differently. – mpacer Feb 22 '17 at 04:11
  • I am not claiming it does not behave differently. I'm just saying that the differences you've posted don't explain it, as the two cases produce the same result here. We could have different versions of something. You might have a config file or something in your personal TEXMF tree. (I am compiling in effect without my personal tree, so I know I don't have one.) Versions are most likely the cause of the difference. Either the font or something else. – cfr Feb 22 '17 at 04:16
  • How do I compile without my personal tree in order to test this? – mpacer Feb 22 '17 at 04:17
  • I just use TEXMFHOME=/d xelatex <filename>. Just so long as /d does not exist on your system. So if you actually have a directory /d, pick something else. – cfr Feb 22 '17 at 04:22
  • But I'm betting on version differences. Try putting \listfiles before \documentclass and post the results for the left hand side of the image you showed. (The right side is the one I get either way.) – cfr Feb 22 '17 at 04:23
  • Ok, I just used TEXMFHOME=/d xelatex testme1.tex and TEXMFHOME=/d xelatex testme2.tex (nb: I have no /d directory) and got the same behaviour as before. I included \listfiles before \documentclass, and it didn't change anything in the output pdf. Should I be looking in the log file? – mpacer Feb 22 '17 at 04:31
  • I don't see any obvious difference, but you should update your packages anyway: your logs show the older EU1 encoding setup instead of the TU encoding that has been the default since the 2017/01/01 LaTeX release. – David Carlisle Feb 22 '17 at 07:50
  • You probably have two versions of the font. Add \XeTeXtracingfonts=1 and then check in the log file which fonts are actually used. Regarding the displaced accent: be aware that verbatim shows a "verbatim" output, so this look is expected. – Ulrike Fischer Feb 22 '17 at 08:18
  • If I try both versions, I get the same output; by the way, the combining circumflex after beta gets misplaced. – egreg Feb 22 '17 at 09:33
  • @UlrikeFischer That doesn't entirely make sense. It's supposed to be a combining character, and it does combine correctly with the a. Any ideas? Honestly this is such an improvement that I'm happy to just see this much. Should I open a new question about that though? This seems to be out of the question scope. – mpacer Feb 23 '17 at 01:29

1 Answer


A comment by @UlrikeFischer helped identify the solution.

The issue arises if you have a separately installed Source Code Pro (from Adobe Type Manager) as well as the Source Code Pro that you also have as part of the TeX distribution.

If you declare the font with \setmonofont{Source Code Pro}, fontspec finds the copy installed through Adobe Type Manager, which appears to have poorer Unicode support than the version included with TeX.

If you want to explicitly specify the TeX version of Source Code Pro using fontspec, you will need to instead use the following declaration:

\usepackage{fontspec}
\setmonofont{SourceCodePro-Regular.otf}

If you don't include the .otf it will attempt to load the ttf which may produce other problems.
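For reference, a complete test document using the file-name form might look like the sketch below. It assumes your TeX distribution ships the font under the file name SourceCodePro-Regular.otf (as mine did) and that you compile with xelatex; the test characters are the same ones as in the question.

```latex
\documentclass[11pt]{article}
\usepackage{fontspec}
% Loading by file name makes fontspec look the file up on the TeX
% search path instead of matching a system font by its family name.
\setmonofont{SourceCodePro-Regular.otf}
\begin{document}
\begin{verbatim}
β
â
β̂
\end{verbatim}
\end{document}
```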

However, that won't give you the same behaviour as just using \usepackage{sourcecodepro}.

The problem is that with that command, you only get the upright version of the font, meaning italics, bold and bold italics will not work.

If you want instead to replicate that font family coverage you would need to use:

\usepackage{fontspec}
\setmonofont[Extension=.otf,UprightFont =*-Regular,ItalicFont =*-RegularIt,
BoldFont=*-Bold,BoldItalicFont=*-BoldIt]{SourceCodePro}

where the * is replaced by the font name given in the mandatory argument (here SourceCodePro), and the file extension is appended according to the Extension option.
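A quick way to check that all four faces are picked up is a small document like the following sketch. It assumes the four files SourceCodePro-Regular.otf, SourceCodePro-RegularIt.otf, SourceCodePro-Bold.otf and SourceCodePro-BoldIt.otf are all present in your TeX tree; if one is missing, fontspec should complain in the log.

```latex
\documentclass[11pt]{article}
\usepackage{fontspec}
\setmonofont[
  Extension      = .otf,
  UprightFont    = *-Regular,
  ItalicFont     = *-RegularIt,
  BoldFont       = *-Bold,
  BoldItalicFont = *-BoldIt
]{SourceCodePro}
\begin{document}
% One sample per face of the monospaced family
\texttt{upright}

\texttt{\textit{italic}}

\texttt{\textbf{bold}}

\texttt{\textbf{\textit{bold italic}}}
\end{document}
```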

If you wanted to fully replicate the behaviour of sourcecodepro (as included by default) you would need to specify a few more options:

\usepackage{fontspec}
\setmonofont[Ligatures = TeX, Numbers =, Scale = 1, Extension = .otf,
WordSpace = {1,0,0}, PunctuationSpace = WordSpace,
UprightFont = *-Regular, ItalicFont = *-RegularIt,
BoldFont = *-Bold, BoldItalicFont = *-BoldIt]{SourceCodePro}

which will then fully recreate the behaviour of \usepackage{sourcecodepro}.

If you are running into a similar problem for a different font, use the \XeTeXtracingfonts=1 and \listfiles commands in your document, and look at your *.log file associated with compiling your document.
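As a sketch of that debugging setup (the font name here is just the one from this question; substitute your own):

```latex
\listfiles               % before \documentclass, so the class is listed too
\documentclass[11pt]{article}
\usepackage{fontspec}
% XeTeX primitive: record in the log which font file each request resolves to.
% Set it before the font is loaded.
\XeTeXtracingfonts=1
\setmonofont{Source Code Pro}
\begin{document}
\texttt{β}
\end{document}
```

After compiling, search the .log file for the font name. If the path shown points outside your TeX tree, a system-installed copy is being used.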

A related debugging trick: if you want to see which font is actually in use at any point in your document, you can use this macro from "test current font":

\makeatletter
\newcommand{\showfont}{encoding: \f@encoding{},
  family: \f@family{},
  series: \f@series{},
  shape: \f@shape{},
  size: \f@size{}
}
\makeatother

and then wherever you want to see the current font, you include the command \showfont. It will display the font information in the document at that point.
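For example, a minimal document along these lines (a sketch; the family names printed are fontspec's internal ones and will differ between setups) shows the font details once in the main font and once in the monospaced font:

```latex
\documentclass[11pt]{article}
\usepackage{fontspec}
\setmonofont{Source Code Pro}
\makeatletter
\newcommand{\showfont}{encoding: \f@encoding{},
  family: \f@family{},
  series: \f@series{},
  shape: \f@shape{},
  size: \f@size{}
}
\makeatother
\begin{document}
% Reports the default (roman) font
\showfont

% Reports the monospaced font selected by \setmonofont
{\ttfamily \showfont}
\end{document}
```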

NB: this doesn't address the fact that the combining character is not correctly combining with β but it is correctly combining with a. But, that is for a different question.

mpacer
  • â is probably a single glyph in the font. It doesn't need to be composed. – cfr Feb 23 '17 at 02:08
  • @cfr, but it was created as a single character plus the combining character, does xetex do automatic normalization (NFC or NFKC)? – mpacer Feb 23 '17 at 19:40
  • I have no idea what that means, but I'm assuming it is one unicode character in the input stream. Are sure that it is fed to TeX as distinct characters? Normally, accented characters (even with TeX or pdfTeX) will use pre-composed glyphs, when available. With unicode fonts, this typically will mean almost all of them, if the font supports the relevant script. – cfr Feb 23 '17 at 22:21
  • It is not one unicode character in the input stream. I created the "character" using the a "character" followed by the ̂ combining circumflex character (\u0302), not the standalone circumflex accent character ^ (\u005e) or the modifier letter circumflex accent (ˆ) (\u02c6). Thus, in order for it to treat â as a single glyph (and not as a sequence of two characters as expressed in the input stream), it would need to "normalise" the combined unicode character sequence into the single character that it maps onto. NFC and NFKC are two schemes for accomplishing this normalisation. – mpacer Feb 23 '17 at 23:32
  • Not necessarily. It depends how your OS handles that input and whether your editor affects it. It doesn't follow that XeTeX does anything, even if the anything is done. – cfr Feb 24 '17 at 00:41
  • Hi, I just checked (by copying the characters directly from the output pdf): there's no normalization, it's treating the â as a combined character made of a and ̂ (not as a precomposed â). Thus, it does not explain why it would combine in one case but not the other. Additionally, if you copy and paste the resulting β glyph along with the circumflex that "follows" it, it will combine appropriately if you place it in an environment in which that is possible (in this case a live jupyter notebook). Should I open this as a new question? – mpacer Feb 27 '17 at 19:01
  • @mpacer Thanks, your answer helped me sort out an issue related to the difference in the way double quotes are displayed (straight or curly) when using the different versions of the Source Code Pro font. – Steve Mar 04 '19 at 10:23
  • Using Noto Serif as the roman font, Noto Sans as the sans font, and Noto Sans Mono as the teletype font (\setmainfont{Noto Serif}%Source Code Pro}\setsansfont{Noto Sans}\setmonofont{Noto Sans Mono}), everything displays perfectly (βâβ̂ \texttt{βâβ̂} \sffamily βâβ̂). The issue is with the font. – Cicada Jan 22 '20 at 12:31
  • Looking at sourcecodepro-regular.ttf in FontForge shows that the combining diacritics are defined for everything except sharp s. Even Ӑ. (in lookup table 20, even various Greek and Cyrillic letters). – Cicada Jan 22 '20 at 12:54
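
On the normalization question raised in the comments above: as far as I know XeTeX does not normalize its input by default, but it has a primitive, \XeTeXinputnormalization, that may be worth testing. The sketch below assumes the value 1 selects NFC (0 being off and 2 being NFD) and that the test characters are typed as a base letter followed by U+0302. It is meant as a diagnostic, not as a fix for the misplaced accent on β.

```latex
\documentclass[11pt]{article}
\usepackage{fontspec}
\setmonofont{SourceCodePro-Regular.otf}
% Ask XeTeX to convert input to NFC as it is read
\XeTeXinputnormalization=1
\begin{document}
\begin{verbatim}
â
β̂
\end{verbatim}
\end{document}
```

Under NFC, a followed by U+0302 has a precomposed form (U+00E2), whereas β followed by U+0302 has none and therefore stays as two code points, so only the second line still depends on the font positioning the combining mark.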