
I'm using LuaLaTeX instead of XeLaTeX for my book, mainly because it supports microtype's font expansion feature, which helps produce the word spacing and line breaking that give the most uniform text coverage while minimizing hyphenation. The problem I'm finding is that LuaLaTeX doesn't seem to understand that an em dash without surrounding spaces (which I insert directly into the .tex document as the Unicode character —) is just as good a place to end a line as whitespace is. It will do whatever it must to avoid ending a line immediately before or after an em dash, even where that would be the optimal spot. I've tried replacing the Unicode character with \textemdash, but this had no effect. XeTeX doesn't have this problem. Is there any way to explain to LuaTeX what XeTeX seems to understand? To illustrate:

\documentclass[letterpaper,12pt,onecolumn,final]{memoir}
\usepackage{luatextra}
% fontspec is loaded by luatextra in LuaLaTeX
\usepackage{xparse}
\usepackage{polyglossia}
\setdefaultlanguage{english}
\usepackage[final=true]{microtype}
\usepackage[showframe]{geometry}

\setmainfont{Linux Libertine O}

\begin{document}

So, with dash and verve, he sang: “I am the very model of a modern major general

Now, adding a little more \emph{dash} to his verve\ldots

So, with dash and verve, he sang: “I am the very model of a modern major general—

This brings the paragraph to just a hair shy of the end of the line. No font expansion has been done by 
microtype, because it fits perfectly. This would then be an optimal position for a line break, were the 
paragraph to continue onto the next line. Does it do that? Let’s see.

So, with dash and verve, he sang: “I am the very model of a modern major general—I’ve information 
vegetable animal and mineral.”

As you can see, microtype negatively expands (in other words, compresses) the first line to avoid the 
optimal break point, if only it knew!

\end{document}

2 Answers


Since posting this question, I have discovered another question that deals with aspects of this issue in a non-microtype context. The accepted answer there, by topskip (code modified by him or her based on commented suggestions by egreg) suggests the following TeX "hack":

\catcode`\—=13
\protected\def—{\unskip\nobreak\thinspace\textemdash\allowbreak\thinspace\ignorespaces}

When added to the preamble, I find that this does work, allowing line breaking following (but not preceding) the dash. It does have the side effect of "bookending" all of one's em dashes with (thin) spaces, which may not be desired. In that case, I find that adding \negthinspace after each \thinspace will negate them, resulting in a normal-looking, unspaced, breakable em dash:

\catcode`\—=13
\protected\def—{\unskip\nobreak\thinspace\negthinspace\textemdash\allowbreak\thinspace\negthinspace\ignorespaces}

It occurred to me that this should render the first pair superfluous, but I found that removing the \nobreak\thinspace\negthinspace there moved the dash backward to a point where it actually joined with the l at the end of "general". I have no idea why.

UPDATE: A simpler form of the code is developed in the comments to this answer.
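
As a sketch of that simpler form, the two variants from the comments can be collected in the preamble as follows. This assumes LuaLaTeX, where the Unicode dash is read as a single character; only one of the two definitions should be active at a time, and the choice between them is a matter of house style.

```latex
% Make the Unicode em dash an active character so it can carry a definition.
\catcode`\—=13
% Variant A: permit a line break after the dash, but not before it.
\protected\def—{\nobreak\textemdash\allowbreak}
% Variant B (alternative): permit a line break on either side of the dash.
% \protected\def—{\allowbreak\textemdash\allowbreak}
```

Variant A follows the convention that a dash should not begin a line; Variant B gives the line breaker the most freedom.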

  • The other question did not pop up in the list of suggested possible related/duplicate questions as I was typing in the title, or I would not have posted this as a separate question, but would simply have added my \negthinspace refinement there (if I had thought of it at all) as a comment, which I have also now done. – Stonefeather Grubbs Dec 15 '13 at 06:11
  • A quick try with \protected\def—{\allowbreak\textemdash\allowbreak} allows breaking before the dash (which it indeed doesn't without this code). I threw out all the commands dealing with spaces since you indicated you wanted none. – Christoph B. Dec 16 '13 at 10:55
  • @Christoph This is now exactly what I was looking for! (I'd thought the spaces had been added because LuaTeX needed them for it to work.) Of course, since many style authorities feel that a dash should not begin a line, if one wishes to adhere to this rule, the line should take the form \protected\def—{\nobreak\textemdash\allowbreak}. Note that this line must still be preceded by \catcode`\—=13 or it won't work. – Stonefeather Grubbs Dec 17 '13 at 04:03
  • You can just drop the \thinspace and \negthinspace because they counter each other. – morbusg Dec 17 '13 at 09:24

I ended up with a slight alternative to Stonefeather Grubbs's answer, using the fact that LaTeX's default --- behavior already handles this by allowing line breaks after the em dash but generally not before:

\catcode`\—=13
\protected\def—{---}
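
For context, a minimal test document using this definition might look like the sketch below. It assumes LuaLaTeX with fontspec's default TeX ligatures (Ligatures=TeX) active for the main font, so that --- is typeset as an em dash. The font is the one from the question, and the narrow \parbox width is an arbitrary choice made only to force line breaks near the dash.

```latex
\documentclass{article}
\usepackage{fontspec}
% Font taken from the question; any font with an em dash glyph should do.
\setmainfont{Linux Libertine O}
% Make the Unicode em dash active and map it to the TeX ligature input.
\catcode`\—=13
\protected\def—{---}
\begin{document}
% Narrow box (width is an assumption) so the text wraps near the dash.
\parbox{6cm}{So, with dash and verve, he sang: “I am the very model of a
modern major general—I’ve information vegetable animal and mineral.”}
\end{document}
```

Varying the \parbox width is a quick way to check whether a break is ever taken directly after the dash.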