119

DEK gained a reputation for covering his students' draft dissertations in red ink, taking particular note of places where they had omitted a non-breaking space that should have been included.

I have this list of places where a non-breaking space is required:

  1. before \cite
  2. before \ref
  3. before an inline equation

But I am not sure I got the rules right: is a non-breaking space mandatory before every inline equation? What about numbers that occur in the text? And what if I refer to a program variable, or to program text? For example,

The 371 programmers who read, on 11 different occasions, the
Java program in Figure~\ref{Program:Example} noticed that it is peculiar since
parameter \texttt{i} is never read by functions \texttt{f()}
and \texttt{thisLongFunctionName()}...

Do I have to write The 371 programmers or The~371 programmers? And on 11 different occasions or on~11 different occasions?

Do I need to write parameter~\texttt{i}? I think I should. What about functions~\texttt{f()}? And should I write and~\texttt{thisLongFunctionName()}? How about citations that use the author-year convention?

In short, I think I have an idea, but no exact definition, of when to add a non-breaking space.

lockstep
  • 250,273
Yossi Gil
  • 15,951

7 Answers

99

To quote Knuth, ties should appear:

  • In references to named parts of a document:
    Chapter~12   Theorem~1.2
    Appendix~A   Table~\hbox{B-8}
    Figure~3   Lemmas 5 and~6.
  • Between a person's forenames and between multiple surnames:
    Donald~E. Knuth   Luis~I. Trabb~Pardo
    Bartel~Leendert van~der~Waerden   Charles~XII
    but be careful of names like Charles Louis Xavier~Joseph de~la Vall\'ee~Poussin.
  • Between math symbols in apposition with nouns:
    dimension~$d$ width~$w$ function~$f(x)$
    string~$s$ of length~$l$
    but compare with
    string~$s$ of length $l$~or more.
  • Between symbols in series:
    1,~2, or~3
    $a$,~$b$, and~$c$
    1,~2, \dots,~$n$.
  • When a symbol is a tightly bound object of a preposition:
    of~$x$
    from 0 to~1
    increase $z$ by~1
    in common with~$m$.
    but compare
    of $u$~and~$v$.
  • When mathematical phrases are rendered in words:
    equals~$n$ less than~$\epsilon$ (given~$X$)
    mod~2 modulo~$p^e$ for all large~$n$
    Compare is~15 with is 15~times the height.
  • When cases are being enumerated within a paragraph:
    (b)~Show that $f(x)$ is (1)~continuous; (2)~bounded.
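
As a rough illustration of how a few of these rules look in a source file, here is a minimal sketch. The sentence and the 5cm column width are made up for the demonstration and are not Knuth's. The same sentence is set twice in a narrow box, once with ordinary spaces and once with ties, so the permitted break points can be compared.

```latex
\documentclass{article}
\begin{document}
% Ordinary spaces: TeX may break before $s$, $l$, 1.2 or 12.
\parbox{5cm}{Let string $s$ have length $l$, as in Theorem 1.2
  of Chapter 12.}

\bigskip

% Ties: each symbol or number stays on the same line as its noun.
\parbox{5cm}{Let string~$s$ have length~$l$, as in Theorem~1.2
  of Chapter~12.}
\end{document}
```

Whether the two boxes actually end up with different line breaks depends on the font and the width, so try a few values of the width.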
geras
  • 213
TH.
  • 62,639
  • 1
    This is almost a duplicate of my answer here! – Lev Bishop Apr 11 '11 at 20:43
  • 2
    Indeed, and if I may, I would like to join a comment made there: "I must be really dense, I can't seem to figure out any "rules" from these examples; these look quite arbitrary to me. Perhaps I should bite the bullet and read the chapter. Thanks for the pointer. – gphilip Aug 16 '10 at 4:23" – Yossi Gil Apr 12 '11 at 04:18
  • 4
    • between a number and its unit: 4.5~m or 12~min – Florian Oct 03 '11 at 10:35
  • 5
    @Florian: Probably best to use something like siunitx for that. It'll get the formatting right. ~ gives a space that's probably too large. – TH. Oct 05 '11 at 00:34
  • @TH I share the same view. Until I started using siunitx, I used \, before a unit. IMHO a protected space is too wide for that. – henry Apr 07 '14 at 07:44
  • 1
    Found it! Knuth and Plass, Breaking Paragraphs into Lines, Software - Practice and Experience, 1981, pp. 1135--1137. Or in Knuth's Digital Typography (however, I don't have access to the latter, so I didn't check). – adn Apr 29 '15 at 13:01
43

In general, wherever a line break would leave an orphan that distracts the reader.

Some less obvious examples:

I~am
I~definitely
mod~1


The matching $(AW,BX,CY,DZ)$ is unstable, because for example
$A$ prefers~$Z$ to~$W$ and at the same time $Z$ prefers~$A$ to~$D$.
But the matching $(AZ,BW,CX,DY)$ is stable;

(we say that girl~$h$ rejects the proposal)

step~A2 stops when $P$ has nobody left to propose~to,
but step~B2 keeps making redundant proposals ad~infinitum when

The details of Algorithm~B

has local probability~${1\over n}$,

The "I am", "I definitely", etc. cases are a bit controversial, but personally, like a lot of other people, I don't like "I" at the end of a line.

yannisl
  • 117,160
19

In languages like German where spaced en dashes are used for sentence insertions, it is often frowned upon to place the en dash at the start of a new line. So, a non-breaking space should be placed before the dash.

Dieses Mal~-- anders als vorher~-- wurde er überrascht.

This time---unlike before---he was caught by surprise.
lockstep
  • 250,273
  • 1
    Interesting. In French, I tend to put the ~ inside, to prevent having a -- at the end of a line (it's like opening a bracket at the end of a line really). – raphink Oct 02 '11 at 22:09
  • 4
    Could you cite a source for this? Personally I tend to the "French" rule, whenever the en-dashes are used parenthetically: Dieses Mal --~anders als vorher~-- wurde er überrascht. but I keep the dash together with the first part of the sentence, when it could be replaced by a colon or full stop: Hans kam plötzlich herein~-- dieses Mal waren alle überrascht. But I don't know where I learned this and whether it is correct. – Florian Oct 03 '11 at 10:29
  • 1
    No explicit source -- but in a lot of German books I've read, the rule of "no en dash at the start of a line" seems to have been observed. – lockstep Oct 03 '11 at 15:58
  • @lockstep I cannot confirm this rule for German either, but then I haven't checked recently. – Lover of Structure Feb 23 '13 at 13:46
  • 10
    The German Wikipedia page about en-dashes says that you should use a non-breaking space next to the part of the sentence that gets separated – which basically means the same thing (“Der Gedankenstrich ist immer durch ein Leerzeichen vom umgebenden Text zu trennen. Dabei ist in der Textverarbeitung zu beachten, den Gedankenstrich mit einem geschützten Leerzeichen an den Satzteil zu binden, den er abgrenzt.” In English: “The dash must always be separated from the surrounding text by a space. In word processing, take care to bind the dash with a non-breaking space to the part of the sentence that it sets off.”) – Rafael Bugajewski Mar 17 '14 at 14:50
9
The 371~programmers who read, on 11~different occasions, the
Java program in Figure~\ref{Program:Example} noticed that it is peculiar since
parameter~\texttt{i} is never read by functions~\texttt{f()}
and \texttt{thisLongFunctionName()}...

That's my take and I put it up for debate. In short, I'd use a "tie" (~) whenever a line break would split a unit of thought. See also "Why I should put a ~ before \ref command?"

Edit: the tie before f() is debatable and probably should be exchanged for a tie before thisLongFunctionName(), which might cause hyphenation problems.

2

I ask myself "are these two things intimately connected?" when using non-breaking spaces. Most of the examples in the other posts in this thread have a definite "yes" as the answer, like (number)~(things) or Equation~\eqref{eqn:NewtonSecondLaw} or (number)~(unit) (side note: use siunitx package for quantities with units) or acceleration vector~$\bm{a}$.
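
To make the (number)~(unit) case concrete, here is a small sketch comparing the manual approaches mentioned in this thread with siunitx. It assumes a siunitx release that still provides \SI (newer releases prefer \qty), and the sentences are invented for the example.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Manual approaches mentioned in this thread:
The beam is 4.5~m long.   % tie: full interword space, no break allowed
The beam is 4.5\,m long.  % thin space between number and unit

% siunitx: number and unit are passed as separate arguments, and the
% package decides on the spacing between them.
The beam is \SI{4.5}{\metre} long and the run took \SI{12}{\minute}.
\end{document}
```

The advantage of the siunitx form is that the spacing convention is set once in the preamble instead of being repeated at every quantity.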

Math, be it inline or display, should be punctuated as if it were not symbols but actual words. TeX's math mode already adjusts spacing according to the "part of speech" (the class) of each symbol you typeset. Not all inline math should have non-breaking spaces. For example, after a display equation it is common to write where $x$ is the independent variable and $c$ is a constant, and in neither of these cases is a non-breaking space appropriate.

I would add to the list of "things to not break" multi-word nouns and verbs. English has articles, demonstrative pronouns, and multi-word infinitive verbs, all of which I feel should be tied to their neighbors. I write Johnny kicked the~ball. and This~server has an uptime of 99\%. and To~go boldly where no man has gone before. This practice helps me to avoid dangling demonstratives (this/that/these/those should be followed by a noun) and splitting infinitive verbs. My objection to doing so is not out of grammar nazism, but as someone who writes technical literature, I am being nice to my audience; most of my readers are not native speakers and the split infinitive is not found outside of English.

It is easy to overdo it with non-breaking spaces, though, as in The~371~programmers in the original post. It all depends on how LaTeX decides to justify each paragraph (side note: the microtype package really helps). Personally, I don't like a lot of hyphens, so I would use The 371~programmers if the other way gave a hyphen in programmers. I almost need two kinds of space: a non-breaking space and a try-not-to-break-it-but-it's-OK-if-you-have-to space. Note well: messing with breaks and hyphens is my very last step in document creation.

I don't have any sources here. Bringhurst's "Elements of Typographic Style" has a few points about hyphens and breaks IIRC, and you owe it to yourself to read this bible of typography if you know enough about typography to even ask this question.

  • I would also tie uptime of~99\% and even better use siunitx: \SI{99}{\percent} to also get spacing per SI rules between number and percent sign. – lblb Apr 11 '18 at 19:50
2

I would add yet another case: Latin abbreviations. (I think this is the case, although I am not sure. Please feel free to correct me if I am wrong.)

Common examples include (1) i.e., (2) e.g., and (3) viz. For instance, it feels rather odd to have i.e. appear at the end of a line in a paragraph.

An Analog to Digital Converter i.e.~a circuit capable of sampling analog signals and producing discrete-time equivalents was designed blah blah ...

It is questionable to use contractions e.g.~do not use the word "don't" in your documents.

The study was performed through three different routes~viz. assertation of the supremum norm in the Euclidean space, and blah blah ....

Please feel free to disagree with these cases, and edit this answer so that I may also improve my understanding of the use of non-breaking spaces.

  • Shouldn't there be a comma before and after "e.g."? If so, then a line break is OK. Not sure about viz., though. – Yossi Gil Dec 20 '18 at 16:17
0

I found that with lots of German text (over 1000 pages, and German is important here because it has more long compound words than English), the habit of using ~ sooner or later leads to

  • unnecessary word hyphenations in the same paragraph as the space in question, and these word hyphenations are sometimes worse than breaking at the space (which bothers me and hinders proper search in the resulting PDF files),

  • overfull boxes protruding into the right margin (which bothers me and the publisher very much),

  • optically dense lines (which sometimes bothers me a little bit), and

  • underfull last lines of a paragraph (which doesn't bother me, but other folks wrote they don't like it).

Therefore, I started using \penalty9999\ instead of ~ whenever the space connects two parts of a single entity, e.g., see Fig.\penalty9999\ \ref{fig:one}. If the space connects two parts of a loose entity or the two parts don't form an entity at all, I see no point in a large penalty or a penalty at all: although D.\penalty1\ E.\penalty1\ Knuth would probably disapprove of it \cite{KnuthOnTypesetting}. Pick your own positive but small penalty values (i.e., not necessarily 1). Though some folks dislike numbers or [identifiers in brackets] starting a new line, the outcome I get is often visually more pleasant than with line breaks elsewhere. Subjectively, of course.

In narrow text (say, in the margin notes), I even sometimes had to introduce negative penalties to encourage line breaks between the words and discourage hyphenation: not\penalty-3\ a\penalty-3\ suggestion\penalty-3\ of\penalty-3\ D.\penalty-1\ E.\penalty-1\ Knuth\penalty-2\ \cite{KnuthOnTypesetting}. Of course, adjust your negative penalties accordingly (for me, they were usually somewhere in [-2000,-100]).
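
The penalties above can be wrapped in macros so the values are chosen in one place. The following is only a sketch: the macro names \strongtie, \weaktie and \breakhint are invented here and do not come from any package, and the value 300 is an arbitrary pick. For comparison, ~ inserts a penalty of 10000, which TeX treats as forbidding the break entirely.

```latex
\documentclass{article}
% Hypothetical helper macros (names invented for this sketch).
% Each one emits a penalty followed by an explicit interword space.
\newcommand{\strongtie}{\penalty9999\ }   % break only as a last resort
\newcommand{\weaktie}{\penalty1\ }        % very mild discouragement
\newcommand{\breakhint}[1]{\penalty-#1\ } % negative penalty: invites a break
\begin{document}
\section{Example}\label{sec:one}
See Sec.\strongtie\ref{sec:one}, written by D.\weaktie E.\weaktie Knuth.

% In a narrow margin note, encourage breaks between words
% rather than hyphenation inside them.
\marginpar{not\breakhint{300}a\breakhint{300}suggestion\breakhint{300}of
  D.\weaktie E.\weaktie Knuth}
\end{document}
```

These macros are meant for use inside a paragraph, after some text has already been typeset, as in the examples above.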

AlMa1r
  • 411