Generally prefer underfull to overfull?

Question

To me, LaTeX by default prefers overfull to underfull too strongly. What's the parameter to shift this preference toward underfull?

For my current draft, I was surprised that I needed to go to \linebreak[4] to avoid an overfull line with a protrusion into the right margin by ~5% of linewidth, and I was even more surprised that the resultant underfull line after the insertion of \linebreak[4] didn't look bad at all: You wouldn't have even noticed that it was underfull by LaTeX's standards.

The line in question starts a paragraph and includes nothing special (no math) except for a \cref and a longish "word" at the end that I don't want to break. I use the standard article class.

On the other hand, if you look at MS Word, you can see that it prefers underfull very much and produces ugly lines like

Interword     spaces      are      too   much
blah blah blah blah . . . . . . . . .    blah

Perhaps this is too much but still more acceptable than a 5% protrusion into the margin. So, I want to move LaTeX somewhat in this direction.

I still have a lot more overfull lines with much less protrusion but I don't have any underfull lines. I still want warnings. I just want to shift the preference somewhat toward underfull.

Edit: Thanks to @David Carlisle's answer, I've arrived at a "partial solution": I've just added these two lines

\tolerance 1000%
\emergencystretch 3em%

to my preamble. The problematic line is no longer overfull and has become (without \linebreak) a justified line with somewhat wider spacing than normal. The result is much better than the original overfull line, where the spacing was so narrow as to affect the legibility a bit and with the egregious protrusion into the right margin. I looked over the entire document but didn't find any egregious underfull lines.

So, my conclusion, at least for the type of document I write with the regular article class, is that LaTeX by default is a bit too intolerant of underfull lines.

The remaining problem is that I don't get underfull warnings for the lines which were judged to be overfull in the default settings. I just want to be made aware of the potential problem. Probably this is an impossible request?

The lua-typo package, which @rallg kindly pointed out, didn't change the warnings.

There is \sloppy (as opposed to \fussy). Most authors and journals try to cram as much text as possible into each page. — John Kormylo, Jan 25 '24 at 16:35
It seems like it would be helpful if you provided the content of the line in question so that we can see what you're talking about. — Teepeemm, Jan 25 '24 at 17:43
Slightly related: If you happen to compile using lualatex, then you can use package lua-typo. Among other things, you can tell it to warn you when lines are underfull by a certain amount. Does not fix anything, merely provides info (better than fixing). So then you can use David's info (below link) to allow more underfull, and have lua-typo warn you when it happens. — rallg, Jan 25 '24 at 17:46
This is not a LaTeX feature but a TeX feature. You can pay attention to the settings of these TeX primitive registers: \pretolerance, \tolerance, \emergencystretch. Of course, you have to understand how TeX calculates the badness of lines and the sum of demerits of the resulting paragraph. — wipet, Jan 25 '24 at 18:48
@Teepeemm "It seems like it would be helpful if you provided the content of the line in question so that we can see what you're talking about" . . . Of course. I usually do that for this forum, but in this particular case, it involves \cref, and replicating the problem I'm seeing would either take a lot of trial and error or require including the figure environment referred to by \cref. I just wanted to see whether my original question would be sufficient. I may modify my original posting to include code when I get more free time. — Ryo, Jan 26 '24 at 05:10
The % after 1000 and em should better be omitted. It doesn't make a difference here, but in other situations it could. — egreg, Jan 26 '24 at 09:13

score 6 · Answer 1 · answered Jan 25 '24 at 17:33


Overfull boxes are never preferred: they are always infinitely bad and are only taken when all other options are also infinitely bad, so by that stage there is no parameter that can help. What you can do, e.g. with \sloppy, is to allow white space to stretch more so that the stretched lines are not infinitely bad; they will then be chosen over overfull ones.
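
As a minimal sketch of that idea, the following sets the same paragraph twice in a deliberately narrow column, once with the default settings and once inside the standard sloppypar environment, which applies \sloppy to that paragraph only. The kantlipsum filler text and the 12em width are assumptions chosen only to provoke bad line breaks; compare the warnings in the log for the two boxes.

\documentclass{article}
\usepackage{kantlipsum}% filler text only
\begin{document}
% Default (\fussy) settings: \tolerance=200, no emergency stretch.
% \vbox with \hsize is used because \parbox would apply \sloppy itself.
\vbox{\hsize=12em
 \kant*[1]
}
\bigskip
% Same text, with \sloppy in force for this one paragraph.
\vbox{\hsize=12em
 \begin{sloppypar}
 \kant*[1]
 \end{sloppypar}
}
\end{document}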


David Carlisle


see also https://tex.stackexchange.com/questions/50830/do-i-have-to-care-about-bad-boxes/50850#50850 – David Carlisle Jan 25 '24 at 17:35
Both overfull and underfull are infinitely bad. In that case, TeX chooses overfull. That means TeX "prefers" overfull to underfull, correct? So the parameter to adjust is \sloppy because it makes underfull less bad than infinitely bad. Is this interpretation of mine correct? – Ryo Jan 26 '24 at 04:58
@Ryo yes and yes. – David Carlisle Jan 26 '24 at 09:23

score 1 · Answer 2 · answered Jan 26 '24 at 09:57

One should start from the assumption that bad paragraphs are, well, bad.

There's just one way to coerce TeX into producing a paragraph with underfull lines, namely to insert explicit penalties that make underfull boxes more attractive (according to the badness calculation TeX performs).

Otherwise TeX will prefer to produce a paragraph with overfull lines, because they're easier to debug, since the printout will show a black blob at the spot (in LaTeX you need the draft document class option).

\documentclass[twocolumn]{article}
\usepackage{kantlipsum}
\begin{document}
\tracingparagraphs=1 \tracingonline=1
\kant[1]
{\hyphenpenalty=-5000 \kant[1]}
\end{document}

Of course, a \penalty-10000 command in the paragraph will also have the effect of making underfull lines “preferable”. One usually doesn't type that directly but uses \break in plain TeX or \linebreak in LaTeX.
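
A small illustration of the forced break, with made-up sentences: \linebreak keeps the broken line justified, so a short line is stretched and reported as underfull, whereas \newline fills the rest of the line with \hfil before breaking and gives no such warning.

\documentclass{article}
\begin{document}
\noindent
Some words that stop well short of the margin\linebreak
% \linebreak inserts \penalty-10000: the line above is still justified,
% so its spaces are stretched and TeX reports an underfull \hbox.
and the paragraph continues here.

\noindent
Some words that stop well short of the margin\newline
% \newline adds \hfil before the break, so this line is set ragged
% and no underfull warning is produced for it.
and the paragraph continues here.
\end{document}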

The first doubly dangerous paragraph on page 107 of the TeXbook is particularly interesting:

If you want to avoid overfull boxes at all costs without trying to fix them manually, you might be tempted to set \tolerance=10000; this allows arbitrarily bad lines to be acceptable in tough situations. But infinite tolerance is a bad idea, because TeX doesn't distinguish between terribly bad and preposterously horrible lines. Indeed, a tolerance of 10000 encourages TeX to concentrate all the badness in one place, making one truly unsightly line instead of two moderately bad ones, because a single “write-off” produces fewest total demerits according to the rules. There's a much better way to get the desired effect: TeX has a parameter called \emergencystretch that is added to the assumed stretchability of every line when badness and demerits are computed, in cases where overfull boxes are otherwise unavoidable. If \emergencystretch is positive, TeX will make a third pass over a paragraph before choosing the line breaks, when the first passes did not find a way to satisfy the \pretolerance and \tolerance. The effect of \emergencystretch is to scale down the badnesses so that large infinities are distinguishable from smaller ones. By setting \emergencystretch high enough (based on \hsize) you can be sure that the \tolerance is never exceeded; hence overfull boxes will never occur unless the line-breaking task is truly impossible.
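
To make the scaling concrete, here is a rough worked example based on TeX's approximate badness formula, about 100 times the cube of the ratio r between the stretch a line needs and the stretch it has, capped at 10000. The figures 6pt, 18pt and 30pt are invented for illustration, and the results are rounded because TeX itself computes badness only approximately.

\[
  b \approx \min\bigl(10000,\ 100\,r^{3}\bigr),
  \qquad
  r = \frac{\mbox{stretch needed}}{\mbox{stretch available}}
\]
\[
  \mbox{no emergency stretch:}\quad
  r = \frac{18\,\mathrm{pt}}{6\,\mathrm{pt}} = 3,
  \qquad b \approx 100 \cdot 27 = 2700
\]
\[
  \mbox{emergency stretch of 30pt:}\quad
  r = \frac{18\,\mathrm{pt}}{6\,\mathrm{pt} + 30\,\mathrm{pt}} = 0.5,
  \qquad b \approx 100 \cdot 0.125 \approx 12.5
\]

With \tolerance=200, the first value rules the break out, while the second makes it feasible on the third pass. The extra stretch is only assumed while line breaks are being chosen; the glue actually set in the line is unchanged, which is why such a line can still be reported as underfull with its real badness.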

This is what \sloppy does:

% latex.ltx, line 15143:
\DeclareRobustCommand\sloppy{%
  \tolerance 9999%
  \emergencystretch 3em%
  \hfuzz .5\p@
  \vfuzz\hfuzz}

Note: 9999% and 3em% should not be imitated in a document; they're here for historical reasons. It is better not to add % in those cases.

If you have a particularly tough paragraph, you have several options available, in order of preference:

reword;
add a feasible hyphenation point that's missed by the algorithm;
remove a tie;
try to guess a possible \linebreak;
add {\sloppy\par} at the end of the paragraph.

With the last option you might be warned about underfull boxes.

Example:

\documentclass[draft]{article}
\usepackage{kantlipsum}
\begin{document}
\vbox{\hsize=12em
 \kant*[1]
}
\end{document}

Note: \vbox{\hsize=12em ...} is used instead of \parbox{12em}{...} because \parbox applies \sloppy.

This will show six overfull lines. With

\vbox{\hsize=12em
 \kant*[1]{\sloppy\par}
}

you get five underfull lines, with largest badness 1917. So when you have decided that such a solution is the best one (the least bad, actually) you can set \hbadness=1917 so as not to pollute your log file with unnecessary information.

\kant*[1]{\hbadness=1917 \sloppy\par}

Of course this should be done in the VERY FINAL revision. Remember to clearly mark these paragraphs in the typescript. Defining a macro such as

\newcommand{\SOLVEPAR}[1]{\unskip{#1\relax\sloppy\par}}

can help, because it can be easily found, but also made into a no-op. So the offending paragraph might be

\kant*[1]\SOLVEPAR{\hbadness=1917}
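
The same register also works in the other direction. As a sketch, assuming LaTeX's default \hbadness of 1000 and an arbitrarily chosen threshold of 200, lowering \hbadness makes TeX report loose lines that it would otherwise accept silently. The \tolerance and \emergencystretch values are taken from the question; the narrow column and filler text are again only there to provoke warnings.

\documentclass{article}
\usepackage{kantlipsum}% filler text only
\tolerance=1000
\emergencystretch=3em
\hbadness=200 % report every line whose badness exceeds 200 (default: 1000)
\begin{document}
\vbox{\hsize=12em
 \kant*[1]
}
\end{document}

Stretched lines above the threshold then show up in the log as "Underfull \hbox (badness ...)" messages, so they can be inspected individually without tightening the line breaking again.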

"One should start from the assumption that bad paragraphs are, well, bad." . . . Why doesn't TeX assign different badness scores to different situations? Your "bad" seems to mean "infinitely bad". Reading your and David's answers, I still don't understand one thing: Why make them "infinitely bad"? If you assign finite badness scores to an overfull and to an underfull, you can numerically choose one over the other. — Ryo, Jan 27 '24 at 04:04
"Otherwise TeX will prefer to produce a paragraph with overfull lines, because they're easier to debug," . . . But the overfull and underfull warnings are there to help the user to spot either problem. — Ryo, Jan 27 '24 at 04:16
