56

Is there a good reference for LaTeX's output routine? The documented source is confused and confusing. The authors seem confused as to why parts are the way they are and wonder if maybe things should be changed:

Not sure about these: two questions. Should things which must apply to a whole document be local or global (they probably should be ‘preamble only’ commands)? Are these three such things?

My favorite quotes so far are the following gems.

This is a very much an emergency action, just dumping everything: footnotes first then floats. A more sophisticated version is needed; but even more urgent is a bug-free version (see, for example, pr/3528).

and

We empty any left over kludge insert box here; this is a temporary fix. It should perhaps be applied to one page of cleared floats, but who cares? The whole of this stuff needs completely redoing for many such reasons.

I suspect that tex.stackexchange is not the right place for explaining what the entire output routine is doing, but I'd appreciate any pointers to a clear explanation. I'm especially interested in why the float mechanism invokes the output routine (with large negative penalties), sometimes multiple times; how pages of floats are processed; what these kludge insert boxes are; and what hooks class authors can use.

lockstep
  • 250,273
TH.
  • 62,639
  • @TH Please feel free to add your input to my post below by editing it! I am sure you have a couple of sources up your sleeve and I think you came up with a great question that needs to be split in twenty parts and discussed here:) – yannisl Jan 11 '11 at 17:39
  • @Yiannis: I've really just been reading source2e.pdf. – TH. Jan 11 '11 at 22:17

1 Answer

52

The output routine is called either by TeX's normal page-breaking mechanism or by a macro that puts a penalty of -10000 or less on the main vertical list. These large negative penalties communicate with the OTR: the routine inspects the penalty that triggered it and acts accordingly. For example, a penalty of -10001 signals a \clearpage, whereas -10004 signals a float insertion, and so on.
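To make the idea of communicating through penalties concrete, here is a minimal plain TeX sketch. It is not the LaTeX kernel code: the names \signal and \signalcount are made up for illustration, and the value -10004 is borrowed from the float example above only as an arbitrary signal value. The output routine looks at \outputpenalty, and on a special call it puts the page material back instead of shipping it out.

```tex
% Plain TeX sketch (compile with tex or pdftex), not LaTeX kernel code.
\newcount\signalcount             % hypothetical counter of special calls
\def\signal{\par\penalty-10004 }  % hypothetical macro: a "message" to the OTR

\output={%
  \ifnum\outputpenalty=-10004
    % Special call: record the signal, then return the page material
    % to the main vertical list instead of shipping it out.
    \global\advance\signalcount by 1
    \unvbox255
  \else
    \plainoutput % ordinary page break: plain TeX's default routine
  \fi}

First paragraph. \signal
Second paragraph. \signal
\message{OTR saw \the\signalcount\space signal(s)}
\bye
```

Each special call ends without a \shipout, so TeX counts it as a dead cycle. That is harmless for a few signals, but it is presumably one reason the LaTeX routine is careful about how often it triggers itself.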

Information on LaTeX's output routine is very hard to find and, judging from the comments in LaTeX's source, it is hard to follow even for the LaTeX team!

The output routine is one of the more mysterious pieces of TeX. The chapter of the TeXbook discussing output routines claims that designing output routines makes one achieve the level of a TeX Grandmaster.

As is so often the case, mastery of the concept of an output routine in plain TeX will only barely prepare you for the complexities awaiting you with LaTeX's variant of an output routine. However, it is better to start by studying TeX's OTR first. Luckily, there is some help in a series of articles by that great TeX exegete David Salomon. They are all available online as TUGboat articles.

Output Routines: Examples and Techniques. Part I: Introduction and Examples.

Output Routines: Examples and Techniques. Part II: OTR Techniques

Output Routines: Examples and Techniques. Part III: Insertions

Output Routines: Examples and Techniques. Part IV: Horizontal Techniques

Read the last part first!

For LaTeX you can read Frank Mittelbach's published paper xo-pfloat.pdf, in which he explains some of the problems facing the team when dealing with the output routine. Reading it, you will appreciate that float placement is still one of the hard Computer Science problems, and feel a bit of sympathy for Microsoft trying to do it interactively for multi-page documents!

There is also an article by David Kastrup, Output Routine Requirements for Advanced Typesetting Tasks (Proceedings of EuroTEX 2003), outlining some of the difficult areas and specifications for generic routines.

This would give you a bit of background to start deciphering source2e itself. It is not all that hard, but one needs a good grounding in the standard building blocks such as insertions, lists, here points, etc.

In a nutshell, all floats are put in boxes, the boxes are tracked in lists, and the algorithm unboxes them when it places them. Some of these lists are mind-boggling, such as this one:

\gdef\@freelist{\@elt\bx@A\@elt\bx@B\@elt\bx@C\@elt\bx@D\@elt\bx@E
                 \@elt\bx@F\@elt\bx@G\@elt\bx@H\@elt\bx@I\@elt\bx@J
                 \@elt\bx@K\@elt\bx@L\@elt\bx@M\@elt\bx@N
                 \@elt\bx@O\@elt\bx@P\@elt\bx@Q\@elt\bx@R}

These boxes are sometimes not enough, and great insight can be obtained by reading the documentation of related packages such as morefloats, which simply adds more boxes to the above list using hundreds of \expandafters!

There are some useful macros in the code, one being the way of using \@elt lists. They are just equivalent to Knuth's double backslashes, that is, the \\ separator used for list macros in Appendix D of the TeXbook. \@elt is just a Lisp relic and is an abbreviation for element. Also look up @bitor, xbitor, etc.
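As a rough illustration of the \@elt technique, here is a small sketch. All \demo... names are hypothetical and this is not how the kernel manipulates \@freelist; it only shows the underlying idea. The list is a macro whose entries are each introduced by \@elt, and "mapping" over the list just means giving \@elt a temporary meaning and then executing the list.

```latex
\documentclass{article}
\makeatletter
% Hypothetical list in the same shape as \@freelist: every entry is
% introduced by \@elt, which has no fixed meaning of its own.
\def\demo@list{\@elt{apple}\@elt{pear}\@elt{plum}}

% Append an entry: expand \demo@list exactly once inside the new
% definition, so neither \@elt nor the entries are expanded further.
\def\demo@append#1{%
  \expandafter\def\expandafter\demo@list\expandafter{\demo@list\@elt{#1}}}

% Map over the list: \@elt temporarily typesets its argument in brackets.
\newcommand\demoshowlist{%
  \begingroup
    \def\@elt##1{[##1] }%
    \demo@list
  \endgroup}

% Count the entries: \@elt temporarily steps a counter and drops its argument.
\newcount\demo@count
\newcommand\democountlist{%
  \global\demo@count\z@
  \begingroup
    \def\@elt##1{\global\advance\demo@count\@ne}%
    \demo@list
  \endgroup
  \the\demo@count}

\demo@append{fig}
\makeatother
\begin{document}
\demoshowlist has \democountlist\ entries.
\end{document}
```

The same list can thus be printed, counted, or searched without any loop macro, which is why the pattern suits bookkeeping lists such as the float lists.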

Looking for hooks? Perhaps you can use \AtBeginDvi, in a similar way to how the bophook package used it to add watermarks to a page.
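For a page-level hook, one option that postdates this answer is the shipout hook mechanism of recent LaTeX kernels. The sketch below assumes a LaTeX release of 2020-10-01 or later, where \AddToHook and the shipout/background hook are available; the watermark text and its position are arbitrary choices for illustration. It is a different mechanism from the \AtBeginDvi approach mentioned above, not a reconstruction of it.

```latex
\documentclass{article}
% Assumes LaTeX 2020-10-01 or later (hook management in the kernel).
% The shipout/background hook runs inside a picture environment whose
% origin is the top-left corner of the page, with \unitlength = 1pt.
\AddToHook{shipout/background}{%
  % About one inch in from the left edge and one inch down from the top.
  \put(72,-72){\makebox(0,0)[lt]{\Large\itshape DRAFT}}%
}
\begin{document}
First page.
\newpage
Second page.
\end{document}
```

Unlike \AtBeginDvi, which only contributes material to the first page shipped out, this hook runs on every page.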

Lastly, just to touch on the kludgeins. Depending on one's interpretation, a kludge can either be an ill-assorted collection of poorly matching parts forming a distressing whole or, closer to the German klug, something witty or smart. I go for the latter! My favourite quote from the source:

The star form of this command is dedicated to Leslie Lamport, the other we need for ourselves (FMi, CAR).

Great team with a good sense of humour! I can't wait to hear from the other members here about the LaTeX3 way!

yannisl
  • 117,160
  • 7
    You can tell FMi's mood from the comments: some are very witty. I'm afraid I'm much more boring: if you read any of the LaTeX3 stuff I'm responsible for you just get the facts. The OR is on the list for this year: I'm not entirely looking forward to it! – Joseph Wright Jan 11 '11 at 19:16
  • But many others do. ;-) – lockstep Jan 11 '11 at 19:29
  • 1
    @lockstep. We're just finalising the next LaTeX3 news, which will have some 'aims' for the coming months. I'll see what makes the list. – Joseph Wright Jan 11 '11 at 19:42
  • @Yiannis: Thanks! I've spent a lot of time looking through how the float mechanism works. I can't say I understand why it needs to use two runs of the output routine to add the float to the list. The first run does almost nothing. That said, I'm looking forward to reading all of this! – TH. Jan 11 '11 at 22:15
  • @Yiannis: For me, this (ab)use of marking as inline code is a bit strange. Why not slant it instead? – Hendrik Vogt Jan 12 '11 at 13:21
  • @Hendrik You mean instead of the shaded part to use italics? – yannisl Jan 12 '11 at 15:07
  • @Yiannis: Exactly. The shaded formatting should be reserved for code related stuff. (And the italics here aren't really italics but slanted, as far as I understand.) The same applies to my other comment to another answer of yours. – Hendrik Vogt Jan 12 '11 at 15:18
  • @Hendrik Good idea, will edit a bit later on. Also please feel free to edit any of my answers - anytime! – yannisl Jan 12 '11 at 15:23
  • @Yiannis: Thanks for your permission. I just edited that other answer. – Hendrik Vogt Jan 12 '11 at 15:29
  • @Yiannis: Same here. Do you really want to emphasize "communicate"? And: I don't know why you marked "hard" (before "Computer Science problems"). Can you please tell me what you had in mind? – Hendrik Vogt Jan 12 '11 at 16:33
  • @Hendrik The word "communicate" is a TeX community term used to denote methods to pass messages to the OTR. Normally when a new term is used I would use italics (here I guess will accept slanted). The term hard was also emphasized to give it a bit of a special meaning. In normal TeX I tend to generally use italics rather than quotation marks. – yannisl Jan 12 '11 at 16:49
  • @Yiannis: Thanks. Putting things in slanted (as we don't have italics here) is good for emphasizing, and I think I do now understand that you want to emphasize hard. And I see, it's also OK to use it for quoting things. So, thanks for clarifying. – Hendrik Vogt Jan 12 '11 at 17:03
  • 1
    You missed a golden opportunity to coin the word "TeXegete"… – Seamus Feb 29 '12 at 17:49
  • @Seamus There you are, I will use it next time and you are the official godfather:) – yannisl Feb 29 '12 at 17:55
  • What do you mean by "Knuth's double slashes"? – Weißer Kater Apr 22 '23 at 12:10