5

This is a curiosity-driven question, from the perspective of a LaTeX user who has nearly no knowledge of low-level TeX typesetting primitives.

How is the LaTeX tabular environment implemented? In particular:

  • How is the width of the columns computed?
  • Are the contents of each cell stored and then typeset later?
  • What are the underlying TeX mechanisms?
  • What is the role of \\ and &, and how is the meaning of those control sequences changed inside a tabular?
  • 1
    tex/latex/base/latex.ltx? – cfr Dec 07 '17 at 02:25
  • 3
    Or perhaps more explanatorily texdoc source2e which will bring up the documented source. Also, The TeX Book, or TeX by Topic to understand the underlying TeX implementation. – Alan Munn Dec 07 '17 at 02:26
  • 1
    This is certainly far too broad for a question on this site. No reasonable answer could possibly respond satisfactorily to all the requests listed in the desiderata. (At least, if the answer to (3) is supposed to be explanatory and not just a list.) – cfr Dec 07 '17 at 04:11
  • 2
    Of course the only precise description of a program is the program itself. However, the general design ideas and the important key points can be listed. There’s a great distance between “understanding every single line of code” and “not imagining at all how it can possibly work”. No answer can be “complete”, but someone else can add details in other answers. – Nicola Gigante Dec 07 '17 at 04:29

1 Answer

9

LaTeX's tabular is built on the TeX primitive \halign. The features of \halign cannot be described fully in a short answer, so please excuse some simplification.

\halign has the following syntax (roughly speaking):

\halign{<columns declaration>\cr 
      item & item & item \cr
      item & item & item \cr}

The <columns declaration> can be written with TeX primitives such as \hfil and \kern. LaTeX's macros convert user declarations such as "lrc" into a more complicated, but more general, <columns declaration> that \halign understands. Moreover, LaTeX in effect says \let\\=\cr inside the tabular environment (the real definition is more elaborate, but it ends in \cr), so a LaTeX user typically never meets the \cr primitive, which works as the row separator in \halign. He or she writes \\ instead of \cr.
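As a rough illustration of such a declaration, here is a minimal plain TeX sketch that hand-writes a declaration behaving roughly like "lrc". It is not the declaration LaTeX actually generates, which is considerably more elaborate. The item texts and the \quad column spacing are arbitrary choices for this example, and the \let\\=\cr line mimics the simplification described above.

```tex
% Plain TeX sketch (an illustration, not LaTeX's real declaration for "lrc").
% In the declaration, "#" marks where each item is inserted and \hfil is
% stretchable glue that pushes the item to the left, right, or center.
\begingroup
\let\\=\cr  % simplified stand-in for what tabular does with \\
\halign{#\hfil\quad&\hfil#\quad&\hfil#\hfil\cr
  apple&1&red\\
  kiwi&250&green\\}
\endgroup
\bye
```

Note that & is not a control sequence at all: it is a character with category code 4 (alignment tab), which \halign handles itself, so nothing has to be redefined for it.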

The \halign primitive works in two steps. First, it saves the contents of every item in an individual box and then computes the maximal natural width of the boxes in each column (say w1, w2, ..., wn). In the second step, \halign stacks the rows in vertical mode (one below the other); each row contains the item boxes re-boxed to the computed widths w1, w2, ..., wn. Each row generated by \halign in the second step looks like:

\hbox{\hbox to w1{<left declaration1> item1 <right declaration1>}%
      \hbox to w2{<left declaration2> item2 <right declaration2>}%
       ...%
      \hbox to wn{<left declaration-n> item-n <right declaration-n>}}

Because each line uses the same dimensions w1, w2, ..., wn (pre-computed in the first step), the columns are aligned.
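The two steps can be imitated by hand for a small two-column table. This is only a plain TeX sketch of the idea, not how \halign is implemented internally; the names \wone, \wtwo, \measure and \row are made up for this example.

```tex
% Plain TeX sketch: emulate by hand the two steps described above.
% \wone and \wtwo play the role of the column widths w1 and w2.
\newdimen\wone \newdimen\wtwo

% Step 1: box an item, and enlarge the column width if the box is wider.
\def\measure#1#2{\setbox0=\hbox{#2}\ifdim\wd0>#1 #1=\wd0 \fi}
\measure\wone{apple}\measure\wone{kiwi}
\measure\wtwo{1}\measure\wtwo{250}

% Step 2: re-box every item to its column width. The \hfil glue gives
% a left-aligned first column and a right-aligned second column.
\def\row#1#2{\hbox{\hbox to\wone{#1\hfil}\quad\hbox to\wtwo{\hfil#2}}}
\row{apple}{1}
\row{kiwi}{250}
\bye
```

With \halign none of this bookkeeping is needed, since the primitive measures and re-boxes the items itself.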

So you are right: the contents of the table items are stored and re-used, but this is done by the internal algorithm of the \halign primitive. The macro programmer does not need to save boxes manually (for example with \setbox).

wipet
  • That’s exactly the level of detail I was looking for. Thank you :) – Nicola Gigante Dec 07 '17 at 16:19
  • There's a small detail here (which I think is pretty important): as with other such primitives (^, \toks0=..., etc.), the end delimiter does not need to be an explicit }; it can be an implicit one, or something that expands to one (in other words, \end{tabular} expands to something containing \egroup). So control can be passed to \halign directly without grabbing the whole thing verbatim, which allows catcode-changing commands to be used inside. See also https://tex.stackexchange.com/a/196850/250119 – user202729 May 18 '22 at 13:08