10

Example:

\documentclass{article}

\begin{document}

\section{Power series}\label{ps}

\begin{equation} \label{eq} \sum_{i=0}^\infty a_i x^i. \end{equation}

The equation \ref{eq} is a typical power series.

\section{Conclusion} Look at Section~\ref{ps}.

\end{document}

To resolve the references, latex must be run twice, and an auxiliary file is created. By contrast, consider the following analogous (less verbose) OpTeX example:

\sec[ps] Power series

$$ \sum_{i=0}^\infty a_i x^i. \eqmark[eq] $$

The equation \ref[eq] is a typical power series.

\sec Conclusion Look at Section~\ref[ps].

\bye

We need to run optex only once to get the references. Moreover, no auxiliary (.ref) file is created, because in this example the references can already be resolved in one pass.

So I don't understand why LaTeX always forces a two-pass approach even when a single pass would work just as well and would handle the most common kinds of references (future references are rarer).

User
  • 4
    You provide the answer to your own question in the final sentence: forward (or future) references. What is the basis for your claim "future references are rarer"? What would you do about citation call-outs, which are nothing but forward references to items in the bibliography, which (in the vast majority of real-world documents) occurs toward the end of the document? – Mico Mar 20 '21 at 07:20
  • @Mico Yes, you are right about citations, for instance. What I mean is: why does LaTeX also force single-pass jobs into two-pass jobs? If I have no future references, why do I need to process the document twice? Why, for instance, should this example document be processed twice when the references could be resolved with just one pass? – User Mar 20 '21 at 07:41
  • 1
    Your hypothetical, "if I have no future references", pretty much provides the best answer to your question of why there's no variant of TeX or LaTeX that is designed not to work with forward references. When given a choice between two imperfect systems -- say, the current system, which requires two passes but gets by with a single executable; and a system that you might prefer but which requires users to actively switch to a different executable if and when their document takes on a more real-world flavor -- most users will gladly choose the one that requires less active work of them. – Mico Mar 20 '21 at 08:00
  • The label also has to store the page number, which is available only at shipout. If you want a simpler system, implement it: it should be easy. You only need to store \@currentlabel in some command. – Ulrike Fischer Mar 20 '21 at 08:03
  • @Mico "requires users to actively switch to a different executable if and when their document takes on a more real-world flavor". That is not true: optex also handles future references, when they are present, with a two-pass job. – User Mar 20 '21 at 08:08
  • @UlrikeFischer Yes, a simpler system already exists, as in the example shown. I was just wondering why LaTeX doesn't take the simple approach, when the complicated one is not necessary. – User Mar 20 '21 at 08:10
  • The sample code you posted created the impression, maybe unintentionally, that you were interested in a discussion about LaTeX-based systems. AFAICT, OpTeX is for plainTeX, not LaTeX, users. Unless and until LaTeX runs on OpTeX, I think it's not helpful to point to the existence of OpTeX as a supposed counterexample to my earlier claims. – Mico Mar 20 '21 at 08:18
  • @Mico Sorry then, probably I misunderstood your comment. Thanks for the answers. – User Mar 20 '21 at 08:21
  • 2
    latex has existed for 30 years and I have lots of trust in the creativity of its package authors. So either someone already wrote an extension that does what you want or the need is not there. Check https://ctan.org/topic/label-ref. While it is certainly possible to implement this, I personally never had a document where I thought it would be nice to have this. With the exception of short tests, all my documents are compiled more than once anyway, so it doesn't matter. – Ulrike Fischer Mar 20 '21 at 09:27
  • @User - You "probably misunderstood" my comment? I'd say that you understood it just fine but, for whatever reason, chose to misrepresent it. Your query has "LaTeX" in the title, provides LaTeX sample code, and ends with the claim that "I don't understand why LaTeX forces a two pass approach when a single pass can work as well" [emphasis added]. Did I maybe misunderstand the gist of your claims? – Mico Mar 20 '21 at 12:21
  • @Mico Yes, sorry, maybe my question was a little bit confusing. Regarding the optex example, of course I am aware that forward references need two passes. The example was meant to show that when the references can be resolved in a single pass, then optex does it, while latex uses the second pass in any case. – User Mar 20 '21 at 12:24
  • 2
    Note that when the bibliography list comes after all \cite commands (which is very common), you need only two passes: optex, optex, without calling an external program. A LaTeX user needs four calls: latex, biber, latex, latex. Or, when creating an index, you need optex, optex (without calling any external program), but a LaTeX user needs latex, makeindex, latex. :) – wipet Mar 20 '21 at 13:06
  • 1
    Actually, sometimes more than two passes are needed. It is even possible to write documents that never converge. – Hagen von Eitzen Mar 21 '21 at 11:46
  • For some examples of the comment by @HagenvonEitzen, see Document requiring infinitely many compiler passes?. – David Hammen Mar 21 '21 at 16:24

2 Answers

26

"Why" questions are not easy to answer, especially as the original system was designed for machines with a tiny fraction of the memory that current machines have, but..

It is true, as you point out, that it would be possible in some cases for backward counter references to be resolved in one pass, but this would not be possible for page references and not possible for forward references.

Firstly, it is worth noting that this makes absolutely no difference in practice, so it's just a theoretical question:

  • If you are writing the file: very few people write the entire document in one session, run latex once, and use the PDF, so resolving cross references does not require any extra runs of latex. They are resolved by the time they need to be resolved.

  • If you are processing a complete document that you have been given as source then usually it contains a table of contents or list of figures or references to a bibliography, so would take multiple passes to resolve all references even if \label was changed to allow backward \ref in a single pass.

Technically \label causes an ordered pair (or quadruple with hyperref) of data for the current counter and page reference to be saved. By saving them to the aux file and reading them back at the start, both values are either known or not and usable for forward or backward references. This simplifies the implementation and especially it simplifies documenting the behaviour.
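As a rough illustration of the paragraph above, this is approximately what ends up in the .aux file for the LaTeX example in the question. It assumes a classic kernel release without hyperref; newer kernels and hyperref write additional fields per label, so the exact form may differ:

```latex
% Sketch of the relevant .aux lines, assuming a classic LaTeX kernel without hyperref.
% Format: \newlabel{<key>}{{<counter value>}{<page number>}}
\newlabel{ps}{{1}{1}}
\newlabel{eq}{{1}{1}}
```

Because the page number is known only when the page is shipped out, the pair is written at that point and read back at the start of the next run, rather than being kept in memory.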

Note that optex was designed in an era when machines had much more available memory, and by a different person, so it is not that surprising that it takes different design decisions in some cases; neither is right or wrong.

The fact that latex is documented as a multipass system with data stored to the filesystem between passes rather than in memory is just an optimisation to give the user the possibility of just using a single pass after each edit. You could just as easily advertise it as a system that resolves all cross references (as latexmk or context do for example) by doing multiple runs as an "internal implementation detail" and resolving all cross references before returning a PDF to the user.

user202729
David Carlisle
13

You start from a wrong point of view.

With OpTeX you do get a .ref file written, in case one run is insufficient to solve cross references, which can happen in two cases:

  1. a future reference, or
  2. a label has been set in the vicinity of a page break, and the page number at the time the label is seen is not the same as it is at output time.

The first case is clear. For the second case, consider

Some text to artificially push the section too near the page break.

\vskip\dimexpr\vsize-4\baselineskip\relax

\sec[ps] Power series

$$ \sum_{i=0}^\infty a_i x^i. \eqmark[eq] $$

The equation \ref[eq] is a typical power series.

\sec Conclusion Look at Section~\ref[ps] at page~\pgref[ps].

\bye

and you'll see that a .ref file will be created. The log will contain

WARNING l.18: Try to rerun to get references right.

Due to the asynchronous mechanism of page breaking, there is no way to resolve cross references, even past ones, in one pass. As you see, even past references may need two passes.
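The same effect can be sketched on the LaTeX side. This is only an illustration under assumptions: \pagenow is a hypothetical helper name, and whether the two numbers actually differ depends on the class and the page layout.

```latex
\documentclass{article}
\begin{document}
Some text.

\vspace{\dimexpr\textheight-4\baselineskip\relax}

\section{Power series}\label{ps}
% \pagenow is a hypothetical helper: it freezes the page counter as it
% stands when this line is read, possibly before the page is shipped out.
\edef\pagenow{\thepage}

Page counter when the label was read: \pagenow.
Page recorded at shipout: \pageref{ps} (needs a second run).
\end{document}
```

If the heading is moved to the next page, the value frozen by \edef still shows the old page, while \pageref shows the page recorded when the label was written at shipout.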

If you just want back references to numbers and not to pages, then it should work even with hyperlinks, but it seems quite a big limitation.
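A minimal sketch of such a number-only, single-pass scheme in LaTeX, along the lines of Ulrike Fischer's comment under the question about storing the current label in a command. The names \immlabel and \immref are hypothetical and not part of LaTeX or any package; page references and forward references are deliberately not supported.

```latex
\documentclass{article}
\makeatletter
% \immlabel{<key>} (hypothetical name): store the current label text
% globally in memory, so it can be used later in the same run.
\newcommand\immlabel[1]{%
  \expandafter\protected@xdef\csname imm@#1\endcsname{\@currentlabel}}
% \immref{<key>} (hypothetical name): print the stored text,
% or ?? if the key has not been seen yet (a forward reference).
\newcommand\immref[1]{%
  \@ifundefined{imm@#1}{\textbf{??}}{\csname imm@#1\endcsname}}
\makeatother
\begin{document}
\section{Power series}\immlabel{ps}

\begin{equation} \immlabel{eq} \sum_{i=0}^\infty a_i x^i. \end{equation}

The equation \immref{eq} is a typical power series.

\section{Conclusion} Look at Section~\immref{ps}.
\end{document}
```

Nothing is written to the .aux file here, so a reference that appears before its label prints ?? no matter how many times the document is run.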

egreg
  • 1
    I really don't understand your example. The eq and ps labels are resolved in one pass, whereas the xx label is never resolved, because it is nowhere defined. The .ref file is created just because optex reasonably thinks that xx will appear later in the document. So what? What were you trying to show? – User Mar 20 '21 at 10:56
  • @User Sorry, used wrong version of the example. – egreg Mar 20 '21 at 11:09
  • Ok, thank you for the example. In any case, even without the \vskip\dimexpr\vsize-4\baselineskip\relax, optex writes to the .ref file, because it is aware of this eventuality. So, the pgref references are always handled by default with a two-pass approach. – User Mar 20 '21 at 11:44
  • @User Also per-page footnotes would need two passes. I see no real problem in running twice over the document: TeX syntax errors are almost unavoidable, not to mention typos. – egreg Mar 20 '21 at 11:47
  • Ok, thank you for your point of view :) – User Mar 20 '21 at 11:49
  • I noticed that with the \pgref command, in optex three passes are necessary. If instead \openref is added to the beginning of the document (this basically says to use the .ref file in any case), then two passes are needed. – User Mar 20 '21 at 11:58
  • 2
    @egreg's example does not seem surprising. When \pgref is needed, two passes are needed because TeX processes the typeset material asynchronously. This is one of the principles of TeX itself. – wipet Mar 20 '21 at 14:32