Question about the special rule mentioned after TeXbook exercise 20.5

Question

I don't quite understand the special rule that follows TeXbook exercise 20.5:

A special extension is allowed to these rules: If the very last character of the ⟨parameter text⟩ is #, so that this # is immediately followed by {, TeX will behave as if the { had been inserted at the right end of both the parameter text and the replacement text. For example, if you say \def\a#1#{\hbox to #1}, the subsequent text \a3pt{x} will expand to \hbox to 3pt{x}, because the argument of \a is delimited by a left brace.

If I understand correctly (and I may not), when the opening brace { that starts the replacement text is immediately preceded by a hash character #, as in #{, this brace does double duty: it marks the beginning of the replacement text and is also treated as part of the parameter text (which it never is in the general case).

But then I don't understand why the definition \def\a#1#{\hbox to #1}, applied to the example \a3pt{x}, expands to \hbox to 3pt{x}.

Why is there the "x" in 3pt{x}?

I don't understand the explanation given:

because the argument of \a is delimited by a left brace

The example with x is just the material that will be boxed up in the resulting \hbox of 3pt width. It has nothing to do with the #1# notation, but will be grabbed by the \hbox primitive. — Skillmon, Dec 23 '19 at 18:38

egreg · Accepted Answer · 2019-12-22T20:57:29.040

With \def\a#1#{\hbox to #1}, a call such as

\a3pt{x}

makes TeX find 3pt as the (delimited) argument. This triggers the replacement and you get

\hbox to 3pt{x}

The left brace has not yet been removed from the input stream. The argument to \a is whatever lies between \a and the first {₁ token.
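The expansion described above can be checked at the terminal. The following is a minimal plain TeX sketch, not part of the original answer; it uses 30pt and an added \hss (both are assumptions made only to avoid an overfull or underfull box warning) instead of the 3pt of the TeXbook example.

```latex
% plain TeX; run with tex or pdftex
\def\a#1#{\hbox to #1}
\message{\meaning\a}     % prints: macro:#1{->\hbox to #1{
% \a grabs 30pt as #1 and stops in front of the left brace, so
% \setbox sees \hbox to 30pt{x\hss}; the group {x\hss} is read by
% the \hbox primitive, not by \a.
\setbox0=\a30pt{x\hss}
\message{\the\wd0}       % prints: 30.0pt
\bye
```

The x is therefore just the content of the box; it never becomes part of the macro's argument.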

Arguments to macros are of two types: undelimited and delimited. In the parameter text an undelimited argument is of the form #<n> (with <n> an integer from 1 to 9) directly followed by the next #<n+1> parameter or by the {₁ (left brace) that starts the replacement text. Otherwise an argument is delimited.

When determining the arguments to a macro at call time, TeX distinguishes between delimited and undelimited arguments. If it's looking for an undelimited argument, there are two cases: if a {₁ token follows, then the argument is the whole sequence of tokens up to the matching }₂ token; otherwise the next token is the argument.
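The two cases for undelimited arguments can be seen in a small plain TeX sketch; \pair is a hypothetical macro name chosen only for this illustration.

```latex
% plain TeX; \pair is a hypothetical name
\def\pair#1#2{[#1][#2]}
\message{\pair ab}      % prints: [a][b]   (single tokens)
\message{\pair {ab}c}   % prints: [ab][c]  (braced group, then one token)
\message{\pair a{bc}}   % prints: [a][bc]  (one token, then braced group)
\bye
```

In each case the outer braces of a braced argument are removed before the argument is substituted for #1 or #2.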

If TeX is looking for a delimited argument, it will absorb tokens until it finds, outside any brace group, the exact sequence of tokens specified as the delimiter. All tokens so absorbed (excluding the delimiting tokens) form the argument; the delimiting tokens are discarded. During this process, TeX does no interpretation whatsoever of the tokens it absorbs.
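As a rough illustration of delimited arguments, here is a plain TeX sketch with a hypothetical macro \uptosemi whose argument is delimited by a semicolon. The second call shows that a delimiter hidden inside a brace group is not recognized as the end of the argument.

```latex
% plain TeX; \uptosemi is a hypothetical name
\def\uptosemi#1;{[#1]}
\message{\uptosemi 123;rest}       % prints: [123]rest
\message{\uptosemi a{b;c}d;rest}   % prints: [a{b;c}d]rest
\bye
```

In both calls the delimiting ; is discarded and does not appear in the output.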

Exception, which is directly related to exercise 20.5: if the parameter text is #1# (but it could be more complicated), then the delimiter of the argument #1 (in general of the last argument) is {₁ which is not discarded as it would be with standard delimited arguments.

The conversation in the comments stopped last year, but to preserve it and allow it to continue if needed, it has been moved to chat. Thanks! — Stefan Kottwitz, Jan 17 '20 at 03:25

Skillmon · Answer 2 · 2019-12-22T19:01:58.407

You can think of it like this:

\def\foo#1;{}

will read everything after \foo until there is a ;, so \foo 123; will read 123 as #1 and remove ; from the input stream as well.

Similarly

\def\foo#1#{}

will create a macro that has a right-delimited argument, but instead of ; the right delimiter is {, so it reads everything up to the next opening brace. In contrast to the first case, the { is not removed from the input stream (well, actually it is removed but then reinserted at the end of \foo's replacement text). So \def\foo#1#{} behaves like \def\foo#1;{;} but uses { as the right delimiter instead of ;.
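The analogy can be tried side by side. This is a sketch in plain TeX; \uptosemi and \uptobrace are hypothetical names, and the brackets only make the grabbed argument visible.

```latex
% plain TeX; \uptosemi and \uptobrace are hypothetical names
\def\uptosemi#1;{[#1];}   % delimiter ; reinserted by hand
\def\uptobrace#1#{[#1]}   % delimiter { reinserted by TeX
\message{\uptosemi 123;abc}     % prints: [123];abc
\message{\uptobrace 123{abc}}   % prints: [123]{abc}
\message{\meaning\uptobrace}    % prints: macro:#1{->[#1]{
\bye
```

In both cases the delimiter is still in front of abc after the expansion; only with #{ does TeX append it to the replacement text automatically.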

You can see that the opening brace is actually removed and later reinserted by taking a look at the \meaning (or \showing the definition):

\def\foo#1#{}\show\foo
\def\foo#1#{foo}\show\foo

will print

> \foo=macro:
#1{->{.
> \foo=macro:
#1{->foo{.

to the terminal.

EDIT: This tries to answer the comment

As I understand it, the parameter text is a regular expression. In this regular expression the opening parenthesis { is a delimiter of the replacement text and is therefore removed when replacing. This is not the case when it is immediately preceded by #. What is the purpose of this rule?

If TeX also removed that opening brace, what would be left would be an unbalanced token list (with an unmatched closing brace). Unless the macros that follow were written very carefully, this would throw an error, and creating macros with the same logic would be much harder. So the only purpose of this rule is to let you create macros that are right-delimited by a token of category code 1 (a { under normal catcodes) without needing a lot of macros to sanitize the now unbalanced input stream.

Imagine the following situation:

\def\foo#1#{}
\foo 123{abc}

If the { weren't reinserted, then after the expansion of \foo what would be left in the input stream would be

abc}

and we'd have to somehow sanitize that unmatched closing brace. Say we want to create a macro that reads everything up to an opening brace and then the following group. We'd now have to create a macro that grabs every token until it meets a closing brace, but \def\bar#1}{} throws an error as well, so how should we create this? We'd need something like the following (note that I create the unbalanced text by expanding \iffalse{\fi):

\documentclass[]{article}

\makeatletter
\long\def\grabuntilclosingbrace@fi@firstoftwo\fi\@secondoftwo#1#2{\fi#1}
\def\grabuntilclosingbrace
  {%
    \begingroup
    \aftergroup\grabuntilclosingbrace@done
    \grabuntilclosingbrace@a
  }
\def\grabuntilclosingbrace@a
  {%
    \futurelet\grabuntilclosingbrace@tok\grabuntilclosingbrace@b
  }
\def\grabuntilclosingbrace@b
  {%
    \ifx\grabuntilclosingbrace@tok\egroup
      \grabuntilclosingbrace@fi@firstoftwo
    \fi
    \@secondoftwo
    {%
      \afterassignment\grabuntilclosingbrace@final
      \let\afterassignment@tok=%
    }
    {%
      \grabuntilclosingbrace@c
    }%
  }
\def\grabuntilclosingbrace@final
  {%
    \aftergroup\grabuntilclosingbrace@end
    \endgroup
  }
\long\def\grabuntilclosingbrace@done#1\grabuntilclosingbrace@end
  {Argument was: \texttt{#1}}
\long\def\grabuntilclosingbrace@c#1{\aftergroup#1\grabuntilclosingbrace@a}
\makeatother

\begin{document}
\expandafter\grabuntilclosingbrace\iffalse{\fi abc}
\end{document}

And this macro can't deal with spaces or nested groups. See how complicated life would be if TeX didn't give us the super handy \def\foo#1#{} rule?

If you want to know what this rule can be used for:

Say we have some macro that can't deal with nested groups in its argument. We want to give a helpful error message instead of just letting the macro fail, so we need a test for whether the argument contains a group. With the logic of \def\foo#1#{} we can reduce this to a test for whether an argument is empty (this reuses code/ideas from https://tex.stackexchange.com/a/517265/117050).

\documentclass[]{article}

\makeatletter
\long\def\ifgroupin#1%
  {%
    \ifgroupin@a#1{}\ifgroupin@tokB\ifgroupin@false
    \ifgroupin@tokA\ifgroupin@tokB\@firstoftwo
  }
\long\def\ifgroupin@a#1#{\ifgroupin@b}
\long\def\ifgroupin@b#1{\ifgroupin@c\ifgroupin@tokA}
\long\def\ifgroupin@c#1\ifgroupin@tokA\ifgroupin@tokB{}
\long\def\ifgroupin@false\ifgroupin@tokA\ifgroupin@tokB\@firstoftwo#1#2{#2}
\makeatother

\begin{document}
\ifgroupin{abc}{true}{false}

\ifgroupin{a{b}c}{true}{false}
\end{document}

An alternative version that uses the classic \if\relax\detokenize{#1}\relax emptiness test, because this might be easier to understand (but it takes about 160% of the time of the previous implementation):

\makeatletter
\long\def\ifgroupin#1%
  {%
    \if\relax
      \detokenize\expandafter\expandafter\expandafter{\ifgroupin@a#1{}}\relax
      \expandafter\@secondoftwo
    \else
      \expandafter\@firstoftwo
    \fi
  }
\long\def\ifgroupin@a#1#{\@gobble}
\makeatother
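For completeness, the alternative version can be exercised with the same test document as the first implementation. The wrapper below is a sketch that repeats the definitions so it compiles on its own; it assumes a LaTeX format with e-TeX's \detokenize available, and the percent comments state the results expected from tracing the expansion by hand.

```latex
\documentclass[]{article}

\makeatletter
\long\def\ifgroupin#1%
  {%
    \if\relax
      \detokenize\expandafter\expandafter\expandafter{\ifgroupin@a#1{}}\relax
      \expandafter\@secondoftwo
    \else
      \expandafter\@firstoftwo
    \fi
  }
% \ifgroupin@a drops everything before the first { and leaves \@gobble
% in front of that brace, so \@gobble then removes the first group.
\long\def\ifgroupin@a#1#{\@gobble}
\makeatother

\begin{document}
\ifgroupin{abc}{true}{false}% no group: typesets false

\ifgroupin{a{b}c}{true}{false}% contains a group: typesets true
\end{document}
```

With abc the only group found is the appended {}, so nothing remains and the emptiness test succeeds; with a{b}c the tokens c{} remain, so the test fails and the true branch is taken.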

As I understand it, the parameter text is a regular expression. In this regular expression the opening parenthesis { is a delimiter of the replacement text and is therefore removed when replacing. This is not the case when it is immediately preceded by #. What is the purpose of this rule? — AndréC, Dec 22 '19 at 17:35
@AndréC see my edits to get an idea why TeX reinserts the opening brace and what this can be used for. — Skillmon, Dec 22 '19 at 18:42
@AndréC oh, and actually the opening brace is removed, it just gets reinserted like it's done by my \def\foo#1;{;} example. — Skillmon, Dec 22 '19 at 18:45
@egreg what's incorrect? That the opening brace is discarded? That's correct. Take a look at the \meaning of a macro defined with #{. — Skillmon, Dec 23 '19 at 21:27
@Skillmon With \def\a#1#{#1} I get macro:#1{->#1{ I can't see where the { is discarded. But this is just nitpicking and more of a discussion on conventions about what \meaning displays; anyway, it has nothing to do with your semicolon example. — egreg, Dec 23 '19 at 21:33
@egreg it is discarded the same way as \def\foo;{;} discards the ; but later reinserts it. So technically it is discarded. I don't see what's incorrect about that statement. — Skillmon, Dec 23 '19 at 21:36
@egreg of course. It is exactly the same, just that the token discarded and later reinserted is of different character and category codes. — Skillmon, Dec 23 '19 at 21:41
@Skillmon No. Try \catcode`[=1\def\a#1#[#1} and you'll see macro:#1[->#1[ — egreg, Dec 23 '19 at 21:45
@egreg which is still the same but with different character and category codes. — Skillmon, Dec 23 '19 at 21:52
@Skillmon Not at all. The same character with the same category code. Otherwise the example in exercise 20.5 wouldn't work. — egreg, Dec 23 '19 at 21:54
@egreg that was a misunderstanding. \def\foo#1#{} is the same as \def\foo#1;{;} except for the character and category code: in the first case { is discarded and reinserted, in the second ; is discarded and reinserted. — Skillmon, Dec 23 '19 at 22:05
Skillmon, thank you for your efforts. Your explanations are certainly pertinent, but very difficult for me to understand. @egreg's answer and comments, going straight to the point, allowed me to understand that in reality I didn't really understand the basic workings of expansion (an input stream read character by character). I thank you for your efforts and wish you a Merry Christmas. — AndréC, Dec 25 '19 at 07:14
