Why don't c- and f-type expansions respect protected commands?

Question

all functions without a star are (or at least should be) \protected, this way they don't expand on x or e type expansion or when written to a file.

f and c type expansion are different and ignore \protected

Seeing as it was deemed valuable to have expansion types such as x and e that respect \protected commands, why aren't there c- and f-type expansion variants that respect \protected commands? (The c-variant would cause a compilation failure if it encountered a \protected command, and that's a perfectly reasonable, and even useful, semantics, as far as I can see.)

In fact, the idea of expanding a token list till it reduces to TeX primitives seems to me to run counter to the philosophy of expl3 that intends to make the underlying TeX stratum invisible and inaccessible: all TeX primitives have been provided with expl3 wrappers, and many of these wrappers are marked with :D indicating that even they should not be used, let alone the underlying primitive.

@DavidCarlisle I fail to see how this justifies the behavior of expl3. If this were a reasonable justification, you could simply forego the entire expl3 project, and say of the various TeX primitives: "This behavior is as specified in the TeX book". — Evan Aad, Jan 01 '23 at 10:37
There's a balance between performance consideration and abstraction. Look at the performance of simulating e-type expansion in older engines, it's some >200 times slower than when the primitive is available. Now imagine that applied to every c-type and f-type expansion... — user202729, Jan 01 '23 at 10:38
the behaviour seems perfectly reasonable, especially for c; and f is pretty weird anyway and largely replaced by e now that the TeX engines have been extended with an \expanded primitive. — David Carlisle, Jan 01 '23 at 10:43
Thinking about it, even if the answer is "if it's available it would be so slow that nobody would use it", it might be useful to provide it in a sort-of "debug environment" where it errors out on \protect, and then turn it off in "production". We know for sure the LaTeX team currently doesn't (think that they) need such a feature though. — user202729, Jan 01 '23 at 10:43
As I see it, this inconsistency in the behavior of the various expansion specifiers is a breeding ground for subtle usage errors on the part of the users of the expl3 programming language, and for hard to debug programs. — Evan Aad, Jan 01 '23 at 10:47
@DavidCarlisle As someone who knows how the primitives work you're probably (understandably?) biased on that count. — user202729, Jan 01 '23 at 10:49
It is not an inconsistency, it is the reason and core part of the design. Each type corresponds to a necessarily different, documented, expansion behaviour. — David Carlisle, Jan 01 '23 at 10:49
@EvanAad As I mentioned above, your proposals cannot be implemented without being 200 times slower. — user202729, Jan 01 '23 at 10:50
Quite apart from the technical questions, one has to ask what would \protected mean in a c-type expansion? Simply leaving macro names there doesn't really 'work'. For active chars, we are moving toward a setup that uses \ifincsname to produce 'the char itself' in those contexts, but that's done in the definitions of the actives (and therefore is not dependent on expl3-specific code) — Joseph Wright, Jan 01 '23 at 11:15
This feels like a discussion not really a question that we can provide an objective answer to (beyond 'it's the engine behaviour'): one for a chat room? — Joseph Wright, Jan 01 '23 at 11:17

David Carlisle · Accepted Answer · 2023-01-01T13:43:01.817

7

The underlying motivation for the L3 expansion arguments is to provide convenient, documented control over the expansion of TeX arguments. Moving too far from the underlying TeX expansion model (e.g. not using #1 ... #9 for arguments) can quickly make the system hundreds of times slower. In general, having slow definition forms is not so bad, but slowing down the use of commands can make the system unusable. (Remember that most of this mechanism was implemented around 1992, but only in recent years has it been anything like fast enough to be usable.)

The classic LaTeX \protect mechanism works by prefixing tokens with \protect, which can have different definitions in different contexts. In general the e-TeX \protected mechanism is more efficient and produces more understandable expansion paths without hidden internal \protect-prefixed macros, but it has the disadvantage that a \protected macro acts as specified by e-TeX in all cases: you cannot locally change the behaviour as with \protect. Even detecting that a \protected macro is used would require checking the \meaning of every token of every argument.
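To illustrate what such a check would have to look at, here is a minimal sketch. The macro names \aadplain and \aadprot are made up for this example; the point is only that the \protected status shows up as a prefix of the \meaning string, which is where a macro-level test would have to find it.

```latex
\documentclass{article}
\begin{document}
\def\aadplain{abc}
\protected\def\aadprot{abc}
% The \protected status is visible to macro code as a prefix of \meaning
\typeout{\meaning\aadplain}% macro:->abc
\typeout{\meaning\aadprot}% \protected macro:->abc
\end{document}
```

Doing a string comparison of this kind for every token of every argument is the cost referred to above.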

So f is a documented type that internally uses \romannumeral expansion to expand initial tokens up to the first non-expandable token. This is what it is; there is really no way to abstract this description into something that does not involve discussion of "non-expandable tokens". Making it hundreds of times slower, leaving it still weird but with a different behaviour for \protected commands, would have little advantage. Basically the current f behaviour corresponds to "what TeX does at the top level": if you use a \protected command in a paragraph, \protected is ignored and the command expands until the first non-expandable token is found.

f is now largely replaced by e, but note that e is only really usable because TeX engines were extended with \expanded, and e maps straight to the primitive behaviour. Implementing e in macros was possible (if you are Bruno) but impossibly slow.
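The difference between the two types can be seen in a small sketch. It assumes a LaTeX kernel recent enough to provide \ExplSyntaxOn without loading a package, and the functions \aad_a: and \aad_b: are hypothetical helpers introduced only for this example.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical helper functions, both expandable
\cs_new:Npn \aad_a: { A }
\cs_new:Npn \aad_b: { B }
% f-type: expansion stops at the first non-expandable token (the letter A)
\tl_set:Nf \l_tmpa_tl { \aad_a: \aad_b: }
\tl_show:N \l_tmpa_tl % holds A followed by the unexpanded \aad_b:
% e-type: the whole argument is expanded, as by \expanded
\exp_args:NNe \tl_set:Nn \l_tmpb_tl { \aad_a: \aad_b: }
\tl_show:N \l_tmpb_tl % holds AB
\ExplSyntaxOff
\begin{document}
\end{document}
```

With f-type expansion the second function is never reached, which is one reason e is usually the more predictable choice.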

c is similar: \protected could conceivably have been defined to give errors in \csname, but it was not. As such, c maps directly to \csname and so inherits that behaviour. You cannot locally change \protected to alter this; you would have to do a string test on the \meaning of every token, making every use much slower just to raise an error in cases that currently work without error.

edited Jan 01 '23 at 13:43

answered Jan 01 '23 at 11:21

David Carlisle

Perhaps worth noting that we (I) did implement a token-by-token approach for file names that explicitly looked for (protected) active chars, but the performance was not acceptable and we ended up using a combined \romannumeral/\csname approach. (Also could mention that with a move to \ifincsname/\protected and \expanded, we may yet move again, but using control at the point of definition of active chars.) – Joseph Wright Jan 01 '23 at 11:24
So, to the best of my understanding, the core reason why f-type expansion doesn't respect expl3-protected functions, is performance considerations. And the reason why f-type expansion performs well is that it is mapped to a TeX primitive, namely \romannumeral. Then it would be fair to conclude that if the TeX processing engines add a primitive that would be a variant of \romannumeral which respects \protected commands, then the door would open to implementing, in expl3, a parameter specifier that acts like f but respects expl3-protected functions. Is this correct? – Evan Aad Jan 02 '23 at 05:56
And a similar evolution happened when the engines added, not too long ago, the \expanded primitive, which resulted in a shift in preference to use the e and x specifiers over f, right? – Evan Aad Jan 02 '23 at 05:58
@EvanAad we added \expanded so we could add e which basically replaces f and x. I could not imagine any use cases for a modified f so I don't think it likely that such a modified romannumeral is added. where would this stop? o is \expandafter which also ignores \protected status. – David Carlisle Jan 02 '23 at 08:56
@DavidCarlisle LaTeX3 is a conceptual cataclysm, that is intended to last for decades. The engines need to rally behind the new vision. "Where would this stop"? Wherever it takes to make the engines able to support the new vision. So what if there's another primitive for f-expansion, and another one for o-expansion? Considering the magnitude and significance of the new paradigm shift this is minutiae. – Evan Aad Jan 02 '23 at 11:09
sorry I really can not understand why you would want that feature, in what way would it be preferable, e has massive advantages over x in that it is expandable, so worth the disruption of adding a new primitive. You are suggesting adding new primitives and new expl3 interfaces without offering any use cases or advantages over the current ones. @EvanAad – David Carlisle Jan 02 '23 at 11:40
@DavidCarlisle Can you think of any use case for f-expansion that doesn't involve legacy code? If the answer is "yes", then the exact same use case is a use case for f-expansion that respects protected functions, once you commit to the LaTeX3 vision of doing away with an input stream of TeX tokens in favor of an input stream of expl3 tokens. – Evan Aad Jan 02 '23 at 12:50
@EvanAad why do you say "respect" as if it is clear your proposed primitive is more natural than the current one. It is not a matter of respecting or not respecting. I have been using expl3 f expansion for 30 years, I don't think I have ever in that time wished for the variant you propose, so I ask again what use cases do you have in mind and how would you document to the end user when to use f and when to use new-f ???? – David Carlisle Jan 02 '23 at 12:56
Let us continue this discussion in chat. – Evan Aad Jan 02 '23 at 12:58

score 5 · Answer 2 · answered Jan 01 '23 at 14:53

Don't confuse \protect with \protected.

The latter is a primitive prefix for \def, \gdef, \edef and \xdef; it tells TeX that the defined control sequence (or active character) should behave in e-expansion or x-expansion as if it were prefixed by \noexpand.

e-expansion is used in the replacement text passed to \edef or \xdef and in the argument to \expanded or \write.

For instance,

\def\AAA{xyz}
\expanded{\noexpand\AAA}
\protected\def\BBB{xyz}
\expanded{\BBB}

would produce \AAA and \BBB respectively; macro expansion will take place later, following the usual rules.

Why is \protected important? The above silly examples don't tell us. Suppose you want to write some text out to an auxiliary file to be \input later. In the text you want to write out there is the control sequence \foo that you don't want to be expanded when \write is carried out, but only when you \input the file.

The traditional way would be

\def\foo{}% initialize
\newwrite\outstream
\immediate\openout\outstream=myauxfile
\begingroup\let\foo\relax
\immediate\write\outstream{some text with \foo}
\endgroup
\immediate\closeout\outstream
\def\foo{Useful definition}
\input myauxfile

which works because \relax is unexpandable and so \foo will be written out as \foo. You need no \let if you do instead

\protected\def\foo{}% initialize
\newwrite\outstream
\immediate\openout\outstream=myauxfile
\immediate\write\outstream{some text with \foo}
\immediate\closeout\outstream
\def\foo{Useful definition}
\input myauxfile

(maybe again using \protected when you redefine \foo, depending on the actual job you want to do).

By contrast, \protect is not a primitive. It's a control sequence whose meaning changes depending on the context. In the earliest version of LaTeX you could find

\def\LaTeX{\protect\pLaTeX}
\def\pLaTeX{<the actual code for the logo>}

So if \LaTeX was found during typesetting, \protect would do nothing. But when \LaTeX was found in the argument to \protected@edef or \protected@write (wrappers for \edef and \write), the meaning of \protect would change to \noexpand (this is the basic idea, the actual implementation is a bit more complex) and so when the underlying \edef or \write commands are executed, \pLaTeX would remain.

Nowadays there is no explicit double definition with the p prefix, but the idea remains the same. Robust commands are defined with \DeclareRobustCommand so that

\DeclareRobustCommand{\foo}{something}

is the same as

\expandafter\def\expandafter\foo\expandafter{%
  \expandafter\protect\csname foo \endcsname
}
\expandafter\def\csname foo \endcsname{something}

so instead of \pfoo as in the older scheme, a command whose name is foo followed by a space is defined. The written-out file then shows \foo followed by two spaces (one belonging to the name, one added by \write after a control word), but this is unimportant when the file is \input.
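The mechanism can be observed directly. The following is a minimal sketch; \aadlogo and \aadsaved are names invented for the example, while \DeclareRobustCommand and \protected@edef are the standard LaTeX kernel commands mentioned above.

```latex
\documentclass{article}
\DeclareRobustCommand{\aadlogo}{something}
\makeatletter
% \protected@edef temporarily redefines \protect so that the inner macro
% survives the \edef instead of being expanded to "something"
\protected@edef\aadsaved{Text with \aadlogo}
\makeatother
\begin{document}
% The meaning shows \protect followed by the inner macro, not "something"
\typeout{\meaning\aadsaved}
% When typesetting, \protect does nothing and "something" is printed
\aadsaved
\end{document}
```

Replacing \protected@edef by a plain \edef in this sketch is a quick way to see why the contextual redefinition of \protect is needed.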

How does f-expansion work? It exploits a slick feature of TeX (that can bite if you're not careful): when TeX is looking for an explicit <number>, it performs macro expansion until finding a space token or an unexpandable token that cannot be interpreted as providing a <number> under the current radix. The radix can be 10, 8, 16 or “alphabetic”. The rules are not difficult:

- an explicit number can begin with any number of - or + tokens
- a radix designation may follow and can be a backquote `, a straight quote ' or a double straight quote "
  - no radix designation means base ten
  - the straight quote and the double straight quote mean octal and hexadecimal respectively
  - the backquote means “alphabetical” constant
- digits, depending on the specified radix
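The rules above can be checked with a short sketch that writes the same value, 42, to the terminal in each notation. It assumes the default category codes of a plain article document (no package that makes the quote characters active).

```latex
\documentclass{article}
\begin{document}
\typeout{\number 42 }%    base ten -> 42
\typeout{\number '52 }%   octal -> 42
\typeout{\number "2A }%   hexadecimal -> 42
\typeout{\number `\* }%   alphabetic constant, character code of * -> 42
\typeout{\number -+-42 }% the two minus signs cancel -> 42
\end{document}
```

In each line the space after the number ends the scan; without it TeX would keep expanding the following tokens while looking for more of the number.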

An alphabetic constant is either a single character token or a control sequence of length one, such as \Q. The trick used by f-expansion is that macro expansion is done also after an alphabetic constant and the idea is that

\romannumeral-`\Q\foo

will expand \foo before typesetting the result of \romannumeral-`\Q (which is empty because the <number> is negative). However, when doing this, the status of \foo with respect to \protected is ignored. Thus, using expl3 lingo, the following will produce different results:

\cs_new_protected:Nn \aad_foo: {abc}
\cs_new_protected:Nx \aad_foo_a: { \aad_foo: }
\exp_args:NNf \cs_new_protected:Nn \aad_foo_b: { \aad_foo: }

Indeed, if you try \cs_show:N \aad_foo_a: and \cs_show:N \aad_foo_b: you will get respectively

> \aad_foo_a:=\protected\long macro:->\aad_foo: .
> \aad_foo_b:=\protected\long macro:->abc.
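The \romannumeral mechanism described above can also be tried without expl3. This is only a hand-rolled sketch of the idea, not the actual expl3 implementation, and the macro names \aadinner, \aadouter and \aadresult are made up for the example.

```latex
\documentclass{article}
\begin{document}
\protected\def\aadinner{abc}
\def\aadouter{\aadinner xyz}
% Hand-rolled f-expansion: \romannumeral expands what follows the constant `\Q
% while looking for more of the number, and itself yields nothing because
% the number is negative
\expandafter\def\expandafter\aadresult\expandafter{\romannumeral-`\Q\aadouter}
\typeout{\meaning\aadresult}% macro:->abcxyz
\end{document}
```

Although \aadinner is \protected, it is expanded here, and expansion stops only at the letter a, the first non-expandable token.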

A similar situation arises in c-expansion, when TeX is trying to build a control sequence via \csname. In this case ignoring \protected is the obvious choice, because only character tokens should remain. Maybe the authors of e-TeX could have decided that in the context of c-expansion a \protected macro would get prefixed by \string; they didn't.
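A minimal sketch of this behaviour follows, with invented names (\aadname and the control sequence aad@foo): the \protected macro is expanded inside \csname just like an ordinary one.

```latex
\documentclass{article}
\begin{document}
\protected\def\aadname{foo}
\expandafter\def\csname aad@foo\endcsname{found}
% \csname expands \aadname despite \protected, so the name built is aad@foo
\typeout{\csname aad@\aadname\endcsname}% found
\end{document}
```

No error is raised and nothing is left unexpanded, which is the case that "currently works" and that a stricter c-type would turn into a failure.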

Does the expl3 programming language have a similar distinction to that of contemporary LaTeX2e between \protect and \protected? Or does it have only a single notion of protection? — Evan Aad, Jan 01 '23 at 16:58
@EvanAad In expl3 there's neither \protect nor \protected. They can be used, of course, but aren't part of the language. — egreg, Jan 01 '23 at 17:26
