How can arbitrary brace-balanced sequences of non-outer tokens be mapped expandably and unambiguously to numbers, or to strings consisting exclusively of explicit character tokens of category 12, if possible using only macros that can be implemented in Knuthian TeX?
At first I thought of stringifying all tokens in a loop and then calculating some sort of unambiguous checksum, but stringifying discards the category codes of the tokens, so such an approach cannot distinguish all possible token sequences.
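The loss of category information can be seen in a small plain TeX sketch. The macro names `\letterA`, `\otherA`, `\x` and `\y` are made up for this illustration only:

```tex
% plain TeX; run with: tex file.tex
\def\letterA{a}% replacement text: "a" of category 11 (letter)
\begingroup
\catcode`\a=12
\gdef\otherA{a}% replacement text: "a" of category 12 (other)
\endgroup
% \string turns both into the same explicit character token a of category 12:
\edef\x{\expandafter\string\letterA}
\edef\y{\expandafter\string\otherA}
\message{\ifx\x\y same\else different\fi}% writes "same"
% The unstringified token lists are nevertheless different:
\message{\ifx\letterA\otherA same\else different\fi}% writes "different"
\bye
```

Any checksum computed only from the stringified form would therefore assign the same value to both token lists.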
I would be grateful for an outline of how to approach the matter. I can then think about the details of a concrete implementation myself.
However, I have my doubts:
If this could be done in a way that is one hundred percent reliable, then it could be used as an expandable method to distinguish, for example,
- an active character token let equal to a non-active counterpart from that counterpart.
- frozen \relax from the \relax primitive.
- the nameless control sequence (producible via \csname\endcsname or via an escape character (backslash) at the end of a line of .tex input while \endlinechar has a negative value) from the control sequence whose name is csname⟨escapechar⟩endcsname (producible via \csname csname\string\endcsname\endcsname) while those control sequences have the same non-outer meaning.
- an explicit (non-outer) character token from a one-letter control sequence let equal to that explicit character token, when the character code corresponds to the character which forms the name of the control sequence, while \escapechar has a negative value.
- frozen font control sequences obtained by applying \the to a font command from the original font command.
- ...
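To make the first item concrete, here is a minimal plain TeX sketch of my own (not a solution to the problem). It relies on plain TeX's default of `~` being active and uses the `\lowercase` trick to obtain an active `*`. It only shows that `\ifx`, `\string` and `\meaning` give the same result for the two tokens:

```tex
% plain TeX; run with: tex file.tex
\begingroup
\lccode`\~=`\* % \lowercase will turn the active ~ into an active *
\lowercase{\endgroup
  \let~=*% active * now means: the explicit character * of category 12
  \message{\ifx~*equal\else different\fi}% writes "equal"
  \message{\string~ vs \string*}% writes "* vs *"
  \message{\meaning~ vs \meaning*}% writes "the character * vs the character *"
}
\bye
```

The sketch makes no claim about other means of telling the two tokens apart. It only illustrates why a mapping built on these three probes alone cannot be unambiguous.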
Can I conclude that an expandable approach restricted to the means provided by Knuthian TeX is not possible in a way that is both one hundred percent reliable and practical?
How should one approach the matter if expandability and sticking to Knuthian TeX are not requirements?
... { and }, so there's some loss there (insignificant, for most reasonable applications, but still) – Phelype Oleinik Apr 17 '22 at 14:00
... ] is catcode-2, then in \dostuff{hello{braced]world} you can easily see the charcode of {, but you can't(?) see that it's closed by a ], right? – Phelype Oleinik Apr 17 '22 at 17:57
... \UD@ExtractFirstOpeningBracesMatchingClosingBraceStringified in my answer to "Get \string-ification of first opening brace in argument?/Get \string-ification of first opening brace's matching closing brace in argument?" is about. ;-) It is feasible. In some of the examples Y is of category 2, and you get the result of "hitting" that Y with \string. – Ulrich Diez Apr 17 '22 at 22:48
... #<some explicit category-1-character-token>-notation still work out, but even I refuse squeezing an obscure pseudo-exception out of my brain where "laying hands" on character codes of explicit category-2-tokens might be of practical use. ;-) – Ulrich Diez Apr 18 '22 at 16:07
... \tl_if_head_is_space:nTF in the kernel: detecting a space is often needed, while the char code of a catcode-2 token is really not that important – Phelype Oleinik Apr 18 '22 at 16:41