I'm trying to define a macro which grabs everything until the next # (parameter token).
My twisted imagination wants something like this:

\def\test#1###2{(#1)[#2]}
\test hello#{world}

to grab hello in #1 (delimited by #) and world in #2 (brace delimited) and then print

(hello)[world]

However, I'm failing (miserably) because no matter what combination of ## I try, TeX yells back:

! Parameters must be numbered consecutively.
<to be read again> 
                   ##
l.1 \def\test#1##
                 #2{(#1)[#2]}
?

so I guess that simply writing down the # in there is not the way to go.

Is it possible somehow to have a #-delimited macro?

  • why the #, and not simply \test hello{world}? :-) – David Carlisle May 17 '19 at 11:33
  • @DavidCarlisle I was trying to scan the parameter text of a macro looking for its arguments one by one. – Phelype Oleinik May 17 '19 at 11:38
  • it's hard (not really possible) to even find out how many arguments a macro has, https://tex.stackexchange.com/questions/305806/is-there-a-reliable-way-to-get-the-number-of-arguments-of-a-command – David Carlisle May 17 '19 at 11:45
  • "Scan parameter text of a macro" -- if this means evaluating the result of \meaning\macro: With \meaning you don't have information about category codes. The meaning of the following macros looks the same but the 1st one does process two args and the 2nd one has just a delimiter and does not process args: 1) \def\macro#1#2{#1 text #2} 2) \catcode`\#=12\relax\def\macro#1#2{#1 text #2} . Also, you are not bound to using hashes for denoting args. You can use any character after assigning catcode 6 to it. You can also use control-sequences/active chars let equal to catcode-6-chars. – Ulrich Diez May 17 '19 at 20:26
  • @UlrichDiez Hm, now I see I did not phrase my question properly. What I wanted to achieve (and already changed my mind) was to scan a definition before the actual definition took place (something like \scandef\def\test#1{something with #1}), not with \meaning, so the hashes do have catcode 6 and, in this case, it doesn't matter which character they are because TeX will not allow this. Thanks for the input, though :-) – Phelype Oleinik May 17 '19 at 20:35
  • @PhelypeOleinik But this is feasible to some degree: Have \scandef catch both the definition-command, the macro-name and the parameter-text into an argument via #{-notation and then iteratively examine that argument token-wise, taking into account the fact that parameter-text cannot contain {... You can, e.g., implement a loop which counts the hashes in the sequence formed by the definition-command, the control sequence-name and the parameter-text. – Ulrich Diez May 17 '19 at 20:47
  • @UlrichDiez Yes, I changed my approach to token-by-token like you said before even asking the question. I asked more out of curiosity than anything else :-). I tried the #-delimited approach first precisely because the parameter text can't have braced groups, so it wouldn't skip any #1{#2} and I think the code is simpler that way. – Phelype Oleinik May 17 '19 at 20:52
  • @UlrichDiez It's actually for a draft of a package I did (more for personal use than anything else). It allows you to \named\def\test#[name]{Hello #[name]!}. It's pretty stupid, but it makes life easier when you have macros with lots of arguments and change their order once every two lines of code :P Here's the source, if you want to take a look. – Phelype Oleinik May 17 '19 at 21:08
  • @PhelypeOleinik I often encounter the problem which you approach with your package. In daily usage I tend to write "underlying macros" and "user-level-macros": A user-level-macro via \setkeys only processes a mandatory keyval-argument for (re)defining some control-word-tokens which have meaningful names and then calls its corresponding "underlying macro", which in turn does not process any argument at all but instead uses those (re)defined control-word-tokens with the meaningful names... Of course this approach does not work out with macros that are intended to work in pure-expansion-contexts. – Ulrich Diez May 17 '19 at 21:28
  • @UlrichDiez Precisely :-) I'm writing the interface I describe here, which I'm making fully expandable. At the user-level the macro has only two arguments, so it's fine. However internally there are some macros which take up to 7 arguments (I'll probably optimize this later), and my brain cannot keep track of all of them that easily :) – Phelype Oleinik May 17 '19 at 21:35
  • @PhelypeOleinik If I got it right, you need a routine which within (almost) arbitrary token sequences expandably replaces (almost) arbitrary token sequences by digits denoting argument numbers while within the token sequence (definition-texts) where things shall be replaced by argument-number-digits, several levels of brace nesting/of nesting of matching arbitrary catcode-1/2-character-token-pairs could occur. So far I don't see a perfect solution for this where, e.g., arbitrary catcode-1/2-character-token pairs are left in place and are not replaced by { and } respectively. Besides this... – Ulrich Diez May 19 '19 at 13:22
  • ... I don't see an expandable method for, e.g., distinguishing active characters let equal to their non-active counterparts from these counterparts. E.g., I don't know a method for expandably distinguishing catcode-6-A from catcode-13-A after \catcode`\A=6\relax \let\A=A \catcode`\A=13\relax \let A=\A unless using predefined macros where catcode-13-A is used as argument delimiter. You'd need such a macro for each character/code-point which in the input can possibly occur. On utf8-machines you'd need a lot of such macros. In this case distinguishing \A from A when \escapechar is -1... – Ulrich Diez May 19 '19 at 13:37
  • @UlrichDiez Those are indeed good points to consider. However the interface will be split in two main types; one which scans arbitrary text and expands \catcode`[=1\catcode`]=2 \printf{Integer \%02d and \textbf[float \%6.4f]}{3,pi} to Integer 03 and \textbf{float 3.1416}, and a more dedicated interface, \printf_f_type:nnnn{}{6}{4}{pi} which should be safer in a programming level. The former interface is supposed to be more user-level, where you don't expect weird catcode settings, but if a user does that in an actual document they should know what they are signing up for ;-) – Phelype Oleinik May 19 '19 at 15:50
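
The token-by-token approach mentioned in the comments above can be sketched in plain TeX. This is only an illustration under assumptions: apart from \scandef, every name here (\hashes, \scanloop, \scantest, \scannext, \scanend, \scandone) is made up for the sketch, and it merely counts the parameter tokens in front of the replacement text rather than performing the definition.

```tex
\newcount\hashes
\def\scanend{\scanend}% hypothetical sentinel marking the end of the scanned tokens

% #1 is delimited by the left brace: definition command, name, parameter text.
\def\scandef#1#{\hashes=0 \scanloop#1\scanend}

% Fetch the next token (spaces included) into \next without expanding it.
\def\scanloop{\afterassignment\scantest\let\next= }

\def\scantest{%
  \ifx\next\scanend
    \let\scannext\scandone
  \else
    \ifx\next##\advance\hashes by 1 \fi
    \let\scannext\scanloop
  \fi
  \scannext}

% #1 is the braced replacement text, which this sketch simply discards.
\def\scandone#1{\message{hashes: \the\hashes}}

\scandef\def\test#1#2{(#1)[#2]}

\bye
```

With the input shown, the count should come out as 2. A trailing # from the #{ form would be counted as well, and a control sequence whose meaning equals that of \scanend would stop the loop early.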

2 Answers

The TeXbook, page 203, says in the first doubly dangerous bend paragraph:

Now that we have seen a number of examples, let’s look at the precise rules that govern TeX macros. Definitions have the general form

\def⟨control sequence⟩⟨parameter text⟩{⟨replacement text⟩}

where the ⟨parameter text⟩ contains no braces, and where all occurrences of { and } in the ⟨replacement text⟩ are properly nested. Furthermore the # symbol has a special significance: In the ⟨parameter text⟩, the first appearance of # must be followed by 1, the next by 2, and so on; up to nine #’s are allowed.

There is no way for the parameter text to contain a (category code 6) # that serves as a delimiter, because the rule stated above requires each # to be followed by the next digit.

As usual in the TeXbook, this is not the complete truth; in the second doubly dangerous bend on page 204 one reads:

A special extension is allowed to these rules: If the very last character of the ⟨parameter text⟩ is #, so that this # is immediately followed by {, TeX will behave as if the { had been inserted at the right end of both the parameter text and the replacement text. For example, if you say ‘\def\a#1#{\hbox to #1}’, the subsequent text ‘\a3pt{x}’ will expand to ‘\hbox to 3pt{x}’, because the argument of \a is delimited by a left brace.
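
As a minimal plain TeX sketch of this extension (the macro name \b and the 3cm width are chosen here only for illustration), the following shows the brace-delimited form working and, commented out, a stray # in the parameter text of the kind that triggers the error from the question:

```tex
% Sketch only: the trailing # makes the left brace the delimiter of #1.
\def\a#1#{\hbox to #1}
\a3cm{x\hfil}% expands to \hbox to 3cm{x\hfil}

% Any other # must be followed by the next digit, so this is rejected:
% \def\b#1###2{(#1)[#2]}% ! Parameters must be numbered consecutively.
\bye
```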

However, this special extension does not help with your aim.

egreg
  • Oh, must be followed :/ – Phelype Oleinik May 17 '19 at 12:01
  • @PhelypeOleinik Yes, not the weaker “should” or “ought to”. – egreg May 17 '19 at 12:07
  • although (not unusually for the TeXbook) that rule isn't actually completely true (you can follow the last # by { rather than a digit). So that dangerous bend on its own wouldn't be enough to confirm it wasn't possible (but it isn't possible:-) – David Carlisle May 17 '19 at 12:11
  • @DavidCarlisle Yes, I added the relevant quotation. – egreg May 17 '19 at 12:15
  • There is module 476 in tex.pdf, which I sort of understand (because I already know what it does (sort of)). It first looks for a left brace, if it's not then steps a counter t and if the grabbed token isn't equal to t the error message in the question is printed. – Phelype Oleinik May 17 '19 at 12:20
  • @PhelypeOleinik That's it: the rule disallows # not followed by the expected digit except at the end. – egreg May 17 '19 at 12:25
  • The last # can be followed by any token whose meaning at the time of defining equals the meaning of the catcode-1-{. Even if after defining the meaning of that token gets changed, that token will be used as a delimiter which will be left in place: \let\Weird={ \def\test#1#\Weird #1 weird is } \def\Weird{\TeX!} \test Really\Weird \bye – Ulrich Diez May 17 '19 at 20:42
  • @UlrichDiez Yes, I know the feature, but it's a detail that's not really connected to the question at hand. I was thinking of adding this exception to the rule that the replacement text has to start with an explicit {, but decided not to. Your comment is valuable in this respect. – egreg May 17 '19 at 20:46

You can't really do what you ask, but you can ignore the # while parsing the arguments, then get rid of it:

\def\test#1#{\def\tmp##1{#11}\zz}
\def\zz#1{(\tmp{})[#1]}

\test hello#{world}

\bye
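
To see why this works, it may help to inspect what \tmp ends up as. The sketch below is the same code plus a \message line (an addition for illustration only). Since the first argument of \test is delimited by the left brace, it should pick up hello# including the parameter token, so #11 becomes hello#1 and \tmp is effectively defined with \def\tmp#1{hello#1}.

```tex
\def\test#1#{\def\tmp##1{#11}\zz}
\def\zz#1{(\tmp{})[#1]}

\test hello#{world}

% Illustration only: write the definition of \tmp to the terminal and log.
\message{\meaning\tmp}% should show macro:#1->hello#1

\bye
```

Calling \tmp{} then substitutes an empty argument for the trailing #1, which leaves hello. One consequence, assuming the same definitions: if the # is omitted in the call, as in \test hello{world}, the 1 is no longer absorbed and the result should be (hello1)[world].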
David Carlisle
  • “You can't really do” is acceptable (although frustrating). Is there anywhere that explains this (not that I'm doubting you ;-)? – Phelype Oleinik May 17 '19 at 11:58
  • @PhelypeOleinik egreg's answer shows some sort of documentation, although, as I commented there, the TeXbook often "clarifies" rules later, so it is hard to use it as a definitive source; there is always tex-the-program... – David Carlisle May 17 '19 at 12:13