Say I want to replace every occurrence of \A{<balanced text>} in a token list with \B{\A{<the same content>}}.
For example:
\process{\A{123}} % -> \processresult = \B{\A{123}}
\process{before \A{1{2}3} after} % -> \processresult = before \B{\A{1{2}3}} after
\process{before {\A} after} % -> don't need to support this case, but probably no-op
Is there an easy way to do that?
It would be nice if the code also:
- supported arbitrarily deep nesting in the balanced text (instead of e.g. 1 level)
- preserved the char code of the {} tokens
- preserved white space
I don't think l3regex can do this: neither (.*) nor (.*?) takes the brace balance level into account.
I have actually already done this the "hard" way, using \tl_analysis_map_inline:nn.
I think it is also possible with a recursive function, although I'm not sure whether that can preserve the char code of the {}.
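A minimal sketch of the recursive-function idea, under some assumptions: a macro delimited by \A grabs everything up to the next top-level \A, and TeX's own argument grabbing then takes the following brace group, which is balanced to any depth. The helper name \__sketch_process_loop:w is hypothetical, and \processresult is the name used in the examples above. The sketch assumes every top-level \A is followed by a brace group; an \A nested inside braces is never seen by the delimiter and is left untouched. It re-creates the braces around the argument as ordinary { }, so it does not preserve their original char code.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
\tl_new:N \processresult
% Entry point: append a sentinel \A followed by the recursion quarks.
\cs_new_protected:Npn \process #1
  {
    \tl_clear:N \processresult
    \__sketch_process_loop:w \prg_do_nothing: #1
      \A \q_recursion_tail \q_recursion_stop
  }
% #1 = everything before the next top-level \A (spaces are kept);
% #2 = the brace group that follows it, with the outer braces stripped.
\cs_new_protected:Npn \__sketch_process_loop:w #1 \A #2
  {
    % The leading \prg_do_nothing: keeps TeX from stripping braces when
    % #1 is a single group; the o-type expansion removes it again.
    \tl_put_right:No \processresult {#1}
    % Stop once the sentinel \A has been reached.
    \quark_if_recursion_tail_stop:n {#2}
    % Assumption: the new braces are plain { }, not the original tokens.
    \tl_put_right:Nn \processresult { \B { \A {#2} } }
    \__sketch_process_loop:w \prg_do_nothing:
  }
\ExplSyntaxOff
\process{before \A{1{2}3} after}
% expected to typeset: before \B {\A {1{2}3}} after
\texttt{\detokenize\expandafter{\processresult}}
\end{document}
```

\process is called after \ExplSyntaxOff here so that the spaces in its argument are real space tokens. Inside \ExplSyntaxOn they would be dropped before the macro ever sees them.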
Example of how a (naive) application of regex fails:
%! TEX program = lualatex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Attempt 1: greedy (.*) between a begin-group and an end-group token.
\def\process #1 {
  \def \__a {#1}
  \texttt{ input:~ \exp_args:NV \detokenize \__a } \par
  \regex_replace_all:nnN {\c{A} \cB\{ (.*) \cE\}} {\c{B} \cBx \1 \cEy } \__a
  \texttt{ output:\exp_args:NV \detokenize \__a } \par
}
\process{before \A{1{2}3} \A{1{2}3} after}
% Attempt 2: lazy .*? instead of the greedy match.
\def\process #1 {
  \def \__a {#1}
  \texttt{ input:~ \exp_args:NV \detokenize \__a } \par
  \regex_replace_all:nnN {(\c{A} \cB\{ .*? \cE\})} {\c{B} \cB[ \1 \cE] } \__a
  \texttt{ output:\exp_args:NV \detokenize \__a } \par
}
\process{before \A{1{2}3} \A{1{2}3} after}
\ExplSyntaxOff
\end{document}
Result:
input: before\A {1{2}3}\A {1{2}3}after
output:before\B x1{2}3}\A {1{2}3yafter
input: before\A {1{2}3}\A {1{2}3}after
output:before\B [\A {1{2}]3}\B [\A {1{2}]3}after
Here the new brace groups are delimited by x y (first attempt) and [ ] (second attempt) instead of { }, to show the difference clearly. It can be seen that the content inside the new group is not balanced.
\A{..}, to do \B{\originalA{..}}? – David Carlisle Mar 26 '22 at 08:13
\A(X)→\B{A(X)}? – user202729 Mar 26 '22 at 10:51
\B {\A {1{2}3}} after the substitution. Do you mean you want the braces to print? (or the token list not to expand?) – Cicada Mar 26 '22 at 12:56
\tl_show:N shows that the token list contains: \B {\A {123}}, \B {\A {1{2}3}} and {\B {\A {}}} (when the 3rd case is adjusted to {\A{}}; plus, it is a separate structure from the first two, so needs its own regex). The TL prints accordingly (I made working dummy \A and \B definitions, each to print #1), as per catcodes, so catcodes are OK. – Cicada Mar 27 '22 at 08:45
\B{...} around it. And TeX checks for balanced groups only when expanding macros or performing definitions and assignments. Now, what if \A is a two argument macro? You'd want \B{\A{1}{2}}; what if \A has delimited arguments? And so on. – egreg Mar 27 '22 at 10:21
{ \some \content \EXECUTETHIS {\something} \EXPANDANDINSERTTHIS {\int_eval:n {1+1}} \some \other \content } and the result is e.g. { \some \content 2 \some \other \content}) -- as I said, I already implement the thing by counting braces, just want to see if there's some other simpler approaches I missed. // library code still very wrong – user202729 Mar 27 '22 at 10:27