Part III "The l3names package—Namespace for primitives", Section 1 "Setting up the LaTeX3 programming language" of interface3.pdf says:
This module is entirely dedicated to primitives (and emulations of these), which should not be used directly within LaTeX3 code (outside of “kernel-level” code). As such, the primitives are not documented here: The TeXbook, TeX by Topic and the manuals for pdfTeX, XeTeX, LuaTeX, pTeX and upTeX should be consulted for details of the primitives. These are named
\tex_⟨name⟩:D, typically based on the primitive’s ⟨name⟩ in pdfTeX and omitting a leading pdf when the primitive is not related to pdf output.
And in the answer to some question I recently read these statements:
- One should never use \scantokens in expl3 code.
- One should never use \...:D control sequences in expl3 code.
I stumbled over an issue where I don't see how to manage without \scantokens/\tex_scantokens:D:
I use xparse and intend to pass a +v-type argument containing \verb*|...| directives to some function that re-tokenizes the input.
In the following (first) example \tex_scantokens:D is used and everything works out as I expect:
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
\group_begin:
\char_set_catcode_other:N ^^M
\use:n{
\group_end:
\NewDocumentCommand\PassVerbArgToRetokenizer{+v}{
\group_begin:
\tl_set:Nn \l_tmpa_tl {#1}
\exp_args:Nnc \use:n { \exp_args:Nno \tl_put_right:Nn { \l_tmpa_tl } } {@percentchar}
\exp_args:Nno \tl_put_left:Nn {\l_tmpa_tl} {\token_to_str:N\endgroup ^^M}
%\tl_show:N \l_tmpa_tl
\tex_newlinechar:D=\tex_endlinechar:D
\exp_args:NV \tex_scantokens:D {\l_tmpa_tl}
}
}
\ExplSyntaxOff
\begin{document}
\PassVerbArgToRetokenizer|\verb*+Some verbatim stuff!+|
\end{document}
Compiling this, you get a .pdf-file in which the \verb*-material is typeset as expected.
In the following (second) example \tl_rescan:nn is used instead of \tex_scantokens:D and you get an error message:
! LaTeX Error: \verb ended by end of line.
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
\group_begin:
\char_set_catcode_other:N ^^M
\use:n{
\group_end:
\NewDocumentCommand\PassVerbArgToRetokenizer{+v}{
\group_begin:
\tl_set:Nn \l_tmpa_tl {#1}
\exp_args:Nnc \use:n { \exp_args:Nno \tl_put_right:Nn { \l_tmpa_tl } } {@percentchar}
\exp_args:Nno \tl_put_left:Nn {\l_tmpa_tl} {\token_to_str:N\endgroup ^^M}
%\tl_show:N \l_tmpa_tl
%\tex_newlinechar:D=\tex_endlinechar:D
%\exp_args:NnV \tl_rescan:nn {\tex_newlinechar:D=\tex_endlinechar:D} {\l_tmpa_tl}
\exp_args:NnV \tl_rescan:nn {} {\l_tmpa_tl}
}
}
\ExplSyntaxOff
\begin{document}
\PassVerbArgToRetokenizer|\verb*+Some verbatim stuff!+|
\end{document}
My questions are:
- The result when using \scantokens/\tex_scantokens:D differs from the result when using \tl_rescan:nn. Which essential/crucial difference between \scantokens/\tex_scantokens:D and \tl_rescan:nn causes the difference in the results?
(The answer to the first question may enable me to answer the following questions myself.)
I suppose it has to do with \tl_rescan:nn inserting the entire sequence of tokens that results from "re-scanning" into the token stream at once: In the example TeX expects an active + as the second delimiter of \verb*'s argument. (LaTeX arranges this via some lowercase trickery with an active ~.) If everything is already tokenized and inserted at once, the second + is already of catcode 12 (other) at the time \verb* is carried out, so TeX won't find a matching delimiter denoting the end of \verb*'s argument.
If I got it right, \tl_rescan:nn on the one hand makes TeX re-tokenize the entire ⟨tokens⟩ sequence immediately and append the entire resulting token sequence to the token stream in TeX's gullet in one go.
\scantokens/\tex_scantokens:D on the other hand makes TeX re-tokenize things on demand only, bit by bit, using \scantokens's argument as the source of input (as if the tokens of the argument had been written to a file unexpanded) rather than some .tex input file. Each time it produces only as many tokens as the other digestive organs demand. The other digestive organs in turn process these tokens bit by bit and thereby, e.g., carry out catcode changes denoted by these tokens. These changes in turn may affect how subsequent material gets (re-)tokenized, be it via \scantokens, be it via reading/processing a .tex input file, be it via reading from the console. (A minimal sketch illustrating this supposed difference follows after the questions.)
- Where is my misunderstanding regarding how \tl_rescan:nn works / what \tl_rescan:nn does?
- What did I do wrong in the second example?
- How can I achieve, by means of things provided by expl3 and without using \tex_scantokens:D, what I get in the first example where \tex_scantokens:D is used?
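To make the difference I have in mind tangible, here is a minimal sketch I put together (the active-! definition and the choice of the character ! are only for demonstration purposes, not taken from any documentation). If my supposition is right, \tex_scantokens:D delivers the trailing ! as an active token, while \tl_rescan:nn delivers it as a literal ! of catcode 12:

\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Give the active "!" a meaning once, globally, for the demonstration:
\group_begin:
\char_set_catcode_active:N \!
\cs_gset:Npn ! { ACTIVE }
\group_end:
\ExplSyntaxOff
\begin{document}
\ExplSyntaxOn
% (a) Primitive: the catcode assignment is carried out before the trailing
%     "!" is tokenized, so that "!" arrives as an active token.
\group_begin:
\tex_scantokens:D { \char_set_catcode_active:N \! ! }
\group_end:
\par
% (b) \tl_rescan:nn: the trailing "!" is already tokenized (catcode 12,
%     "other") before the catcode assignment is carried out.
\group_begin:
\tl_rescan:nn { } { \char_set_catcode_active:N \! ! }
\group_end:
\ExplSyntaxOff
\end{document}

If my supposition is right, the first line of output reads "ACTIVE" while the second shows a literal "!".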

\tl_rescan:nn is not completely equivalent to \tex_scantokens:D (quite obviously), but we still don't have an interface for it, so there are places where you have to use the primitive (see the note I recently added to the :D specifier here). What you are trying to do is more or less what Pablo did in scontents. The main difference is that \tl_rescan:nn scans the entire token list with the current catcode setup (plus <setup>), whereas \tex_scantokens:D may change catcodes as it goes. – Phelype Oleinik Nov 17 '20 at 20:04

[...] \tex_scantokens:D primitive is that the former uses a fixed catcode [...] \scantokens [...] time-intervals of delivering characters to the tokenizing-apparatus are mixed with time-intervals in which tokens are produced from these input-... – Ulrich Diez Nov 18 '20 at 16:10

[...] \scantokens -- directives inside the [...] (\verb or the verbatim-environment bring along such directives) do not affect how subsequent things within that argument get re-tokenized because these subsequent things are already re-tokenized when it comes to carrying out these directives. – Ulrich Diez Nov 18 '20 at 16:10
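To illustrate the "(plus <setup>)" part of the first comment: catcode changes that are meant to influence the rescanning can be passed to \tl_rescan:nn in its first (⟨setup⟩) argument, where they are in force for the entire rescanning pass. A small sketch for the document body, again only for illustration and reusing the active-! idea from the sketch above:

\ExplSyntaxOn
\group_begin:
\char_set_catcode_active:N \!
\cs_gset:Npn ! { ACTIVE }
\group_end:
% The catcode change in the (setup) argument applies to the whole
% rescanning pass, so the "!" below is tokenized as an active character
% and "ACTIVE" is typeset:
\tl_rescan:nn { \char_set_catcode_active:N \! } { ! }
\ExplSyntaxOff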