LaTeX3: tl_map with spaces

Question

Is there a version of tl_map that does not ignore spaces that are part of a token list?

\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
\tl_new:N \l_token_tl
\tl_set:Nn \l_token_tl { a~b~c }
%\tl_show:N \l_token_tl
\tl_map_inline:Nn \l_token_tl { (#1) }
\ExplSyntaxOff
\end{document}

This example performs only three iterations, skipping all the spaces.

There is a currently-internal approach for mapping allowing for spaces, but it's very rarely been required (hence being internal). Perhaps you can explain what you actually need to achieve here? — Joseph Wright, Sep 13 '14 at 19:26
I would split it up into a sequence delimited by ~; that's usually what I mean when I need this sort of thing. — Sean Allred, Sep 13 '14 at 19:30
@Joseph I have a syntax parser (mhchem package) that I would like to convert to LaTeX3. The syntax contains spaces as important markup element. I am looking for a way to get this done, but always hit a wall with LaTeX3's space handling. I.e. I do not necessarily need this tl_map version if, for instance, the peek approach would work. — mh543, Sep 13 '14 at 20:01
@mh543 Wouldn't \tl_replace_all:Nnn be enough? (I don't know anything about mhchem so probably I'm wrong.) — Manuel, Sep 13 '14 at 20:19
@mh543 When I had similar problems, I used either the method suggested by Manuel, or split the input in a sequence at spaces, so you still know where the spaces originally were. — egreg, Sep 13 '14 at 21:12
@JosephWright Even if it's possible to solve this by replacing spaces (or spaces followed by something), or by splitting into a sequence… it would be great to know of an inline mapping that does not ignore spaces. — Manuel, Sep 14 '14 at 09:50
This seems what's called an XY question: mapping tokens just to themselves is surely not what you have in mind; could you be more precise about your aim? — egreg, Sep 14 '14 at 10:29
@egreg. This is an XY question? Okay, the real X problem is to rewrite mhchem using LaTeX3. See the manual for its features. That X problem is too broad? Okay, Y1: I want to write a parser for the mhchem syntax. Still too broad? Y2: I can use left-to-right token parsing and a state machine. Too broad? Y3: For that, I need a token-by-token iteration. I tried two approaches. Y4a: Using peek functions and a self-written recursion macro. (Works now, I will use that.) Y4b: Using the built-in LaTeX3 command for token-by-token iteration, tl_map. But that ignores spaces. Pick any problem to answer. — mh543, Sep 14 '14 at 13:59
@mh543 The main problem is what you want to do with the spaces. — egreg, Sep 14 '14 at 14:16
@egreg Ah, what to do with the spaces? That depends on the current state of the parsing state machine. Most of the time, the space will simply denote the end of an equation entity (chemical equation). But in some instances, a space should be treated as a textual space, e.g. when the parser reads a space in the label of a reaction arrow. It's currently only those two cases, but I definitely want to retain the flexibility to treat every space individually depending on the current parser state. — mh543, Sep 14 '14 at 15:48
See also: package writing - Avoid @ hackery: How can I preserve spaces in a format specification when processing the specification with expl3 and PGF/TikZ? - TeX - LaTeX Stack Exchange — user202729, Oct 17 '21 at 04:22

egreg · Answer 1 · 2014-11-13T11:30:09.170

You have two strategies available.

Strategy 1: replace each space with a function that you can redefine depending on the current state.

\tl_new:N \l_mhchem_input_tl
\tl_set:Nn \l_mhchem_input_tl { a ~ b ~ c } % this would come from a macro argument
\tl_replace_all:Nnn \l_mhchem_input_tl { ~ } { \__mhchem_space_do: } 
\tl_map_inline:Nn \l_mhchem_input_tl { whatever with #1 }
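
To make Strategy 1 concrete, here is a minimal sketch of a complete document. It assumes that \__mhchem_space_do: is a marker function you define yourself (the definition below, which prints a visible placeholder, is purely illustrative), and it uses \tl_if_eq:nnTF inside the loop to tell the marker apart from ordinary items.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
\tl_new:N \l_mhchem_input_tl

% Illustrative definition only: a real parser would redefine this
% function according to its current state.
\cs_new_protected:Npn \__mhchem_space_do: { [space] }

\tl_set:Nn \l_mhchem_input_tl { a ~ b ~ c }
% Every space becomes a single control sequence, which the mapping
% then sees as an ordinary item instead of skipping it.
\tl_replace_all:Nnn \l_mhchem_input_tl { ~ } { \__mhchem_space_do: }

\tl_map_inline:Nn \l_mhchem_input_tl
  {
    \tl_if_eq:nnTF {#1} { \__mhchem_space_do: }
      { \__mhchem_space_do: } % the item is a former space
      { (#1) }                % the item is a regular token
  }
\ExplSyntaxOff
\end{document}
```

With this setup the loop should run five times rather than three, once per letter and once per former space, so the code in the first branch can react to each space individually.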

Strategy 2: split the token list into a sequence that you can subsequently use.

\seq_new:N \l_mhchem_split_input_seq
\seq_set_split:Nnn \l_mhchem_split_input_seq { ~ } { a ~ b ~ c } % this would come from a macro argument
\seq_use:Nn \l_mhchem_split_input_seq { do something in place of spaces }

With \seq_set_map:NNn you can build a second sequence by wrapping each value stored in \l_mhchem_split_input_seq in additional material, and then use this new sequence instead.
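
A minimal sketch of that last step might look as follows. The name \l_mhchem_wrapped_seq is hypothetical, and the parentheses stand in for whatever should surround each item. Depending on the expl3 release, \seq_set_map:NNn may or may not expand the mapped result, so the example sticks to material that is unaffected by expansion.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
\seq_new:N \l_mhchem_split_input_seq
\seq_new:N \l_mhchem_wrapped_seq % hypothetical name for the result

% Split the input at spaces, giving the items a, b and c.
\seq_set_split:Nnn \l_mhchem_split_input_seq { ~ } { a ~ b ~ c }

% Store a wrapped copy of each item in the second sequence.
\seq_set_map:NNn \l_mhchem_wrapped_seq \l_mhchem_split_input_seq { (#1) }

% Put the items back together; the separator is inserted only
% between items, so no trailing space has to be removed.
\seq_use:Nn \l_mhchem_wrapped_seq { ~ }
\ExplSyntaxOff
\end{document}
```

Because \seq_use:Nn places the separator only between items, the second argument is the natural place to decide what each original space should turn into.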

score 1 · Answer 2 · answered Sep 14 '14 at 09:37

I prepared the \seq approach mentioned in the comments for you. I append a space to every item of the sequence and remove the resulting trailing space using \tex_unskip:D.

\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
\cs_generate_variant:Nn \seq_set_split:Nnn { NnV }

\tl_new:N \l_token_tl
\tl_set:Nn \l_token_tl { a~b~c }

\seq_new:N \l_token_seq
\seq_set_split:NnV \l_token_seq { ~ } \l_token_tl
\seq_map_inline:Nn \l_token_seq { (#1)~ }
\tex_unskip:D % remove trailing space

\ExplSyntaxOff
\end{document}
