Apply an operation on every word/character

Question

There are some related questions:

How to repeat over all characters in a string?

But every post I found asked for something specific, and it was hard to tell which part is actually needed to apply a given operation to all characters or words. Suppose we have \newcommand\commandA[1]{do something}, which we want to apply to every character, and \newcommand\commandB[1]{do something}, which we want to apply to every word. How would we go about it?

Steven B. Segletes · Answer 1 · 2022-03-22T16:03:23.720

The tokcycle package (https://ctan.org/pkg/tokcycle) is set up to cycle through an input stream of tokens and handle each one successively. It has macro and environment forms, and discerns whether any given token is categorized as a "character", a "group", a "macro" (control sequence), or a "space" (implicit or explicit). Each of these four categories can receive its own directive on how tokens of that type should be handled.

The normal approach is for the processed tokens to be collected into a token list \cytoks, which is stored and can be regurgitated later. This type of approach will allow for macros to be collected and their arguments processed, before they are executed. If the macros in the input stream are not actually executed (let's say they are merely detokenized), then this phase of collecting tokens into \cytoks can be bypassed and tokens can be output on the fly.
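As a rough illustration of the on-the-fly alternative, the sketch below passes the four directives straight to \tokcycle and lets each one typeset its token immediately instead of calling \addcytoks. It assumes that no macro in the input needs to be executed (macros are only printed via \string); the colors and parentheses are arbitrary choices for illustration.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle,xcolor}
\begin{document}
% Each directive typesets directly; nothing is collected into \cytoks,
% so no \the\cytoks is needed afterwards.
\tokcycle
  {\textcolor{red}{(#1)}}% character directive
  {\processtoks{#1}}% group directive: recurse into the group
  {\textcolor{cyan}{\string#1}}% macro directive: print the name only
  {\textcolor{green}{\textvisiblespace}}% space directive
  {This {is \today's} test.}
\end{document}
```

If any macro in the input had to act on processed arguments, collecting into \cytoks would again be the safer route.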

In the MWE below, characters are made red and placed in parens, group content is italicized (while the tokens of the group are processed independently), macros are detokenized in cyan, and spaces are presented as green visible spaces. In this case, the use of \addcytoks to collect tokens into \cytoks is not strictly needed, because macros are not executed but detokenized (via \string). However, I collect them anyhow, because it is the way a token cycle would normally proceed.

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle,xcolor}
\Characterdirective{\addcytoks{\textcolor{red}{(#1)}}}
\Groupdirective{\addcytoks{\itshape}\processtoks{#1}}
\Macrodirective{\addcytoks{\textcolor{cyan}{\string#1}}}
\Spacedirective{\addcytoks{\textcolor{green}{\textvisiblespace}}}
\begin{document}
\tokcyclexpress{This {is \today's} test.}
\the\cytoks
\end{document}
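To tie this back to the question, a user-defined one-argument macro can be placed inside the character directive. A minimal sketch, assuming a hypothetical \charop stands in for the per-character command (here it merely boxes its argument); the other three directives keep their defaults:

```latex
\documentclass{article}
\usepackage{tokcycle}
% \charop is a hypothetical stand-in for the operation to apply per character
\newcommand\charop[1]{\fbox{#1}}
% Wrap every character token in \charop as it is collected into \cytoks
\Characterdirective{\addcytoks{\charop{#1}}}
\begin{document}
\tokcyclexpress{Some text}
\the\cytoks
\end{document}
```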

If one desires to process a word, rather than a character at a time, one must build the logic into the token cycle to collect the characters and to dump them when encountering a begin/end group, a macro, and/or a space.

Thus, the logic is a bit more detailed, but nonetheless straightforward.

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle,xcolor}
\def\theword{}
\newcommand\dumpword{\if\relax\theword\relax\else
  \addcytoks{\textcolor{red}{(}}%
  \addcytoks[1]{\theword}%
  \addcytoks{\textcolor{red}{)}}\fi
  \def\theword{}}
\stripgroupingtrue
\Characterdirective{\expandafter\def\expandafter\theword\expandafter
  {\theword#1}}
\Groupdirective{\dumpword\groupedcytoks{%
  \addcytoks{\itshape}\processtoks{#1}\dumpword}}
\Macrodirective{\dumpword\addcytoks{\textcolor{cyan}{\string#1}}}
\Spacedirective{\dumpword\addcytoks{\textcolor{green}{\textvisiblespace}}}
\newcommand\wordprocessor[1]{%
  \tokcyclexpress{#1}%
  \dumpword
  \the\cytoks
}
\begin{document}
\wordprocessor{This {is \today's} test.}
\end{document}

With more recent versions of tokcycle, the ability to look ahead one token into the input stream is provided (in the following MWE, via \tcpeek). Making use of this, the logic of word collection can be simplified, by merely determining if the next token of the input stream is another character or not.

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle,xcolor}
\def\theword{}
\newcommand\dumpword{%
  \addcytoks{\textcolor{red}{(}}%
  \addcytoks[1]{\theword}%
  \addcytoks{\textcolor{red}{)}}%
  \def\theword{}}
\newcommand\addtotheword[1]{%
  \expandafter\def\expandafter\theword\expandafter{\theword#1}%
  \tcpeek\z
  \ifcat A\z\else\ifcat0\z\else\dumpword\fi\fi
}
\Characterdirective{\addtotheword{#1}}
\Groupdirective{\addcytoks{\itshape}\processtoks{#1}}
\Macrodirective{\addcytoks{\textcolor{cyan}{\string#1}}}
\Spacedirective{\addcytoks{\textcolor{green}{\textvisiblespace}}}
\newcommand\wordprocessor[1]{%
  \tokcyclexpress{#1}%
  \the\cytoks
}
\begin{document}
\wordprocessor{This {is \today's} test.}
\end{document}

This code yields the same result as above for the given input stream.

Such a nice and wonderful reply, it helps me a lot... – MadyYuvi Mar 22 '22 at 15:35

egreg · Accepted Answer · 2022-03-22T17:19:24.120

Use expl3: both commands have a second argument that's a template for what to do with the characters or the words.

In the last case, we need ##1 for the nested call.

\documentclass{article}
\ExplSyntaxOn
\NewDocumentCommand{\applytoeverychar}{mm}
 {
  \group_begin:
  \tl_set:Nn \l_tmpa_tl { #1 }
  \tl_replace_all:Nnn \l_tmpa_tl { ~ } { \c_space_tl }
  \cs_set_protected:Nn \__masum_apply:n { #2 }
  \tl_map_function:NN \l_tmpa_tl \__masum_apply:n
  \group_end:
 }
\NewDocumentCommand{\applytoeveryword}{mm}
 {
  \group_begin:
  \seq_set_split:Nnn \l_tmpa_seq { ~ } { #1 }
  \cs_set_protected:Nn \__masum_apply:n { #2 }
  \seq_set_map:NNn \l_tmpb_seq \l_tmpa_seq { \__masum_apply:n { ##1 } }
  \seq_use:Nn \l_tmpb_seq { ~ }
  \group_end:
 }
\cs_new:Nn \__masum_apply:n {} % initialize
\ExplSyntaxOff
\begin{document}
\applytoeverychar{do something}{\textlangle #1\textrangle}
\applytoeveryword{do something}{\textlangle #1\textrangle}
\applytoeveryword{do something}{\textbar\applytoeverychar{#1}{\textlangle##1\textrangle}\textbar}
\end{document}
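The doubled ##1 in the nested call follows TeX's general rule for parameters inside a macro body: #1 refers to the argument of the enclosing macro, while ##1 is reduced to #1 for the macro defined or called inside it. A minimal standalone sketch of that rule, using the hypothetical names \outer and \inner:

```latex
\documentclass{article}
% \outer and \inner are hypothetical names used only for illustration
\newcommand\outer[1]{%
  % #1 is \outer's argument; ##1 becomes \inner's own parameter #1
  \def\inner##1{[#1:##1]}%
  \inner{x}\inner{y}}
\begin{document}
\outer{a}% typesets [a:x][a:y]
\end{document}
```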

edited Mar 22 '22 at 17:19

answered Mar 22 '22 at 16:39

A small question: if I already have a document, is there a way to do this without adding the command \applytoeverychar before every line? – Masum Mar 22 '22 at 18:06
@Masum This looks very much like an XY-question. What's your aim? – egreg Mar 22 '22 at 18:08
I would like to apply a command on every character in the document that I have written already. The command would randomly add some modifications to the font. Does that make sense? Sorry if it's unclear – Masum Mar 22 '22 at 18:17
@Masum No, this is not really possible, unless you use LuaTeX (and I pass on this). Please, ask a new question with the details. – egreg Mar 22 '22 at 18:31
