
What I want to do is take some input string, process each character or "token" individually, and output something based on what said character/token is. Is there a way to do this? Perhaps it could look something like this:

\for\char\in#1
\ifx\char\textbackslash
...
\fi
...
\fi
Someone
  • You might look at the xstring package, or https://tex.stackexchange.com/questions/233085/basics-of-parsing?r=SearchResults&s=1|22.2927 – John Kormylo Apr 14 '20 at 16:18
  • TeX doesn't have strings, just tokens, and it depends a bit on what you mean by "character": for example, £ is two tokens (with hex codes C2 A3) – David Carlisle Apr 14 '20 at 16:18
  • @DavidCarlisle Off topic: the fact that £ is two tokens is only a development dead end and a very bad concept. For example, in pdfcsplain (using pdfTeX) £ is a single token. And in XeTeX and LuaTeX £ is a single token too, of course. On topic: TeX must know which sequence of tokens is to be processed. How should this be specified? "Someone" should give more information about what the intention is. – wipet Apr 14 '20 at 16:31
  • @wipet I think there is a fairly high probability that it's two tokens in the system the OP is using – David Carlisle Apr 14 '20 at 16:34
  • Note that you cannot test "normal" TeX code with \ifx\nextchar\textbackslash, because \textbackslash almost never occurs as a token in TeX code. The tokenizer treats the backslash in a very special way (with its default settings) and almost never generates a single backslash token. – wipet Apr 14 '20 at 16:42
  • It would be better if you showed a more sensible example and described in more detail the strings you expect to loop over. – egreg Apr 14 '20 at 17:00
  • I don't need that many special characters, just "normal" code. – Someone Apr 15 '20 at 13:11
  • Right now, I'm using XeTeX – Someone Apr 15 '20 at 13:12

2 Answers

Note that TeX doesn't have strings, and character tokens do not necessarily correspond to what you might call a character: for example, £ is two tokens in pdfTeX with UTF-8 input. However, LaTeX has a built-in loop over tokens, \@tfor:

\documentclass{article}

\begin{document}

\makeatletter

\def\zzz{b}

\@tfor\tmp:=abcdef\do{
[ \tmp\ is
\ifx\tmp\zzz
 b
\else
 not b
\fi
]\par}


\end{document}

This produces one paragraph per token: [ a is not b ], then [ b is b ], then [ c is not b ], and so on through f.
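Since the question asks about looping over a macro argument (#1), here is a minimal sketch of the same \@tfor loop wrapped in a command. The names \markbees, \tmp and \zzz are placeholders chosen for illustration, not kernel macros. Be aware that \@tfor grabs each item as an undelimited argument, so spaces in the input are skipped and a braced group is handed over as a single item.

```latex
\documentclass{article}
\begin{document}
\makeatletter
% \markbees is a hypothetical helper: it prints its argument token by
% token and sets every "b" in bold.
\newcommand\markbees[1]{%
  \def\zzz{b}% reference macro that \tmp is compared against
  \@tfor\tmp:=#1\do{%
    \ifx\tmp\zzz
      \textbf{\tmp}% the current token is b
    \else
      \tmp
    \fi
  }%
}
\makeatother

\markbees{abcabc}
\end{document}
```

The comparison works because \@tfor defines \tmp as a macro whose replacement text is the current item, so \ifx\tmp\zzz compares two macro definitions rather than two characters.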

David Carlisle

The tokcycle package is designed to cycle through input tokens, and take actions based on whether the token is a "character", a group, a macro/command sequence, or a space.

The four directives (for characters, groups, macros, and spaces, in that order) allow one to apply conditional tests to the token to achieve the desired output. Here I place parentheses around every character token, except for e, which I make bold. If a macro is \today, it is set in italic; if it is \textbackslash, it is placed in an \fbox; otherwise it is merely echoed to the output. Spaces are converted to \textvisiblespace, while still allowing line breaks.

Notably, the token cycle can work its way into group content, unless one deliberately prevents that. It is shown below in its pseudo-environment form, but macro forms exist as well.

\documentclass{article}
\usepackage{tokcycle}
\begin{document}
\tokencycle
{\ifx e#1\addcytoks{\textbf{#1}}\else\addcytoks{(#1)}\fi}%
{\processtoks{#1}}%
{\ifx\today#1\addcytoks{\textit{#1}}\else
 \ifx\textbackslash#1\addcytoks{\fbox{#1}}\else\addcytoks{#1}\fi\fi}%
{\addcytoks{\textvisiblespace\allowbreak}}%
These are \underline{difficult times}, \today{} of all days!

Note that I seek out instances of \textbackslash today in order to make
  it italic.  Paragraphs are not a problem.
\endtokencycle
\end{document}

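For comparison, here is a minimal sketch of one of the macro forms mentioned above. It assumes, as I understand the package documentation, that \tokcycle takes the same four directives followed by the input as a fifth argument, and that the result is collected in the token register \cytoks rather than typeset automatically. The sample input is made up for illustration.

```latex
\documentclass{article}
\usepackage{tokcycle}
\begin{document}
\tokcycle
  {\addcytoks{(#1)}}% characters: wrap each one in parentheses
  {\processtoks{#1}}% groups: recurse into the group content
  {\addcytoks{#1}}% macros: echo unchanged
  {\addcytoks{ }}% spaces: echo as an ordinary space
  {This \textit{is a} test.}% the input token stream
\the\cytoks % typeset the collected output
\end{document}
```

The macro form is convenient when the input is already available as an argument, while the pseudo-environment form reads more naturally for longer passages of text.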

  • Apparently \today takes an argument. What can you put in it? – Someone Apr 15 '20 at 13:59
  • @Someone no standard definition of \today takes an argument. – David Carlisle Apr 15 '20 at 14:01
  • @Someone \today does not take an argument. However, the syntax \today{} is used so that the spaces following the macro name are not auto-absorbed. Thus \today x has no space before the x, whereas \today{} x has a space before the x. – Steven B. Segletes Apr 15 '20 at 14:02
  • Couldn't you just use ~? – Someone Sep 28 '20 at 23:05
  • @Someone Yes, one could use it. This was just for demonstration purposes. – Steven B. Segletes Sep 28 '20 at 23:34
  • How about processing only THE last character? – Someone Oct 07 '20 at 17:50
  • @Someone Do you mean last char in input stream or last character in each word? In either case, one has to save the character under consideration until the next token is examined, and only then decide if and in what manner to add the prior token to the output stream. – Steven B. Segletes Oct 07 '20 at 20:37
  • the last character of the entire string. – Someone Oct 08 '20 at 17:06
  • \tokcycleenvironment\lastchar {\gdef\recentchar{##1}} {\processtoks{##1}\gdef\recentchar{\egroup}} {\gdef\recentchar{##1}} {\gdef\recentchar{##1}} {\gdef\recentchar{##1}} \lastchar {xyz}\today\endlastchar [\detokenize\expandafter{\recentchar}]. This example gives \today as last token of the input {xyz}\today. The only issue here is that this simple approach gives \egroup if the last character is }. @Someone – Steven B. Segletes Oct 08 '20 at 18:14
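If only the last item of the input is needed, the "remember the most recent token" idea from the comments can also be sketched with the kernel's \@tfor loop instead of tokcycle. The names \lasttoken and \recenttok are hypothetical, and the sketch inherits the limitations of \@tfor: spaces are skipped, and a trailing braced group is returned whole.

```latex
\documentclass{article}
\begin{document}
\makeatletter
% \lasttoken is a hypothetical helper that typesets only the final
% item of its argument.
\newcommand\lasttoken[1]{%
  \def\recenttok{}% start empty, in case the argument is empty
  \@tfor\tmp:=#1\do{\let\recenttok\tmp}% overwrite on every iteration
  \recenttok % after the loop this holds the last item
}
\makeatother

Last item of \texttt{xyz}: \lasttoken{xyz}
\end{document}
```

Each iteration simply overwrites \recenttok, so whatever survives the loop is the final item; no lookahead is required because nothing is output until the loop has finished.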