9

I have a situation where I need the first and second character of a macro argument extracted separately. The input is always two characters long and is always non-special text. I am aware there are solutions using Lua, LaTeX3, or other tools, but if at all possible I would appreciate an answer in plain LaTeX2e with no or minimal use of packages.

junius
  • 2,058

4 Answers

11

The LaTeX kernel already has the needed commands.

\documentclass{article}

\makeatletter
\newcommand{\firstof}[1]{\@car#1\@nil}
\newcommand{\secondof}[1]{\expandafter\@car\@cdr#1\@nil\@nil}
\newcommand{\restof}[1]{\expandafter\@cdr\@cdr#1\@nil\@nil}
\makeatother

\begin{document}

\firstof{ab}

\secondof{ab}

\firstof{abcde}

\secondof{abcde}

X\restof{ab}X

X\restof{abcde}X

\end{document}

The macros are defined in a very simple way:

% latex.ltx, line 846:
\def\@car#1#2\@nil{#1}
\def\@cdr#1#2\@nil{#2}
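
To see how these two kernel macros combine in \secondof, it can help to trace the expansion by hand. The following is a minimal sketch that repeats the \secondof definition from the first code box and annotates each step in comments; the trace assumes the input contains at least two tokens.

```latex
\documentclass{article}

\makeatletter
% Same definition as in the first code box
\newcommand{\secondof}[1]{\expandafter\@car\@cdr#1\@nil\@nil}
\makeatother

\begin{document}

% Expansion of \secondof{abcde}, step by step:
%   \secondof{abcde}
%   -> \expandafter\@car\@cdr abcde\@nil\@nil
%      (\expandafter makes \@cdr act before \@car)
%   -> \@car bcde\@nil   (\@cdr dropped "a" and consumed the first \@nil)
%   -> b                 (\@car kept "b", dropped "cde" and the second \@nil)
\secondof{abcde}

\end{document}
```

Two \@nil markers are needed because \@cdr and \@car each consume one as their argument delimiter.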


However, this would produce weird errors if the input is fewer than two tokens long. Here's a safer way with expl3.

\documentclass{article}
\usepackage{expl3} % not needed with LaTeX April 2020 or later

\ExplSyntaxOn
\cs_new:Npn \firstof #1
 {
  \tl_range:nnn { #1 } { 1 } { 1 } % tokens from 1 to 1
 }
\cs_new:Npn \secondof #1
 {
  \tl_range:nnn { #1 } { 2 } { 2 } % tokens from 2 to 2
 }
\cs_new:Npn \restof #1
 {
  \tl_range:nnn { #1 } { 3 } { -1 } % tokens from 3 to the end
 }
\ExplSyntaxOff

\begin{document}

X\firstof{a}X

X\secondof{a}X

X\restof{a}X

X\firstof{ab}X

X\secondof{ab}X

X\restof{ab}X

X\firstof{abcde}X

X\secondof{abcde}X

X\restof{abcde}X

\end{document}


Here is a generalization in which you specify the item to choose or, optionally, the start and end points. As in the code above, negative numbers mean “count from the end”. The macro is fully expandable, so it can go inside \edef.

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
\NewExpandableDocumentCommand{\extract}{O{#2}mm}
 {
  \tl_range:nnn { #3 } { #1 } { #2 }
 }
\ExplSyntaxOff

\begin{document}

X\extract{1}{a}X

X\extract{2}{a}X

X\extract[3]{-1}{a}X

X\extract{1}{ab}X

X\extract{2}{ab}X

X\extract[3]{-1}{ab}X

X\extract{1}{abcde}X

X\extract{2}{abcde}X

X\extract[3]{-1}{abcde}X

X\extract[2]{4}{abcde}X

X\extract{-1}{abcde}X % the last item

\end{document}


egreg
  • 1,121,712
  • Nice and simple answer! Though I find it necessary to wrap your definition within \makeatletter and \makeatother. – Yan King Yin Jul 16 '21 at 13:02
  • 1
    @YanKingYin The definitions of \@car and \@cdr are taken from latex.ltx, there's no need to add them in a document: they're available out of the box. In order to use them in a document one needs \makeatletter and \makeatother, as is done in the first code box. – egreg Jul 16 '21 at 13:07
  • @egreg: great approach, as usual. I kindly ask how to expand the following: \def\mytext{abcdef} X\extract[2]{4}{\mytext}X? – M. Al Jumaily Feb 11 '23 at 00:32
  • 1
    @M.AlJumaily Add \exp_args:Ne in front of \tl_range:nnn – egreg Feb 11 '23 at 09:25
  • @egreg, thank you for this! – M. Al Jumaily Feb 11 '23 at 10:10
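
The suggestion from the comments above could be applied as in the following sketch. It reuses the \extract definition from the answer with \exp_args:Ne placed in front of \tl_range:nnn, so that the text argument is fully expanded before the range is taken; \mytext is the example macro from the comment.

```latex
\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
\NewExpandableDocumentCommand{\extract}{O{#2}mm}
 {
  % \exp_args:Ne expands the first argument of \tl_range:nnn (here #3)
  \exp_args:Ne \tl_range:nnn { #3 } { #1 } { #2 }
 }
\ExplSyntaxOff

\begin{document}

\def\mytext{abcdef}

X\extract[2]{4}{\mytext}X

\end{document}
```

With \mytext defined as abcdef, this should select items 2 to 4, that is, bcd. Note that the argument is now always fully expanded, which may not be wanted if it contains fragile or unexpandable material.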
10

You can exploit the fact that macro arguments can be delimited by things other than brace groups. A special marker stops the scanning, so you can pick up the parts of the argument you want and discard the rest. To avoid errors when the input is too short, we add some dummy content at the end. (Here I used {}{}{} to add some empty material, but you could also use an undefined macro, as in \zzzextractor#1\invalid\invalid\invalid\stophere, which would then throw an error if the given text is not long enough.)

\documentclass{article}

\newcommand{\zzz}[1]{\zzzextractor#1{}{}{}\stophere}
\newcommand{\zzzextractor}{} % just to make sure we don't overwrite an existing macro
% the real definition comes next
\def\zzzextractor#1#2#3\stophere{%
  first: #1, second: #2}

\begin{document}
\zzz{lo}

\zzz{ipsum}

\zzz{}
\end{document}
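
The stricter variant mentioned in the parenthetical remark could be sketched as follows. The names \zzzstrict and \zzzstrictextractor are hypothetical and chosen only for this illustration; the sketch also assumes that \invalid is not defined anywhere in the document.

```latex
\documentclass{article}

% Pad with an undefined macro instead of empty groups
\newcommand{\zzzstrict}[1]{\zzzstrictextractor#1\invalid\invalid\invalid\stophere}
\newcommand{\zzzstrictextractor}{} % guard against overwriting an existing macro
\def\zzzstrictextractor#1#2#3\stophere{%
  first: #1, second: #2}

\begin{document}
\zzzstrict{lo}

% With too short an input, #2 becomes \invalid, so typesetting it
% raises "Undefined control sequence":
% \zzzstrict{l}
\end{document}
```

The padding tokens are only typeset when they end up in #1 or #2, so input of two or more tokens is unaffected.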


There are quite a few limitations to this approach.

  • With pdfLaTeX this does not work for non-ASCII characters.
  • If you feed macros to \zzz you may get unexpected results.

If your input is guaranteed to consist of only two (ASCII) characters, the following will also work

\documentclass{article}

\makeatletter
\newcommand*{\zzz}[1]{%
  first: \@firstoftwo#1, second: \@secondoftwo#1}
\makeatother

\begin{document}
\zzz{lo}

\zzz{al}
\end{document}

because \@firstoftwo is defined as taking two arguments and expanding to the first, and \@secondoftwo as taking two arguments and expanding to the second. If you don't brace the #1 passed to the two macros, they will just grab the first two tokens as their arguments. If the argument #1 contains exactly two tokens, nothing is left over.
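
The token grabbing described above can be followed in a short sketch. It repeats the \zzz definition from the previous code box and adds a three-token input to illustrate why the input has to be exactly two tokens long; the kernel definitions are quoted in comments for reference.

```latex
\documentclass{article}

\makeatletter
% Kernel definitions (latex.ltx), shown for reference only:
%   \long\def\@firstoftwo#1#2{#1}
%   \long\def\@secondoftwo#1#2{#2}
\newcommand*{\zzz}[1]{%
  first: \@firstoftwo#1, second: \@secondoftwo#1}
\makeatother

\begin{document}
% \@firstoftwo lo -> l        \@secondoftwo lo -> o
\zzz{lo}

% \@firstoftwo abc -> a, but the unused "c" stays in the input,
% so this prints "first: ac, second: bc"
\zzz{abc}
\end{document}
```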

moewe
  • 175,683
  • How would you use \@firstoftwo with a macro argument that is itself a command? e.g. \zzz{\mycommand{input}} – junius Jun 13 '20 at 10:29
  • 1
    @junius You can't directly. You'd first have to expand the argument. But then it is crucial to know how \mycommand works and how it is implemented. Maybe \expandafter\zzz\expandafter{\mycommand{input}} works (if \mycommand expands to the two-letter output in one step), but maybe you need many more \expandafters or possibly \edef. – moewe Jun 13 '20 at 10:51
7

I think something of that sort exists in hundreds of variations.

\documentclass{article}
\newcommand\mytest[1]{%
\def\pft##1##2;{\def\pftfirstchar{##1}\def\pftsecondchar{##2}}%
\expandafter\pft#1;%
The first character is \textit{\pftfirstchar} and the second character
\textit{\pftsecondchar}.\par}
\begin{document}
\mytest{si} \mytest{no} \mytest{ja} \mytest{na}
\end{document}


And of course there exist more failsafe variants and packages and so on and so forth.

1

I have a situation where I need the first and second character of a macro argument extracted separately. The input is always (at least) two characters long and is always non-special text.

For the sake of completeness, here's a LuaLaTeX-based solution. It provides LaTeX macros called \firstchar, \secondchar, and \finalchar. Their argument may be any utf8-encoded text string -- or even one or more LaTeX macros, which will be expanded before returning the first, second, or final character of the resulting text string.


\documentclass{article}
\newcommand\firstchar[1]{\directlua{ tex.sprint( unicode.utf8.sub ("#1",1,1))}}
\newcommand\secondchar[1]{\directlua{tex.sprint( unicode.utf8.sub ("#1",2,2))}}
\newcommand\finalchar[1]{\directlua{ tex.sprint( unicode.utf8.sub ("#1",-1))}}

\begin{document}
\obeylines % just to keep this MWE's code compact
Consider the instruction \verb+\foo{Note}+.
The argument's first character is ``\firstchar{Note}''.
The argument's second character is ``\secondchar{Note}''.
The argument's final character is ``\finalchar{Note}''.
\medskip
Consider the instruction \verb+\foo{öÄüß}+.
The argument's first character is ``\firstchar{öÄüß}''.
The argument's second character is ``\secondchar{öÄüß}''.
The argument's final character is ``\finalchar{öÄüß}''.
\medskip
\def\a{B} \def\b{az}
Consider the instruction \verb+\bar{\a\b}+, where \verb+\def\a{B}+ and \verb+\def\b{az}+.
The expanded argument's first character is ``\firstchar{\a\b}''.
The expanded argument's second character is ``\secondchar{\a\b}''.
The expanded argument's final character is ``\finalchar{\a\b}''.
\end{document}

Mico
  • 506,678