
I want to use \futurelet to test the token that follows my macro. To this end I have the following setup:

\makeatletter
\def\@ExpandSub{%
    \typeout{Given Input: \meaning\@input}%
    \typeout{Followup command: \meaning\@following}%
\ifx\@following\UTFviii@three@octets%  <--- doesn't work!
    \typeout{>>> That's unicode!}%
\else%
    \typeout{>>> That's NOT unicode...}%
\fi%

}

\protected\def\sub#1{%
    \let\@input#1%
    \futurelet\@following\@ExpandSub%
}
\makeatother

  1. How can I test whether the next token is a Unicode character? \ifx\@following\UTFviii@three@octets doesn't work, I think because \@following holds a protected macro whose replacement text starts with \UTFviii@three@octets, rather than \UTFviii@three@octets itself. The log says: Followup command: \protected macro:->\UTFviii@three@octets �. What I want instead is the replacement macro as it was specified in \newunicodechar.
  2. If this is the case, how can I get the substitute macro, as defined by \newunicodechar?
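To check that guess, here is a minimal sketch that prints the two meanings \ifx compares. It assumes pdfLaTeX with 8-bit UTF-8 input; \showboth and \@showboth are hypothetical helper names, and the \newunicodechar definition is a placeholder that only keeps the character from raising an error.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{newunicodechar}
\newunicodechar{ⱼ}{j}% placeholder definition, any replacement will do here

\makeatletter
% \futurelet stores the meaning of the token after \showboth in \@following
\def\showboth{\futurelet\@following\@showboth}
\def\@showboth{%
  % meaning of the first byte of the character (an active character)
  \typeout{Next token: \meaning\@following}%
  % meaning of the macro that the \ifx test compares it with
  \typeout{Compared with: \meaning\UTFviii@three@octets}%
}
\makeatother

\begin{document}
\showboth ⱼ
\end{document}
```

If the two lines in the log differ, \ifx is false, because \ifx compares the meanings of the two tokens and never expands them.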

This is an attempt to implement "Automatically combine unicode double subscripts aᵢⱼ = a_{i}_{j} as a_{ij}".

Full Demo
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{etoolbox}
\usepackage{newunicodechar}

\makeatletter
\newtoggle{insub}
\def\@ExpandSub{%
    \typeout{Given Input: \meaning\@input}%
    \typeout{Followup command: \meaning\@following}%

% if \@following is \UTFviii@three@octets, get the replacement text
\ifx\@following\UTFviii@three@octets%  <--- doesn't work!
    \typeout{>>> That's unicode!}%
\else%
    \typeout{>>> That's NOT unicode...}%
    \let\@actual\@following
\fi%

\iftoggle{insub}{
    \@input%
}{
    \toggletrue{insub}%
    \sb\bgroup\@input%
}%
\ifx\@actual\sub%
    \typeout{keep going!}%
\else%
    \typeout{Stop!}%
    \egroup\togglefalse{insub}%
\fi%

}

% \DeclareRobustCommand*{\sub}[1]{\futurelet\@following\@ExpandSub{#1}}
\protected\def\sub#1{%
    \let\@input#1%
    \futurelet\@following\@ExpandSub%
}
\makeatother

\AtBeginDocument{
    \newunicodechar{ᵢ}{\sub{i}}
    \newunicodechar{ⱼ}{\sub{j}}
    \newunicodechar{ₖ}{\sub{k}}
    \newunicodechar{ₗ}{\sub{l}}
    \newunicodechar{ₘ}{\sub{m}}
    \newunicodechar{ₙ}{\sub{n}}
}

\begin{document}

\begin{tabular}{l}
    $a_{ijklmn}$ \\
    %$a\chain{i}\chain{j}\chain{k}\chain{l}\chain{m}\chain{n}$ \\
    $a\sub{i}\sub{j}\sub{k}\sub{l}\sub{m}\sub{n}$ \\
    % $a\sub[i]\sub[j]\sub[k]\sub[l]\sub[m]\sub[n]$ \\
    $aᵢⱼₖₗₘₙ$
\end{tabular}
\end{document}


  • Wouldn't it be easier to test if the next byte is one of (the leading byte of) ᵢⱼₖₗₘₙ? It is very hard to expand \@following while it could be a unicode byte as well as primitive commands such as \mathrm or \kern. – Symbol 1 Feb 14 '24 at 20:35
  • Quick note: I implement this in one of my libraries https://github.com/user202729/TeXlib/blob/main/unicode-math-input.sty#L505-L541 . Basically, first test whether the following token is N-type. If it is, grab the next byte, set escapechar to something nonempty, and test its stringification to see whether it consists of one character. If yes, branch on the character value to determine how many octets to grab. Of course there are other ways as well. To get the inner macro: it's stored in e.g. \csname u8:α\endcsname for the character α. – user202729 Feb 15 '24 at 01:32
  • apart from the futurelet question why \let\@input#1 ??? This undefines \input not even within a group, did you intend that? – David Carlisle Feb 15 '24 at 14:38
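
The two suggestions above (test for the leading byte, then look up \csname u8:...\endcsname) can be combined into a rough sketch. It assumes pdfLaTeX, where each byte of a multi-byte UTF-8 character is an active character; it is not the code from the linked library. \peek, \@peek, \@grabthree, \@leadEtwo and \@actual are hypothetical names, and \sub is reduced to a stand-in that only typesets a subscript. Only the leading byte "E2 is handled; ᵢ starts with "E1 and would need a second helper of the same kind.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{newunicodechar}

\makeatletter
% Stand-in for the real \sub, just enough to typeset the demo.
\protected\def\sub#1{\sb{#1}}

% Give \@leadEtwo the meaning of the active byte "E2. \lowercase turns
% the active ~ into the active character with code "E2.
\begingroup
  \lccode`\~="E2
  \lowercase{\endgroup\let\@leadEtwo=~}

\def\peek{\futurelet\@following\@peek}
\def\@peek{%
  \ifx\@following\@leadEtwo
    \expandafter\@grabthree % absorb the three octets of the character
  \else
    \typeout{>>> Not a character starting with byte "E2}%
  \fi
}
% #1#2#3 are the three active bytes. The definition of the character
% is stored under the name u8:<the three bytes>.
\def\@grabthree#1#2#3{%
  \expandafter\let\expandafter\@actual
    \csname u8:\string#1\string#2\string#3\endcsname
  \typeout{>>> Replacement: \meaning\@actual}%
  #1#2#3% put the character back so it is typeset as usual
}
\makeatother

\newunicodechar{ⱼ}{\sub{j}}

\begin{document}
$a\peek ⱼ$ and $a\peek x$
\end{document}
```

For ⱼ the log should show the replacement text given to \newunicodechar, and \@actual holds that macro for further tests. Comparing \@actual with \sub via \ifx would still fail, since \@actual is a macro that expands to \sub{j}, not \sub itself; its first token would have to be inspected after one expansion step.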
