I understand that string processing can be done either in LaTeX2e using xstring or in expl3 using xparse (my starting point was the 2013 thread "Tokenizing and parsing"). I will be running a relatively simple string-parsing macro hundreds of times per document, probably over many compilations and many documents, and would like to know which method is more efficient in time and/or computation, discounting the fixed time to load the packages.

A rough estimate, even one based on a brief look at xstring's code, would be most welcome from the LaTeX3 team. I don't need anything close to exact results: this is for my own curiosity and to make sure I don't end up trying to scratch through a brick wall somewhere.

I'd also be interested in opinions on whether this application makes a useful comparison of 2e and 3e algorithms.

This is the macro I'm running with xstring. It gives a very light highlight to the first letter (the first two letters in this demo) of every mention of a keyword or key phrase, since highlighting them in full would simply clutter the page:

\documentclass{article}
\usepackage{xstring}

%%% capitalize first n letters of each word %%%
% apply operator #2 to each word of string #3, with word separator #1 (default: space)
\newcommand{\Splitstrop}[3][ ]{%
  \providecommand\csA{}%
  \providecommand\csB{}%
  \StrCut{#3}{#1}\csA\csB%
  #2{\csA}%
  \ifx\csB\empty\else{#1}\Splitstrop[#1]{#2}{\csB}\fi\relax} 

% apply operation #2 to the first #1 letters (default: 1) of string #3
\newcommand{\Leftstrop}[3][1]{#2{\StrLeft{#3}{#1}}\StrGobbleLeft{#3}{#1}\relax}

\newcommand{\Kw}[1]{\textsc{#1}} %first mention of keyword/phrase
\newcommand{\Kwd}[1]{\Leftstrop[2]{\Kw}{#1}} %single-word 2nd mention: apply \Kw to first two letters of the word (I plan to sc just the first letter, but this is more illustrative)
\newcommand{\Kwds}[1]{\Splitstrop[ ]{\Kwd}{#1}} % keyphrase 2nd mention: apply Kwd to each word (space is word separator)


\begin{document}
Example: We define the \Kw{embiggen} \Kwd{operator} to raise the size of the \Kwd{argument} by one. 

This is a reminder not to \Kwds{beg the question} in your definitions (Sec. 2-1): don't say ``We define the \Kwd{embiggen} \Kwd{operator} as the \Kwd{operator} that \Kwd{embiggens}.''
\end{document}

[Image: xstring simple parse demo output]

1 Answer

I tried your example with l3benchmark:

\documentclass{article}
\usepackage{xstring,l3benchmark}

%%% capitalize first n letters of each word %%%
% apply operator #2 to each word of string #3, with word separator #1 (default: space)
\newcommand{\Splitstrop}[3][ ]{%
  \StrCut{#3}{#1}\csA\csB
  #2{\csA}%
  \ifx\csB\empty\else{#1}\Splitstrop[#1]{#2}{\csB}\fi\relax
}

% apply operation #2 to the first #1 letters (default: 1) of string #3
\newcommand{\Leftstrop}[3][1]{#2{\StrLeft{#3}{#1}}\StrGobbleLeft{#3}{#1}\relax}

\newcommand{\Kw}[1]{\textsc{#1}} %first mention of keyword/phrase
\newcommand{\Kwd}[1]{\Leftstrop[2]{\Kw}{#1}} %single-word 2nd mention: apply \Kw to first two letters of the word (I plan to sc just the first letter, but this is more illustrative)
\newcommand{\Kwds}[1]{\Splitstrop[ ]{\Kwd}{#1}} % keyphrase 2nd mention: apply Kwd to each word (space is word separator)

\ExplSyntaxOn
\cs_set_eq:NN \benchmark \benchmark:n
\ExplSyntaxOff


\begin{document}

\benchmark{
Example: We define the \Kw{embiggen} \Kwd{operator} to raise the size of the 
\Kwd{argument} by one. 

This is a reminder not to \Kwds{beg the question} in your definitions (Sec. 2-1): 
don't say ``We define the \Kwds{embiggen operator} as the \Kwd{operator} 
that \Kwd{embiggens}.''
}

\end{document}

and with the expl3 version:

\documentclass{article}
\usepackage{xparse,l3benchmark}

\ExplSyntaxOn
\NewDocumentCommand{\Kw}{m}
 {
  \kompootor_textsc:n { #1 }
 }
\NewDocumentCommand{\Kwd}{m}
 {
  \kompootor_split:Nnn \kompootor_textsc:n { 2 } { #1 }
 }
\NewDocumentCommand{\Kwds}{m}
 {
  \kompootor_splitstrop:nNn { ~ } \kompootor_textsc:n { #1 }
 }

\seq_new:N \l__kompootor_splitstrop_in_seq
\seq_new:N \l__kompootor_splitstrop_out_seq

\cs_new_protected:Nn \kompootor_splitstrop:nNn
 {
  \seq_set_split:Nnn \l__kompootor_splitstrop_in_seq { #1 } { #3 }
  \seq_set_map:NNn \l__kompootor_splitstrop_out_seq \l__kompootor_splitstrop_in_seq
   { \kompootor_leftstrop:Nn #2 { ##1 } }
  \seq_use:Nn \l__kompootor_splitstrop_out_seq { #1 }
 }
\cs_new_protected:Nn \kompootor_leftstrop:Nn
 {
  \kompootor_split:Nnn #1 { 2 } { #2 }
 }
\cs_new:Nn \kompootor_split:Nnn
 {
  #1 { \tl_range:nnn { #3 } { 1 } { #2 } }
  \tl_range:nnn { #3 } { #2+1 } { -1 }
 }
\cs_new_protected:Nn \kompootor_textsc:n { \textsc { #1 } }

\NewDocumentCommand{\benchmark}{+m}{\benchmark:n{#1}}
\ExplSyntaxOff

\begin{document}

\benchmark{
Example: We define the \Kw{embiggen} \Kwd{operator} to raise the size of the 
\Kwd{argument} by one. 

This is a reminder not to \Kwds{beg the question} in your definitions (Sec. 2-1): 
don't say ``We define the \Kwds{embiggen operator} as the \Kwd{operator} 
that \Kwd{embiggens}.''
}

\end{document}

The result is

0.00338 seconds (1.1e4 ops)

for the xstring implementation and

0.00187 seconds (6.29e3 ops)

for the expl3 implementation.
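
Taking the ratios of the two reported measurements gives a rough speed-up factor. A small sketch using \fp_eval:n (the \ratio command name is made up for this illustration; the figures are the ones printed above and will differ between machines and runs):

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
% \ratio{a}{b}: typeset a/b rounded to two decimal places
\NewDocumentCommand{\ratio}{mm}
 {
  \fp_eval:n { round ( (#1) / (#2) , 2 ) }
 }
\ExplSyntaxOff

\begin{document}
% xstring figure divided by expl3 figure, from the benchmark output above
Time ratio: \ratio{0.00338}{0.00187}; operation ratio: \ratio{1.1e4}{6.29e3}.
\end{document}

This typesets 1.81 for the time ratio and 1.75 for the operation-count ratio, so in this single run the expl3 version came out a bit under twice as fast.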

The output is identical (removing \benchmark):

[Image: output of the xstring version]

[Image: output of the expl3 version]

egreg
  • Oops, sorry for not including the expl3 code myself - I confess I forgot I hadn't finished that code before posting. Thank you for doing the work yourself though. I'll play with l3benchmark and I think going forward I'll be looking to see if I run into any tricks where L2e stuff runs significantly faster still than expl3 -- and I'll be sure to post it. – Kompootor Aug 03 '19 at 23:05
  • Wait... your code doesn't need xparse at all, does it? – Kompootor Aug 04 '19 at 23:32
  • @Kompootor What do you mean? It certainly needs it. – egreg Aug 05 '19 at 07:39
  • For whatever reason I had originally assumed I would need splitlist, when all the needed string commands are in l3seq. For the basic commands being declared, L2e syntax could have been used instead of xparse, right? I mean, besides defeating the comparative purpose of this exercise, it's all compatible, right? – Kompootor Aug 06 '19 at 18:17
  • @Kompootor expl3 has many useful features and you don't need packages such as splitlist. You can use \newcommand, but I would not encourage it. – egreg Aug 06 '19 at 19:20
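
For reference, a minimal sketch of the \newcommand variant discussed in the comments above. It reuses \kompootor_split:Nnn from the answer and passes \textsc directly in place of the answer's \kompootor_textsc:n wrapper. Only \Kwd is shown, and loading expl3 on its own is an assumption that nothing else from xparse is needed:

\documentclass{article}
\usepackage{expl3} % xparse is not loaded: the user command below uses \newcommand

\ExplSyntaxOn
% apply #1 to the first #2 items of #3, then output the rest unchanged
\cs_new:Nn \kompootor_split:Nnn
 {
  #1 { \tl_range:nnn { #3 } { 1 } { #2 } }
  \tl_range:nnn { #3 } { #2+1 } { -1 }
 }
% LaTeX2e-style declaration in place of \NewDocumentCommand{\Kwd}{m}
\newcommand{\Kwd}[1]{ \kompootor_split:Nnn \textsc { 2 } { #1 } }
\ExplSyntaxOff

\begin{document}
The \Kwd{operator} acts on the \Kwd{argument}.
\end{document}

One difference to keep in mind: commands created with \NewDocumentCommand are robust, while those created with \newcommand are not, which can matter in moving arguments such as section titles.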