
I have a large index of files (approx. 600 pages), and each file entry requires significant computation so that additional links can be added. Computing these links takes quite a bit of time, and because each file name appears in multiple places in the index, the same links are regenerated many times during indexing.

My current approach caches the data associated with each link in a \def, but even so a run takes about 6 hours, and this will increase over time. My thought was to cache the output in a \savebox, but based on "Savebox equivalent of \csname?", this does not appear to be a good approach.

Main Question:

If I should not be using \savebox, how else can I achieve similar functionality, caching the output for at least a few thousand different cases? For instance, suppose that the macro \MyComplicatedCode in the MWE below were much more complicated. What can I do, other than adjusting \MyComplicatedCode itself, to reuse its output without recomputing it?

  • Even if I fully understood expansion, I think it would be a lot of work to attempt to expand the code so that I store only the final links to be added.

  • Typesetting the code into a .pdf won't work either, as that code generates links to other files.

Sub Questions:

  • What is the limit on the number of \savebox registers allowed?
  • Can this limit be increased by changing some parameter?

Notes:

  • I currently need about 1,500 \savebox registers, and that will probably grow to about 5,000 over time.

Code:

Since using a \savebox is not a good idea, this MWE is not that important. In the contrived example below I cache the file name extracted from a given hierarchical file name. The example works fine if I create a \def, but I cannot achieve the same functionality with a \savebox. This currently yields:

[Image: screenshot of the MWE output]

\documentclass{article}
\usepackage{pgffor}
\usepackage{xstring}

\newcommand*{\MyFileNames}{%
    ../../DirNameA/FileA1.tex,%
    ../../DirNameA/FileA2.tex,%
    ../../DirNameB/FileB1.tex,%
    ../../DirNameB/FileB2.tex%
}%

\newcommand*{\MyComplicatedCode}[4][1]{%
    \StrBehind[#1]{#2}{#3}[#4]%
}%

\begin{document}
%% Create \newcommands
\foreach \X in \MyFileNames {%
    \MyComplicatedCode[3]{\X}{/}{\ExtractedFileName}%
    \expandafter\xdef\csname MyCommand\X\endcsname{File Name = \ExtractedFileName}%
    %\par\string\MyCommand\X: \expandafter\csname MyCommand\X\endcsname
}%

%% Now use the \newcommands
Command ../../DirNameB/FileB1.tex:
\expandafter\csname MyCommand../../DirNameB/FileB1.tex\endcsname

\bigskip
\foreach \X in \MyFileNames {%
    \MyComplicatedCode[3]{\X}{/}{\ExtractedFileName}%
    \expandafter\newsavebox\csname MySaveBox\X\endcsname%
    \expandafter\savebox\csname MySaveBox\X\endcsname{Savebox Value = \ExtractedFileName}%
}%

%% Now use the \savebox
Savebox ../../DirNameB/FileB1.tex:
\expandafter\usebox\csname MySaveBox../../DirNameB/FileB1.tex\endcsname%

\end{document}

Peter Grill
  • Classic TeX has 256 box registers; e-TeX has 32768, so 5000 box registers isn't actually a problem. – David Carlisle Nov 04 '13 at 00:06
  • A quick idea that jumps to mind is somehow "outsourcing" at least parts of the long-running computation to Lua code, or to any script invoked with shell-escape on. TeX wasn't designed for this. – marczellm Nov 04 '13 at 00:15
  • @DavidCarlisle: Hmmm... It seems, then, that a solution to the MWE would be worthwhile. – Peter Grill Nov 04 '13 at 02:08

1 Answer

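You need to expand \X before passing it to the string-splitting function, and you need to make the box assignment global if you use this looping construct: \foreach runs each iteration inside a group, so a local box assignment is lost when the iteration ends.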

\documentclass{article}
\usepackage{pgffor}
\usepackage{xstring}

\newcommand*{\MyFileNames}{%
    ../../DirNameA/FileA1.tex,%
    ../../DirNameA/FileA2.tex,%
    ../../DirNameB/FileB1.tex,%
    ../../DirNameB/FileB2.tex%
}%

\newcommand*{\MyComplicatedCode}[4][1]{%
    \StrBehind[#1]{#2}{#3}[#4]%
}%


\begin{document}
%% Create \newcommands
\foreach \X in \MyFileNames {%
    \MyComplicatedCode[3]{\X}{/}{\ExtractedFileName}%
    \expandafter\xdef\csname MyCommand\X\endcsname{File Name = \ExtractedFileName}%
    %\par\string\MyCommand\X: \expandafter\csname MyCommand\X\endcsname
}%

%% Now use the \newcommands
Command ../../DirNameB/FileB1.tex:
\expandafter\csname MyCommand../../DirNameB/FileB1.tex\endcsname


\bigskip
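\foreach \X in \MyFileNames {%
    % Expand \X to the actual file name before \MyComplicatedCode sees it
    \edef\tmp{\noexpand\MyComplicatedCode[3]{\X}{/}{\noexpand\ExtractedFileName}}%
    \tmp
    % \newsavebox allocates the register globally
    \expandafter\newsavebox\csname MySaveBox\X\endcsname%
    % \global so that the box contents survive the group around each iteration
    \global\expandafter\setbox\csname MySaveBox\X\endcsname\hbox{Savebox Value = \ExtractedFileName}%
}%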

%% Now use the \savebox
Savebox %../../DirNameB/FileB1.tex:
\expandafter\usebox\csname MySaveBox../../DirNameB/FileB1.tex\endcsname%

\end{document}
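
The need for \global can be seen in isolation. The following is only a minimal sketch: the box names \LocalBox and \GlobalBox are made up for illustration, and an explicit \begingroup...\endgroup stands in for the group that \foreach places around each iteration.

```latex
\documentclass{article}

% Hypothetical box names, used only for this illustration
\newsavebox\LocalBox
\newsavebox\GlobalBox

\begin{document}
\begingroup
    % Local assignment: the contents are discarded at \endgroup
    \setbox\LocalBox\hbox{local}%
    % Global assignment: the contents survive the group
    \global\setbox\GlobalBox\hbox{global}%
\endgroup

Local: [\usebox\LocalBox]

Global: [\usebox\GlobalBox]
\end{document}
```

Only the globally set box should still hold its contents after the group closes, which is the same situation as a \savebox set inside the loop body.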

That said, you should probably just use macros rather than boxes for this. Alternatively, you could use a single box; it depends on how the entries are to be used. Using a separate register for each entry is slightly profligate, but it makes it easy to access individual entries in arbitrary order. If you are in fact going to process all the items in sequence, you could store them in a single box. (You could store them in a single box anyway, but then accessing items other than the first or last can get slow.)
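
As a rough illustration of the macro route, here is a sketch of a compute-on-first-use cache. The command \CachedFileName and the MyCache@ name prefix are hypothetical and not part of the code above, and \MyComplicatedCode is the same simple stand-in as in the MWE; whether this fits the real, more complicated code depends on its result being storable in a macro.

```latex
\documentclass{article}
\usepackage{xstring}

% Stand-in for the expensive computation (same as in the MWE)
\newcommand*{\MyComplicatedCode}[4][1]{%
    \StrBehind[#1]{#2}{#3}[#4]%
}

% \CachedFileName{<path>} (hypothetical helper):
% runs \MyComplicatedCode only the first time a given <path> is seen
\newcommand*{\CachedFileName}[1]{%
    \ifcsname MyCache@#1\endcsname
        % Already cached: nothing to compute
    \else
        \MyComplicatedCode[3]{#1}{/}{\ExtractedFileName}%
        % Store the result globally under a name built from the path
        \expandafter\global\expandafter\let
            \csname MyCache@#1\endcsname\ExtractedFileName
    \fi
    \csname MyCache@#1\endcsname
}

\begin{document}
First use (computed): \CachedFileName{../../DirNameB/FileB1.tex}

Second use (cached): \CachedFileName{../../DirNameB/FileB1.tex}
\end{document}
```

Because the cache entry is assigned with \global, this should also work when the first use happens inside a \foreach body. The \ifcsname test requires e-TeX, which current LaTeX formats provide.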

David Carlisle