
I need to write a command that searches a file for a string that matches a certain regexp, and inserts that string. I know how to parse a string of text using a regex with

\usepackage{l3regex}
\ExplSyntaxOn
\cs_new_protected:Npn \mycom #1
  {
    \tl_set:Nn \l_tmpa_tl {#1}
    \regex_replace_all:nnN { THE-REGEXP } { THE-REPLACEMENT } \l_tmpa_tl
    \tl_use:N \l_tmpa_tl
  }
\ExplSyntaxOff

but I don't know how I can use it to parse the contents of a file. I tried replacing {#1} with {\input{#1}}, but it didn't work (the \input was simply considered part of the string).

Sean Allred
Malabarba
  • Take a look at the catchfile package to allow you to read an entire file to a macro, or iterate over the file line-by-line with \ior_map_inline:Nn (experimental function). – Joseph Wright Apr 25 '12 at 07:58

1 Answer


There is no facility, at the moment, for storing into a token list the contents of a file, but you can still use the catchfile package:

\documentclass{article}
\usepackage{catchfile}
\usepackage{expl3,l3regex}
\ExplSyntaxOn
\cs_new_protected:Npn \mycom #1
  {
    % read the whole file #1 into \l_tmpa_tl (the setup argument is empty)
    \CatchFileDef \l_tmpa_tl {#1} {}
    \regex_replace_all:nnN { xrep } { foo } \l_tmpa_tl
    \tl_use:N \l_tmpa_tl
  }
\ExplSyntaxOff

\begin{document}
\mycom{xrep}
\end{document}

Update

As of the revision dated 2014-06-25, this functionality is available in expl3 itself:

\documentclass{article}
\usepackage{expl3,l3regex,xparse}
\ExplSyntaxOn

\NewDocumentCommand{\mycom}{m}
 {
  \malabarba_mycom:n { #1 }
 }

\tl_new:N \l_malabarba_mycom_content_tl
\cs_new_protected:Npn \malabarba_mycom:n #1
  {
   % arguments: <tl variable> { <setup> } { <filename> }
   \tl_set_from_file:Nnn \l_malabarba_mycom_content_tl {} {#1}
   \regex_replace_all:nnN { xrep } { foo } \l_malabarba_mycom_content_tl
   \tl_use:N \l_malabarba_mycom_content_tl
  }
\ExplSyntaxOff

\begin{document}
\mycom{xrep}
\end{document}

Note that the argument order differs between \CatchFileDef and \tl_set_from_file:Nnn. With \CatchFileDef, the trailing argument holds the setup instructions to be performed (locally) before loading the file; with \tl_set_from_file:Nnn, these setup tokens go in the second argument, before the file name:

\CatchFileDef<command name>{<filename>}{<setup>}

\tl_set_from_file:Nnn <tl variable> { <setup> } { <filename> }

The analog of \CatchFileEdef is called

\tl_set_from_file_x:Nnn
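
For readers on a current TeX distribution, here is a minimal sketch of the same idea with the newer kernel interface. To my knowledge \tl_set_from_file:Nnn has since been deprecated in favour of \file_get:nnN (and its branching form \file_get:nnNTF), which takes the file name first, then the setup, then the variable; the regex functions are also part of expl3 itself nowadays. The file name sample.txt, the message name file-missing, and the variable \l_malabarba_file_tl are made up for this illustration.

```latex
% Hypothetical test file, written out so the example is self-contained
\begin{filecontents*}{sample.txt}
Replace xrep here.
\end{filecontents*}
\documentclass{article}
\usepackage{expl3,xparse} % assumption: recent kernel; both are preloaded there
\ExplSyntaxOn
\tl_new:N \l_malabarba_file_tl
\msg_new:nnn { malabarba } { file-missing } { File~'#1'~could~not~be~read. }
\NewDocumentCommand{\mycom}{m}
  {
    % arguments: {<filename>} {<setup>} <tl variable>
    \file_get:nnNTF {#1} { } \l_malabarba_file_tl
      {
        % file found: apply the replacement, then typeset the result
        \regex_replace_all:nnN { xrep } { foo } \l_malabarba_file_tl
        \tl_use:N \l_malabarba_file_tl
      }
      { \msg_error:nnn { malabarba } { file-missing } {#1} }
  }
\ExplSyntaxOff
\begin{document}
\mycom{sample.txt}
\end{document}
```

The branching form is used here because the non-branching \file_get:nnN leaves a special marker in the variable when the file is missing, and that marker should not be typeset.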
egreg
  • Thanks, that's perfect. Any chance I might get LaTeX to interpret the output? =) (instead of just displaying it verbatim) – Malabarba Apr 25 '12 at 10:07
  • @BruceConnor I get what's expected. Probably what you get wrong depends on the "replacement" regex you're using. – egreg Apr 25 '12 at 10:14
  • What do you mean by that? The regexp does output what I expect it to ({\bf Test} for example). The problem is that this output doesn't get interpreted by LaTeX (so my pdf would contain a literal {\bf Test} instead of a bold Test). Is that not the expected behaviour? – Malabarba Apr 25 '12 at 10:23
  • @BruceConnor If the search regex is something like ([^\*]*?)\*\* and the replacement regex is \c{textbf}\cB\{\1\cE\} I get that **Test** is replaced by \textbf{Test} and interpreted correctly. – egreg Apr 25 '12 at 10:29
  • If I copy/paste your answer's preamble to a new document, replace xrep with .* and foo with \c{textbf}\cB\{ Hi \cE\}, create a file in the same folder containing **TEST**, and call \mycom{filename} inside \begin{document}...\end{document}, I get the bizarre: c–textbf ̋cB–HicE ̋c–textbf ̋cB–HicE ̋ – Malabarba Apr 25 '12 at 11:23
  • I guess that probably means I've hit a bug. =/ – Malabarba Apr 25 '12 at 11:24
  • This bizarre result actually only happened in the new document. If I insert this code into my existing document I just get c{textbf}cB{HicE}c{textbf}cB{HicE}, which basically means it's not getting interpreted right (though I have no idea why they behave differently). – Malabarba Apr 25 '12 at 11:28
  • @BruceConnor You probably have an outdated version of l3regex; mine is l3regex.dtx 3488 2012-03-03 19:49:03Z – egreg Apr 25 '12 at 11:29
  • @BruceConnor: It would be very helpful if you can update l3regex and check again. If the problem persists, please let me know (I wrote l3regex). – Bruno Le Floch May 06 '12 at 09:28
  • @BrunoLeFloch I'll try that next week (how do I check my current version?). For now I have a thesis to defend, so I'm in a bit of a hurry. =) – Malabarba May 06 '12 at 14:03
  • @BruceConnor Good luck with your defense. Can you compile the document in egreg's answer, with xrep replaced by .* and foo replaced by \c{textbf}\cB\{ Hi \cE\}? You'd probably need to add \begin{document}...\end{document} too. To get the file versions you can add \listfiles at the very start of your document: the info appears at the end of the log file. – Bruno Le Floch May 20 '12 at 21:46
  • @BrunoLeFloch Ok. My l3regex was indeed outdated (from about 8 months ago). I've updated it and the problem was solved. – Malabarba May 22 '12 at 17:22
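
The replacement discussed in the comments can be tried with a self-contained document along the following lines. This is only a sketch: the file name boldtest.txt, the command \mybold, and the variable \l_malabarba_bold_tl are invented for the example, and the search pattern with a leading \*\* is my assumption about what was meant, since the pattern as quoted in the comment would leave the opening asterisks in place. It also assumes an expl3 recent enough to provide the regex functions without loading l3regex separately.

```latex
% Hypothetical test file containing Markdown-style bold markup
\begin{filecontents*}{boldtest.txt}
Some **Test** text.
\end{filecontents*}
\documentclass{article}
\usepackage{catchfile}
\usepackage{expl3} % assumption: recent expl3, regex functions included
\ExplSyntaxOn
\tl_new:N \l_malabarba_bold_tl
\cs_new_protected:Npn \mybold #1
  {
    % read the file into the token list (empty setup argument)
    \CatchFileDef \l_malabarba_bold_tl {#1} { }
    % search: **...** ; replacement: \textbf{...}
    % \c{textbf} builds the control sequence, \cB\{ and \cE\} build
    % real begin-group and end-group braces, \1 is the captured text
    \regex_replace_all:nnN
      { \*\*([^\*]*?)\*\* }
      { \c{textbf}\cB\{\1\cE\} }
      \l_malabarba_bold_tl
    \tl_use:N \l_malabarba_bold_tl
  }
\ExplSyntaxOff
\begin{document}
\mybold{boldtest.txt}
\end{document}
```

If the replacement were written as plain \textbf{\1} instead, the regex code would treat the backslash and braces as ordinary characters to insert, which is consistent with the literal output reported above; the \c, \cB and \cE escapes are what make the result behave as a command with a braced argument.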