
I'm looking for a way to determine whether a particular text or string is present in external .tex files. The idea is to accept a text input from the user and determine whether that text can be found in the .tex files. For example, you may want to check whether the title of an article is present in your .tex or .bib files. Consider the following MWE.

%MWE
\documentclass{article}
%this is a temporary definition, just to declare the command.
\newcommand*{\mySearchforStringinExternalFilesCommand}[4]{#1#2#3#4}
%first argument is the text/string to be searched.
%second argument is a list of external files that will be searched if the given string is present.
%third argument is the output if the string is found.
%fourth argument is the output if the string is not found.

\begin{document}

%I am trying to search for the text ``vibration'' if it is present inside the external files datafileone.tex, datafiletwo.tex, and datafilethree.tex.
%If the text can be found inside the external files, then ``The search phrase ... was found in ... '' will be printed in the pdf file together with the filename/s of the external file/s where the text was found. If not, then ``The search phrase was not found in any of the datafiles.'' will be printed.
%datafileone.tex contains the phrase ``amplitude of vibration''.
%datafiletwo.tex contains the phrase ``frequency of vibration''.
%datafilethree.tex contains the phrase ``instantaneous frequency and instantaneous amplitude''. (these are terms from my thesis :) )
%Therefore, if the macro \mySearchforStringinExternalFilesCommand is designed properly, it must output ``The search phrase `vibration' was found in datafileone.tex and datafiletwo.tex''

\mySearchforStringinExternalFilesCommand%
{vibration}%This is the text or string to be searched in the given external files.
{%These are the external files.
datafileone.tex%
datafiletwo.tex%
datafilethree.tex%
}%
{The search phrase ... was found in ...}%datafileone.tex and/or datafiletwo.tex and/or datafilethree.tex
{The search phrase was not found in any of the datafiles.}%This is printed if the text is not found.

\end{document}

Phelype Oleinik used expl3 syntax and proposed the command \replacelineonce{<file>}{<search string>}{<replacement>}{<true code>}{<false code>} for find and replace, in "How to replace a line in a file written by TeX' \write command". The code is as follows (taken from that answer):

\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand \replacelineonce { m m m m m }
  { \mountain_replace_once:nnnTF {#1} {#2} {#3} {#4} {#5} }
\NewDocumentCommand \replacelineall { m m m m m }
  { \mountain_replace_all:nnnTF {#1} {#2} {#3} {#4} {#5} }
\tl_new:N \l__mountain_tmpa_tl
\seq_new:N \l__mountain_file_seq
\bool_new:N \l__mountain_replaced_bool
\ior_new:N \l__mountain_replace_ior
\iow_new:N \l__mountain_replace_iow
\prg_new_protected_conditional:Npnn \mountain_replace_once:nnn #1 #2 #3 { T, F, TF }
  { \__mountain_replace_aux:Nnnn \c_false_bool {#1} {#2} {#3} }
\prg_new_protected_conditional:Npnn \mountain_replace_all:nnn #1 #2 #3 { T, F, TF }
  { \__mountain_replace_aux:Nnnn \c_true_bool {#1} {#2} {#3} }
\cs_new_protected:Npn \__mountain_replace_aux:Nnnn #1 #2 #3 #4
  {
    \ior_open:NnTF \l__mountain_replace_ior {#2}
      { \__mountain_replace_line:Nnnn #1 {#3} {#4} {#2} }
      {
        \msg_error:nnn { mountain } { file-not-found } {#2}
        \prg_return_false:
      }
  }
\cs_new_protected:Npn \__mountain_replace_line:Nnnn #1 #2 #3 #4
  {
    \seq_clear:N \l__mountain_file_seq
    \bool_set_false:N \l__mountain_replaced_bool
    \ior_str_map_inline:Nn \l__mountain_replace_ior
      {
        \str_if_eq:nnTF {##1} {#2}
          {
            \bool_set_true:N \l__mountain_replaced_bool
            \seq_put_right:Nn \l__mountain_file_seq {#3}
            \bool_if:NF #1
              { \ior_map_break:n { \__mountain_replace_skip: } }
          }
          { \seq_put_right:Nn \l__mountain_file_seq {##1} }
      }
    \__mountain_replace_end:n {#4}
  }
\cs_new_protected:Npn \__mountain_replace_skip:
  {
    \ior_str_map_inline:Nn \l__mountain_replace_ior
      { \seq_put_right:Nn \l__mountain_file_seq {##1} }
  }
\cs_new_protected:Npn \__mountain_replace_end:n #1
  {
    \ior_close:N \l__mountain_replace_ior
    \iow_open:Nn \l__mountain_replace_iow {#1}
    \seq_map_inline:Nn \l__mountain_file_seq
      { \iow_now:Nn \l__mountain_replace_iow {##1} }
    \iow_close:N \l__mountain_replace_iow
    \bool_if:NTF \l__mountain_replaced_bool
      { \prg_return_true: }
      { \prg_return_false: }
  }
\msg_new:nnn { mountain } { file-not-found }
  { File~`#1'~not~found. }
\ExplSyntaxOff

\begin{document}

\newwrite\tempfile
\immediate\openout\tempfile=lists.tex
\immediate\write\tempfile{line1}
\immediate\write\tempfile{}
\immediate\write\tempfile{line2}
\immediate\write\tempfile{}
\immediate\write\tempfile{line2}
\immediate\write\tempfile{}
\immediate\write\tempfile{line2}
\immediate\closeout\tempfile

\replacelineonce{lists.tex}{line2}{line replaced} {Replaced once:} {Nothing replaced:}

\input{lists} \bigskip

\replacelineall{lists.tex}{line2}{line replaced} {Replaced all:} {Nothing replaced:}

\input{lists} \bigskip

\replacelineonce{lists.tex}{line2}{line replaced} {Replaced once:} {Nothing replaced:}

\input{lists} \bigskip

\end{document}

My interest, though, is only to “find”, and not “find and replace”.

Another similar topic is "Find and replace in a document consisting of many 'included' files using \include".

Kindly seeking your help.

egreg
  • datafileone.tex% should be datafileone.tex (without %) so there will be a space between the filenames. same with datafiletwo.tex% which must be datafiletwo.tex and datafilethree.tex% which must be datafilethree.tex – beethovengg14 Jul 28 '21 at 16:49

2 Answers


EDITED to overcome limitations on input characters of certain catcodes. Note that in datafileone.tex, the word vibration is part of a macro definition that takes an argument. In datafiletwo.tex, the word vibration is part of a comment, which is also searched unless you comment out the line in \killcats marked COMMENT TO AVOID SEARCH OF COMMENTS. The file datafilefour.tex was added to provide a case where the search terms are not found.

WARNING: In the current incarnation of the readarray package (which will be remedied in a future update), end-of-lines are always discarded during a \readdef and replaced with the value of \readarraysepchar, which is not the natural LaTeX way of reading end-of-lines. This could affect searches where the search string spans multiple lines of input.

Arguments #3 and #4 of \mySearchforStringinExternalFilesCommand are expected to, themselves, take 2 and 1 arguments, respectively, regardless of whether they do anything with them. In the case of #3 the two arguments passed include the search string, and the filename where the match was found. In the case of #4, the argument passed is the search string.

EDITED to demonstrate an OR search, where multiple search strings can be simultaneously specified, with the listofitems OR comparator ||, as in vibration||frequency.

\begin{filecontents*}[overwrite]{datafileone.tex}
\today \def\mashit#1{\textit{amplitude #1 of vibration}}
\end{filecontents*}
\begin{filecontents*}[overwrite]{datafiletwo.tex}
frequency of something% REMEMBER TO CALL IT vibration
\end{filecontents*}
\begin{filecontents*}[overwrite]{datafilethree.tex}
instantaneous frequency and instantaneous amplitude
\end{filecontents*}
\begin{filecontents*}[overwrite]{datafilefour.tex}
none of the above
\end{filecontents*}

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{readarray,listofitems}
\readarraysepchar{ }% CURRENT readarray VERSION WILL INSERT THIS
%  AUTOMATICALLY AFTER EACH INPUT RECORD IS READ (EVEN IF RECORD ENDS
%  ON A MACRO OR `%')
\def\killcats{%
  \catcode`\#=12
  \catcode`\%=12 % COMMENT TO AVOID SEARCH OF COMMENTS
  \catcode`\\=12
  \catcode`\{=12
  \catcode`\}=12
}
\def\restorecats{%
  \catcode`\\=0
  \catcode`\}=2
  \catcode`\{=1
  \catcode`\%=14
  \catcode`\#=6
}%

\newcommand{\mySearchforStringinExternalFilesCommand}[4]{%
  \def\findstatus{F}%
  \setsepchar{,}%
  \readlist\filelist{#2}%
  \setsepchar{#1}%
  \foreachitem\z\in\filelist[]{%
    \killcats
    \expandafter\readdef\expandafter{\z}\tmpfile
    \restorecats
    \readlist\searchlist{\tmpfile}%
    \ifnum\searchlistlen>1\relax#3{#1}{\z}\def\findstatus{T}\fi
  }
  \if F\findstatus #4{#1}\fi
}
\newcommand\searchtrue[2]{The search phrase ``#1'' was found in #2.\par}
\newcommand\searchfalse[1]{The search phrase ``#1'' was not found in any of the datafiles.\par}
\begin{document}
\mySearchforStringinExternalFilesCommand%
  {vibration}
  {datafileone.tex, datafiletwo.tex, datafilethree.tex, datafilefour.tex}
  {\searchtrue}{\searchfalse}

\bigskip
\mySearchforStringinExternalFilesCommand%
  {frequency}
  {datafileone.tex, datafiletwo.tex, datafilethree.tex, datafilefour.tex}
  {\searchtrue}{\searchfalse}

\bigskip
\mySearchforStringinExternalFilesCommand%
  {vibration||frequency}
  {datafileone.tex, datafiletwo.tex, datafilethree.tex, datafilefour.tex}
  {\searchtrue}{\searchfalse}
\end{document}
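
Since arguments #3 and #4 only need to accept two and one arguments respectively, the handlers can be swapped for ones that do something other than typeset a sentence. The following is a minimal sketch, assuming the definition of \mySearchforStringinExternalFilesCommand above is already in the preamble; \logmatch and \ignoremiss are hypothetical names and not part of the answer.

```latex
% Hypothetical handlers with the same argument counts as \searchtrue and \searchfalse.
% \logmatch writes each match to the log/terminal instead of the PDF.
\newcommand\logmatch[2]{\typeout{Match for `#1' in #2}}
% \ignoremiss accepts the search string and does nothing with it.
\newcommand\ignoremiss[1]{}

% Usage, inside the document body:
\mySearchforStringinExternalFilesCommand
  {vibration}
  {datafileone.tex, datafiletwo.tex}
  {\logmatch}{\ignoremiss}
```

This keeps the search macro untouched and moves the decision about what a match means into the two handler macros.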


  • thank you sir @stevenbsegletes. will study your solution – beethovengg14 Jul 28 '21 at 18:25
  • your solution has taken another level higher -- searching within comments. this is nice sir – beethovengg14 Jul 28 '21 at 18:36
  • can't imagine my latex codes without comments – beethovengg14 Jul 28 '21 at 18:39
  • @beethovengg14 And here is a bonus! You can search for multiple things at once, separating the search field with || (the listofitems OR comparator), as in argument #1 being set as vibration||amplitude. – Steven B. Segletes Jul 28 '21 at 19:05
  • @beethovengg14 Please see my EDIT for a demonstration of this. – Steven B. Segletes Jul 28 '21 at 19:11
  • || is for OR. is there something for AND, NOR, NAND, like those in logic gates? :) this is really nice, searching for multiple keywords – beethovengg14 Jul 29 '21 at 07:39
  • @beethovengg14 Sorry, only OR. The underlying purpose of the listofitems package is to break a delimited input string into a list array. Typically, a delimiter might be a comma, but maybe you want it to be a comma OR a semicolon. AND makes no sense in this context, even though the way I use the package here is to use the word "vibration" as a delimiter. If the string contains this "delimiter" n times, the list will have n + 1 elements. If it does not contain the delimiter, the list only has one element. I check the list length to determine the presence of the delimiter. – Steven B. Segletes Jul 29 '21 at 08:53
  • there is an extra "}" (closing brace) after \end{document}? maybe typo only – beethovengg14 Aug 15 '21 at 06:31
  • @beethovengg14 Thank you. Fixed! – Steven B. Segletes Aug 15 '21 at 14:46
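
The delimiter trick described in the comments above can be tried in isolation. This is a minimal sketch, assuming only the listofitems package; the list name \parts and the sample sentence are made up for illustration.

```latex
\documentclass{article}
\usepackage{listofitems}
\begin{document}
% Use the search word itself as the list separator.
\setsepchar{vibration}
% Split a sample string on that separator; \readlist also defines \partslen.
\readlist\parts{amplitude of vibration and frequency of something}
% More than one item means the separator, i.e. the search word, occurred.
\ifnum\partslen>1 found\else not found\fi
\end{document}
```

The answer's macro applies the same length test, only with the file contents (read by \readdef) in place of the literal string.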

You can store the file in a token list variable and do the check.

The third argument to \lookfortextinfiles is a template where #1 stands for the file name with a match. The fourth argument, optional, is what to do in case of no match. I assume that you want to do something like \input{#1} or \bibliography{#1} if the file is a .bib file rather than just printing the list of file names with matches.

\begin{filecontents*}{\jobname-one.tex}
amplitude of vibration
\end{filecontents*}

\begin{filecontents*}{\jobname-two.tex}
frequency of vibration
\end{filecontents*}

\begin{filecontents*}{\jobname-three.tex}
instantaneous frequency and instantaneous amplitude
\end{filecontents*}

\documentclass{article}

\ExplSyntaxOn

\NewDocumentCommand{\lookfortextinfiles}{m m +m +O{}}
 {% #1 = text to look for
  % #2 = list of files to look into
  % #3 = template
  % #4 = what to do in case of no match
  \beethovengg_lookfor:nnnn { #1 } { #2 } { #3 } { #4 }
 }

\tl_new:N \l__beethovengg_lookfor_file_tl
\seq_new:N \l__beethovengg_lookfor_match_seq

\cs_new_protected:Nn \beethovengg_lookfor:nnnn
 {
  \seq_clear:N \l__beethovengg_lookfor_match_seq
  \clist_map_inline:nn { #2 }
   {
    \file_get:nnN { ##1 } { } \l__beethovengg_lookfor_file_tl
    \tl_if_in:VnT \l__beethovengg_lookfor_file_tl { #1 }
     {
      \seq_put_right:Nn \l__beethovengg_lookfor_match_seq { ##1 }
     }
   }
  \cs_set:Nn \__beethoven_lookfor_use:n { #3 }
  \seq_map_function:NN \l__beethovengg_lookfor_match_seq \__beethoven_lookfor_use:n
  \seq_if_empty:NT \l__beethovengg_lookfor_match_seq { #4 }
 }

\ExplSyntaxOff

\begin{document}

\lookfortextinfiles{vibration}{ \jobname-one, \jobname-two, \jobname-three, }{Found in #1\par}[Found in no file]

\bigskip

\lookfortextinfiles{frequency}{ \jobname-one, \jobname-two, \jobname-three, }{Found in #1\par}[Found in no file]

\bigskip

\lookfortextinfiles{foo}{ \jobname-one, \jobname-two, \jobname-three, }{Found in #1\par}[Found in no file]

\end{document}
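
The question asks for a single sentence such as "was found in datafileone.tex and datafiletwo.tex" rather than one line per file. One possible variation on the same store-and-check idea is sketched below, joining the matches with \seq_use:Nnnn. The command \listfileswithtext, the \l__demo_... variables, and the demo-one.tex and demo-two.tex files are hypothetical names for this sketch and are not part of the answer. It assumes the files tokenize cleanly under normal catcodes (balanced braces, no stray #).

```latex
\begin{filecontents*}[overwrite]{demo-one.tex}
amplitude of vibration
\end{filecontents*}
\begin{filecontents*}[overwrite]{demo-two.tex}
frequency of vibration
\end{filecontents*}

\documentclass{article}

\ExplSyntaxOn
% Hypothetical names, chosen for this sketch only.
\tl_new:N \l__demo_file_tl
\seq_new:N \l__demo_match_seq

\NewDocumentCommand{\listfileswithtext}{m m}
 {
  % #1 = text to look for, #2 = comma-separated list of files
  \seq_clear:N \l__demo_match_seq
  \clist_map_inline:nn { #2 }
   {
    % Read the file into a token list; files that cannot be found are skipped.
    \file_get:nnNT { ##1 } { } \l__demo_file_tl
     {
      \tl_if_in:NnT \l__demo_file_tl { #1 }
       { \seq_put_right:Nn \l__demo_match_seq { ##1 } }
     }
   }
  \seq_if_empty:NTF \l__demo_match_seq
   { The~search~phrase~``#1''~was~not~found~in~any~of~the~files. }
   {
    The~search~phrase~``#1''~was~found~in~
    % Separators: between two items, between more than two, before the last.
    \seq_use:Nnnn \l__demo_match_seq { ~and~ } { ,~ } { ,~and~ } .
   }
 }
\ExplSyntaxOff

\begin{document}
\listfileswithtext{vibration}{demo-one.tex, demo-two.tex}

\listfileswithtext{foo}{demo-one.tex, demo-two.tex}
\end{document}
```

Collecting the matches in a sequence first, as the answer does, is what makes the joined sentence possible: the separators can only be chosen once the number of matches is known.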


egreg
  • thank you sir @egreg. will study your solution. – beethovengg14 Jul 28 '21 at 18:25
  • hoping that LaTeX 3 (expl3) will soon be released so i can study further the strange-looking commands :) – beethovengg14 Jul 28 '21 at 18:31
  • @beethovengg14 expl3 is loaded as part of the LaTeX2e kernel – Joseph Wright Jul 28 '21 at 19:14
  • oh i see. thank you sir @JosephWright. i saw the expl3 documentation. it will take some time for me to digest it :) there are now underscores and colons in variable names – beethovengg14 Jul 29 '21 at 07:37
  • ever since i "shifted" to LaTeX since 2003, everything has been so logical, yet difficult to program at first. – beethovengg14 Jul 29 '21 at 07:45
  • hi sir @egreg, "I assume that you want to do something like \input{#1} or \bibliography{#1} if the file is a .bib file rather than just printing the list of file names with matches." this is not what I meant, but it is good idea though. – beethovengg14 Aug 15 '21 at 06:51
  • hi sir @egreg, i would like to thank you very much for your wonderful solution. it worked. it is simple/short, yet complicated (i still have to study/digest the meaning of all those expl3 syntax :( ) i guess im always after not just the solution but also understanding. – beethovengg14 Aug 15 '21 at 07:38
  • hi sir @egreg, kindly seeking your help again. https://tex.stackexchange.com/questions/611605/appending-text-to-each-line-of-an-external-file-using-expl3-syntax – beethovengg14 Aug 22 '21 at 08:55