
To learn more about plain TeX, I am trying to define a special verbatim command \verb{code}, which takes its code inside braces.

When there are no braces in code, it's easy. But when there are balanced braces inside code, it seems hard to define this command.

Furthermore, there may be some \{ or \} inside code. How could I define this?

For example, I want \verb{a{b\}c\{d}e} to produce a{b\}c\{d}e.

Z.H.
  • Due to the way TeX reads tokens, it's difficult to do this in a command. You might want to take a look at \tt\makeatletter\meaning\@sverb (just typeset that). It's not that practically helpful, but it shows the complexity. You have to change catcodes, so taking in the argument (as you would naturally do) doesn't work since everything is already read by that time. You have to take in a delimiting argument, then change catcodes accordingly. Then you can do what you need. – Sean Allred Jun 03 '14 at 05:39
  • @SeanAllred You have to change the catcodes before grabbing the delimited argument ;-) – Joseph Wright Jun 03 '14 at 06:08
  • @Z.H. While there are approaches that would allow the interface you suggest, they are pretty complex (we do something related in the LaTeX xparse package for grabbing optional arguments, and tracking nested brackets is tricky). As already suggested, a much easier approach is to use something like \verb"a{b\}c\{d}e", where the very first character after \verb is used as both the start and end marker. Would a solution working in this way be acceptable? – Joseph Wright Jun 03 '14 at 06:14
  • @JosephWright I know how to define \verb just like it is in LaTeX. I'm just curious about more. – Z.H. Jun 03 '14 at 06:30
  • I had proposed an answer with \detokenize, but deleted it after Joseph pointed out your focus was plain TeX. If you want the answer resurrected, just let me know. – Steven B. Segletes Jun 03 '14 at 10:25
  • You'll have problems when there are *unbalanced* braces. It's impossible to grab an argument with unbalanced braces, of course: you'd have to tell in some way which one is the final. All in-line verbatim macros that allow braces in the text to be typeset verbatim use the convention of delimiting the text with a character not used in the text itself. You can find good definitions of in-line verbatim macros in the TeXbook and in TeX by Topic. – egreg Jun 03 '14 at 10:30
  • @egreg Yes, but since braces inside code are balanced almost all of the time, let's assume they are always balanced and not consider the unbalanced case. – Z.H. Jun 03 '14 at 11:02
  • it's easier to come up with a \begin ... \end or a "toggle" approach than a command-with-argument definition. you might find the definitions of \verbatim and |...| in the file tugboat.sty (the plain tugboat implementation) interesting. these extended the texbook definitions to pay attention to spaces at the beginning of a line, and an "escape" to allow metacode within a verbatim block, among other features. this file (and related material) is in tex live in the .../plain/tugboat-plain/ area. – barbara beeton Jun 03 '14 at 15:09
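
The delimiter convention described in these comments can be sketched in plain TeX as follows. This is only a minimal illustration, not the TeXbook or LaTeX definition; \dverb and \dverbAux are hypothetical names. The catcodes are changed first, and only then is the delimited argument grabbed.

% Sketch of a delimiter-based inline verbatim (hypothetical names).
\def\dverb{%
  \begingroup
  % make every special character, braces and backslash included, "other"
  \def\do##1{\catcode`##1=12 }\dospecials
  \tt
  \dverbAux
}
% #1 is the delimiter character: define a one-off macro that grabs
% everything up to the next occurrence of #1, prints it, closes the group.
\def\dverbAux#1{\def\next##1#1{##1\endgroup}\next}

\dverb|a{b\}c\{d}e|

\bye

Since the braces have catcode 12 while the argument is read, they need not be balanced, but the delimiter itself cannot appear in the text. Because \dospecials also makes the space an ordinary character, spaces in the text should come out as the visible-space glyph of the typewriter font.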

2 Answers


A plain TeX solution without e-TeX.

  1. The argument is read with verbatim category codes (12). Plain TeX provides a list of special characters in \dospecials.
  2. The argument is surrounded by curly braces. Therefore the curly braces keep their usual category codes, and the argument may contain balanced curly braces. However, it is not possible to escape a curly brace with a backslash, because the backslash has category code 12.

  3. The argument is read and stored in a macro \verbText.

  4. \meaning prints the macro definition. The characters have the category code 12 (other) and the space character has its normal category code 10 (space).

  5. Font \tentt has a visible space glyph, but it is printed only if the space character has catcode 12 (other) instead of 10 (space). The conversion is done by the macro \ConvertSpacesToCatcodeOther.
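
Steps 3 and 4 can be tried in isolation. The following is a minimal sketch, assuming the normal plain TeX catcodes; \demoText is a hypothetical name used only for illustration, and \StripPrefix is the same helper as in the example file.

% Sketch: \meaning turns a macro body into characters of catcode 12
% (spaces keep catcode 10), so the body can be typeset literally.
\def\StripPrefix#1>{}            % drops the "macro:->" prefix
\def\demoText{a{b}c $x^2$ &}     % body stored with normal catcodes
\edef\demoText{\expandafter\StripPrefix\meaning\demoText}
{\tt \demoText}                  % typesets: a{b}c $x^2$ &

\bye

The braces, dollar signs, and ampersand are harmless after the \edef, because \meaning has produced them as "other" characters. A # is left out of the sketch on purpose, since it cannot be written in a macro body under normal catcodes without doubling.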

Example file:

\def\verb{%
  \begingroup
  % verbatim catcodes
  \def\do##1{\catcode`##1=12 }%
  \dospecials
  % except curly braces
  \catcode`\{=1 %
  \catcode`\}=2 %
  \verbAux
}
\def\verbAux#1{%
  \def\verbText{#1}%
  \edef\verbText{%
    \expandafter\StripPrefix\meaning\verbText
  }%
  \tentt
  \FirstOfOne{\expandafter\ConvertSpacesToCatcodeOther\verbText} \NIL
  \endgroup
}
\def\StripPrefix#1>{}
\long\def\FirstOfOne#1{#1}
\def\ConvertSpacesToCatcodeOther#1 #2\NIL{%
  #1%
  \def\temp{#2}%
  \ifx\temp\empty
    \expandafter\Gobble
  \else
    \SpaceOther
    \expandafter\FirstOfOne
  \fi
  {\ConvertSpacesToCatcodeOther#2\NIL}%
}
\long\def\Gobble#1{}
\begingroup
  \lccode`\9=32 % space
\lowercase{\endgroup
  \def\SpaceOther{9}%
}

\verb{a{b\}c\{d}e  with spaces}

\bye

Escaping

Escaping the curly braces with a backslash can also be implemented. If the backslash keeps its escape category code (0), then command names are tokenized as usual. \meaning adds a space after command names that consist of characters with letter catcodes (11): \abc would become \abc␣. This can be avoided by changing the category codes of all letters:

\def\verb{%
  \begingroup
  % verbatim catcodes
  \def\do##1{\catcode`##1=12 }%
  \dospecials
  % except curly braces and backslash
  \catcode`\\=0 %
  \catcode`\{=1 %
  \catcode`\}=2 %
  \count255=`\A %
  \loop
    \catcode\count255=12 %
  \ifnum\count255<`\Z %
    \advance\count255 by 1 %  
  \repeat
  \count255=`\a
  \loop
    \catcode\count255=12 %
  \ifnum\count255<`\z %
    \advance\count255 by 1 %  
  \repeat
  \verbAux
}
\def\verbAux#1{%
  \def\verbText{#1}%
  \edef\verbText{%
    \expandafter\StripPrefix\meaning\verbText
  }%
  \tentt
  \FirstOfOne{\expandafter\ConvertSpacesToCatcodeOther\verbText} \NIL
  \endgroup
}
\def\StripPrefix#1>{}
\long\def\FirstOfOne#1{#1}
\def\ConvertSpacesToCatcodeOther#1 #2\NIL{%
  #1%
  \def\temp{#2}%
  \ifx\temp\empty
    \expandafter\Gobble
  \else
    \SpaceOther
    \expandafter\FirstOfOne
  \fi
  {\ConvertSpacesToCatcodeOther#2\NIL}%
}
\long\def\Gobble#1{}
\begingroup
  \lccode`\9=32 % space
\lowercase{\endgroup
  \def\SpaceOther{9}%
}

\verb{a{b\}c\{d}e  with spaces \abc\} \{}

\bye

Heiko Oberdiek

Here is a simpler solution:

\newcount\tmpnum
\def\catcodeletters{\tmpnum=64
  \loop \advance\tmpnum by1 
     \ifnum\tmpnum<128
        \ifnum\catcode\tmpnum=11 \catcode\tmpnum=12 \fi
     \repeat 
}
\def\verb{\bgroup\catcode`\%=12\catcode`\^=13\catcode`\ =12\catcode`\#=12
   \catcodeletters\verbA}
\def\verbA#1{\def\tmp{#1}\tt\expandafter\mm\meaning\tmp\endmm\tmp\egroup}
\def\mm#1->#2\endmm{\def\tmp{#2}}

\verb{a{b\}c\{d}e a  #sp_&ace}

\end

I've added \catcodeletters to my macro so that no spaces appear after control words such as \foo. Thanks to Heiko.
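
The effect can be seen in a small sketch. \demoPlain, \demoOther, and \demoAux are hypothetical names for illustration; only a, b, and c are changed to catcode 12 here, whereas \catcodeletters handles every letter.

% \meaning appends a space after a control word, but not after a
% control symbol whose character is not a letter at printing time.
\def\StripPrefix#1>{}
\def\demoAux#1{\def\tmp{#1}\tt\expandafter\StripPrefix\meaning\tmp\endgroup}
% Normal catcodes: \abc is read as one control word.
\def\demoPlain{\begingroup\demoAux}
% a, b, c as "other": \abc is read as control symbol \a followed by b, c.
\def\demoOther{\begingroup
  \catcode`\a=12 \catcode`\b=12 \catcode`\c=12 \demoAux}

% expected to print: \abc \{x   (with a space after \abc)
\demoPlain{\abc\{x}

% expected to print: \abc\{x    (no space)
\demoOther{\abc\{x}

\bye

Note that \meaning has to run while the changed catcodes are still in force, which is why \endgroup comes last in \demoAux.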

Edit: My 12-line macro was not chosen as the accepted answer, but the 50-line macro was. Perhaps printing visible space marks in place of ordinary spaces was the core of the interest, but this was not specified in the question. It is a matter of printing, not of scanning the verbatim text, so it can be implemented after \tmp has been scanned. You can change the definition of \verbA and add a \printverb macro, which replaces the spaces in a different way than the previous answer does.

\def\verbA#1{\def\tmp{#1}\expandafter\mm\meaning\tmp\endmm
   \tt\expandafter\printverb\tmp\end\egroup}

\def\printverb{\futurelet\next\printverbA}
\def\printverbA{%
  \ifx\next\end \def\next{\let\next}%
  \else \ifx\next\spacetoken \char32
            \def\next{\afterassignment\printverb\let\next= }%
        \else \next \def\next{\afterassignment\printverb\let\next= }% "= " keeps a literal = from being read as the optional equals of \let
  \fi   \fi
  \next
} 
\edef\tmp{\let\noexpand\spacetoken= \space}\tmp

Now the \tt\char32 spacemarks are printed instead of spaces.

wipet
  • I'd add also \catcode`\\=12 or control sequences will get a space after them. – egreg Jun 03 '14 at 11:07
  • @egreg OK, but \catcode`\\=12 brings a new problem: the \{ and \} sequences can no longer be written arbitrarily, without matching braces... – wipet Jun 03 '14 at 11:11