
How can I format a verbatim paragraph? That is, how can I break, fill, and join input lines to produce globally balanced output in which the length of each line comes as close as possible to the target \textwidth?

Using the listings package I can break long lines, but I cannot join short ones. That is, the following

\documentclass{article}
\usepackage{listings} 
\lstnewenvironment{exampleA}{\lstset{% 
  language=,
  basicstyle=\ttfamily, 
  breaklines=true,
  prebreak=+,
  postbreak=->,
  columns=fullflexible,
  breakindent=5pt} }{} 
\begin{document}
\begin{exampleA}
Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line 
Short Line 
Short Line 
Short Line 
\end{exampleA}
\end{document}

produces: [image: output of the exampleA listing]

What I would like to achieve is similar to what LaTeX does by default with paragraphs. For example,

\documentclass{article}
\newenvironment{exampleB}{\ttfamily}{}
\begin{document}
\begin{exampleB}
Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line 
Short Line 
Short Line 
Short Line 
\end{exampleB}
\end{document}

will produce: [image: output of the exampleB paragraph]

The problem with the latter approach is that I cannot use special symbols like # or % in the verbatim text. So I need a verbatim environment such as lstlisting or fancyvrb's Verbatim. (As mentioned, automatic line breaking is possible in the lstlisting environment; the Verbatim environment, however, seems to lack even this feature.)

EDIT: (This is a comment on Frank's answer below.)

Now there seems to be a problem with lines becoming too short. The following input

\noindent\hrulefill
 \begin{myverbatim}
  xxxxxxx xxxx xxxxx: 1234567890123456789012345,
  123, 1234567890,
\end{myverbatim}
\noindent\hrulefill\

now produces

[image: output in which the first line breaks before "123"]

(I would now like to have the word "123" at the end of the first line, as there is certainly plenty of space for it there :))

Update

The above problem has now been solved; see the comments on Frank's answer below.

2 Answers


David beat me by a couple of minutes, but this version indents as requested and does not produce overfull lines (within reason):

\documentclass{article}

\makeatletter

% This defines the myverbatim environment. To change the name, replace "myverbatim"
% in all places below (strictly speaking it is only necessary in some, but ... :-)

\newdimen\outerparindent
\def\myverbatim{%
% fix for \@noligs, as the definition in LaTeX swallows any following space
  \def\do@noligs##1{%
    \catcode`##1\active
    \begingroup
    \lccode`\~`##1\relax
    \lowercase{\endgroup\def~{\leavevmode\kern\z@\char`##1 }}}%
% save the \parindent used outside
  \outerparindent\parindent
% I'm lazy and reuse the existing setup of verbatim as much as possible, so \obeylines is my way to hook in
  \def\obeylines{\rightskip=0pt plus 1fil
                 \parindent=\outerparindent % if you prefer a defined \parindent value, set it here
                 \let\par\@@par
                 \leavevmode\indent}%
% different definitions to handle spaces, select one:
  \def\@xobeysp{\penalty\z@\char 32 \penalty\z@}%       % this produces a visible space
% \let\@xobeysp\space                                   % this version will drop spaces at line breaks
% \def\@xobeysp{\penalty\z@\mbox{}\space\penalty\z@}%   % this will keep spaces after line breaks
%
  \@verbatim
  \@myverbatimescape\@myverbatimnewline
  \frenchspacing\@vobeyspaces\@xmyverbatim}

\let\endmyverbatim\endverbatim

% setting up the behavior of end-of-line: on its own it behaves like a (special) space,
% two in a row end a paragraph
\begingroup
\catcode`\^^M=\active%
\gdef\@myverbatimnewline{\catcode`\^^M=\active \let^^M\@xmyverbatimnewline}%
\gdef\@xmyverbatimnewline{\@ifnextchar ^^M{\@myverbatimpar}{\@xobeysp}}%
\gdef\@ymyverbatimnewline{\@ifnextchar ^^M{\@myverbatimpar}{}}%
\gdef\@myverbatimpar ^^M{\par%
  \vskip\baselineskip% % this line will generate an extra baselineskip per empty line
                     % comment out if not wanted
  \@ymyverbatimnewline}%
% and this part is to get rid of the first ^^M after \begin{myverbatim} but not any other
\gdef\@zmyverbatimnewline{\@ifnextchar ^^M{\@zmyverbatimpar}{}}%
\gdef\@zmyverbatimpar^^M{\@ifnextchar ^^M{\@myverbatimpar}{}}%
\endgroup

\begingroup \catcode `|=0 \catcode `[= 1
\catcode`]=2 \catcode `\{=12 \catcode `\}=12
% to support \\ as a line break we need to make \ active and make a lot of extra horrible definitions:
\catcode`\\=13
|gdef|@myverbatimescape[|catcode`|\|active|let\|@myverbatimbslash]
% using @ifnextchar is ok because "space" is active too so isn't gobbled in case of "\ "
|gdef|@myverbatimbslash[|@ifnextchar\[|@xmyverbatimbslash][|string\]]
|gdef|@xmyverbatimbslash\[|\]
% a newline char before the end of the environment also needs some special handling, therefore the gymnastics
|catcode`|^^M=|active%
|long|gdef|@xmyverbatim#1\end{myverbatim}[|@zmyverbatimnewline#1^^M|vskip-|lastskip|vskip|z@skip|end[myverbatim]]%
|endgroup

\makeatother

\begin{document}

Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line

\begin{myverbatim}
Long line Long line with \ or \ \ is okay but \\now breaks the line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line

Long line Long line Long line Long line Long line Short Line

Short Line # ^& % Short Line
\end{myverbatim}

\end{document}

As requested in a comment, the code above now supports \\ as a line break, while a single \ generates a backslash. The code gets a little messy in that case, but that is the price of changing catcodes around :-)

Output would then be

[image: output of the example document above]

Update

Initially I had set \rightskip to 0pt plus 4em basically because I forgot that no hyphenation is going on in this type of environment. So with long "words" that could result in overfull lines. So I now changed it above to 0pt plus 1fil so as long as everything is shorter than a full line one should not get any overfull lines.

Of course line breaks only happen at spaces, so in the example given by the OP it will break after the first set of digits as the second already makes the line overfull.
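The effect of the stretchable \rightskip can be looked at in isolation. The following is only a minimal sketch, not part of the environment above: it sets an ordinary typewriter paragraph ragged right, reusing the sample text from the question (the repetition of that text is made up so that the paragraph spans several lines).

```latex
\documentclass{article}
\begin{document}
\noindent\hrulefill

% Typewriter paragraph with an infinitely stretchable \rightskip: the glue
% at the right margin absorbs the leftover space on every line, so lines
% are never stretched and only a single word wider than \textwidth can
% still produce an overfull box.
{\ttfamily
 \rightskip=0pt plus 1fil\relax
 xxxxxxx xxxx xxxxx: 1234567890123456789012345,
 123, 1234567890,
 xxxxxxx xxxx xxxxx: 1234567890123456789012345,
 123, 1234567890,
 \par}% \par inside the group, so that \rightskip is still in force

\noindent\hrulefill
\end{document}
```

The \par has to come before the closing brace because TeX uses the value of \rightskip that is current when the paragraph ends.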

Update 2

As it turned out, the solution above would swallow one space after a , in the input (in fact after any of a certain set of characters). That strange behavior is due to what I would claim is a documentation error in the TeXbook (page 44), namely the claim that

\chardef\%=`\%

is simply a more efficient way to achieve the same as

\def\%{\char`\%}

In fact it isn't. The latter definition will swallow an optional following space; the former doesn't! That causes the issue, because the comma in verbatim is active to prevent ligatures, and its definition is generated by the following code

\def\do@noligs#1{%
  \catcode`#1\active
  \begingroup
     \lccode`\~`#1\relax
     \lowercase{\endgroup\def~{\leavevmode\kern\z@\char`#1}}}

which means that it expands to

\leavevmode\kern\z@\char`\,

and thus a following space is swallowed. So either one has to add an additional space into the definition of \do@noligs or one has to change \@xobeysp to expand to something like

\def\@xobeysp{\kern\z@\space}%

so that there is always something between the \char and the \space. The latter approach doesn't work for spaces generated by a line break in the source, as this is by default translated into a straight space (which would be swallowed). So either one has to change the behavior of ^^M (end of line) or one has to use the fix to \do@noligs.
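The difference between the two definitions can be checked in a small test file. This is only a sketch: \viaDef and \viaChardef are hypothetical names chosen for the demonstration, and \space is used to supply the space token because a literal space after a control word would already be dropped by TeX's tokenizer.

```latex
\documentclass{article}
\begin{document}
\ttfamily
% \viaDef and \viaChardef are made-up names for this test only.
\def\viaDef{\char`\%}      % macro: the constant `\% ends in "one optional space"
\chardef\viaChardef=`\%    % chardef token: nothing is scanned after it

% After the alphabetic constant, TeX expands the next token while looking
% for the optional space, so the space produced by \space is swallowed.
a\viaDef\space b

% The chardef token simply typesets the character; the space survives.
a\viaChardef\space b
\end{document}
```

The first paragraph should come out without a space between the percent sign and the b, the second one with it.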

Update 3

Oh well :-) in normal paragraphs (and that was originally requested) spaces at line breaks vanish and this is what the current solution does. If this is not desired then one possibility is to change the definition of \@xobeysp in the code above.

Let's assume we have the following input:

\begin{myverbatim}
xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345,        123,                                   1234567890,

xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345, \\ 123, 1234567890,

xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345, \\
123, 1234567890,
\end{myverbatim}

The current solution above results in

[image: output of the input above]

i.e., spaces at the natural and the forced line break vanish. Now if we instead use the following definition:

\def\@xobeysp{\penalty\z@\char 32 \penalty\z@}%

This typesets the character in position 32 of the font (the space) with a break allowed before and after, and we get the following output:

[image: output with the spaces typeset as characters]

There are penalties on both sides so that a break can be taken before or after the space character. By default the best break will be the last one, which means the spaces tend to end up on the right side, but in an emergency a break can be taken before the first space char in a row.

Alternatively we could use:

\def\@xobeysp{\mbox{}\space\penalty\z@}%

in which case a space will not vanish into the left margin and we get the same results only that spaces are now "blanks again".

[image: output with blank spaces kept after line breaks]

(by the way the indentation is 15 points, a space in typewriter seems to be 5.24995pt which means that the indentation is nearly but not quite 3 spaces, so this might be an area for improvement)
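With the numbers quoted above, 15pt / 5.24995pt is about 2.86, so the indentation falls short of three typewriter spaces by roughly 0.75pt. One possible improvement is to derive the indentation from the font itself. The sketch below only shows the idea in an ordinary typewriter paragraph; where such a setting would have to go inside the environment is left open, since the typewriter font must already be selected at that point.

```latex
\documentclass{article}
\begin{document}
{\ttfamily
 % \fontdimen2 is the interword space of the current font. In a monospaced
 % font such as cmtt it is the same as the width of every character, so
 % three times that value is exactly three typewriter spaces.
 \parindent=3\fontdimen2\font
 This paragraph is indented by \the\parindent, i.e. three spaces.\par}
\end{document}
```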

Update 4

With the OP further clarifying what he is looking for, a new requirement showed up:

  • an empty line should result in an empty line in the output, i.e., it should not just end the paragraph (as normally happens in TeX) but explicitly generate an empty line; two empty lines should thus generate two, etc.

So I updated the code above once more to include explicit handling of the end of line. For this, ^^M is made active (i.e., it calls a command) rather than doing its standard magic. That command then checks whether it is followed by another ^^M. If not, it executes \@xobeysp (generating whatever is set up for a normal space). On the other hand, if another newline character follows (i.e., if we have an empty line), it generates a \par followed by \vskip\baselineskip to produce vertical space equivalent to an empty line. It then continues to look for further newline characters to generate more vertical space if necessary.

There are two boundary cases that need special handling, which makes the code even worse: after \begin{myverbatim} there is typically a newline character that should not trigger this behavior, and just before \end{myverbatim} there is another one that shouldn't generate a space. The code above handles the first case well. The second does not work correctly if \end{myverbatim} is not on a line by itself, but there you go ... this answer is already much longer than it was ever intended to be :-)
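The end-of-line logic described in this update can be tried on its own, without any of the verbatim machinery. The following is a simplified sketch, not the code of the environment: the names \demoeol, \demo@eol, \demo@par and \demo@more are hypothetical, a plain \space stands in for \@xobeysp, and the two boundary cases are not handled.

```latex
\documentclass{article}
\makeatletter
\begingroup
\catcode`\^^M=\active%
% From here to \endgroup every code line ends with % so that no stray
% active end-of-line character is read.
\gdef\demoeol{\catcode`\^^M=\active \let^^M\demo@eol}%
% A single end of line acts as a space, unless another one follows.
\gdef\demo@eol{\@ifnextchar^^M{\demo@par}{\space}}%
% An empty line: end the paragraph, add one line of vertical space,
% then look for further empty lines.
\gdef\demo@par^^M{\par\vskip\baselineskip\demo@more}%
\gdef\demo@more{\@ifnextchar^^M{\demo@par}{}}%
\endgroup
\makeatother
\begin{document}
\begingroup\demoeol
first line
second line

after one empty line


after two empty lines
\endgroup
\end{document}
```

The first two input lines should be joined into one paragraph, and each empty input line should add one \baselineskip of space before the next paragraph.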

If we again use the above example but add two blank lines in we now get the following:

[image: output of the example with two blank lines added]

  • Thank you very much! Seems to also work.. :) Maybe it should also include \usepackage[T1]{fontenc}.. Then I could use norwegian letters like å in the verbatim text... – Håkon Hægland May 12 '13 at 20:40
  • It would be nice to have a special symbol that could force a line break, e.g. "\\". How can this be done? – Håkon Hægland May 13 '13 at 06:56
  • @HåkonHægland what symbol would you like? The point is, any symbol you use can't appear inside your verbatim text or only if escaped, eg by doubling it – Frank Mittelbach May 13 '13 at 10:13
  • I am not sure I understand.. Why is it not possible to use two backslashes "\\"? (But entering a single "\" should be treated as any other character) – Håkon Hægland May 13 '13 at 10:26
  • It is possible, I thought you wanted some other char to generate a line break. Anyway, I updated the answer to implement \\ as a linebreak – Frank Mittelbach May 13 '13 at 15:36
  • Thank you for all the help! Now I got a new problem with lines becoming too short :) See comment at the end of my question.. – Håkon Hægland May 14 '13 at 11:22
  • @HåkonHægland looks like you are hit by a documentation bug in the TeXbook which resulted in an implementation bug in LaTeX which in turn (due to my reuse of the verbatim code) generated trouble for the first time in 30 years :-) – Frank Mittelbach May 14 '13 at 16:59
  • I'd not say it's a documentation error; it's rather one of the many white lies in the TeXbook; the syntax rules say clearly that an optional space is swallowed after all constants, so \char`\% will swallow a space, while \% doesn't. In fact \chardef was introduced just because of this problem, I believe (and then the possibility of using a chardef token in the context of a <number> was added). – egreg May 14 '13 at 17:18
  • @egreg yes and no. It is one of the white lies technically speaking, as indeed in the syntax section it is documented that there is an optional space there, but given how Don admits to one lie on that page immediately makes you think the rest of the statement is correct. Strangely enough in his book "TeX and Metafont: new directions in typesetting" (1979) the definition is still \def\%{\char'45 } % Note, the space is needed ... My feeling is that with the introduction of the backquote syntax that got thrown out (by mistake and not deliberately to tell lies) – Frank Mittelbach May 14 '13 at 17:27
  • Thanks again, it now seems to work perfectly:) I am very impressed with your solution.. – Håkon Hægland May 14 '13 at 17:33
  • @FrankMittelbach The annotation "Introduce \chardef, analogous to \mathchardef" in errorlog.tex is dated 19 Jan 1983. ;-) – egreg May 14 '13 at 17:41
  • Is there a way not to swallow spaces in the beginning of a line? (If a line starts with e.g. 3 spaces, they seem to be swallowed) – Håkon Hægland May 14 '13 at 18:02
  • @HåkonHægland sure, what do you pay? ;-) Well more to the point what would be the spec? you can't see spaces at the end of the line (and normally they would be placed there, so it is a bit questionable to have some of them show up in front. You can of course show the spaces as visible characters (like in \verb*) then it would make more sense in my eyes – Frank Mittelbach May 14 '13 at 19:32
  • I am not sure.. I do not think spaces at the end of the line are important, but spaces in the beginning are important (e.g. to show indentation) especially after a forced line break "\\", the spaces should not be swallowed.. – Håkon Hægland May 14 '13 at 19:58
  • If you wonder what I am up to: I planning to use this to format email correspondence :) – Håkon Hægland May 14 '13 at 20:12
  • Thanks again, your last update also seems to work fine :) .. I think I'll now try to understand how your code works such that I (hopefully) can do some modifications myself and you can get a well-earned break :) – Håkon Hægland May 15 '13 at 06:51
  • As it stands now, an empty line will produce a new paragraph with indentation. However, this is not exactly what I want.. How can I add som extra vertical space between paragraphs such that an empty line actually becomes an empty line in the output? (PS: this has no hurry, I just thought if you have time you could have a look at it:) ) – Håkon Hægland May 15 '13 at 07:13
  • Seems like I can add \parskip= "height of a line" within the definition of \obeylines.. I tried 1.2ex and it seems to work fine.. Is there a way to determine the line height exactly? – Håkon Hægland May 15 '13 at 11:13
  • New issue: There seems to be a problem with the characters > and <.. If a word ends with one of these two characters, the space after seems to be gobbled and no line break occurs.. – Håkon Hægland May 15 '13 at 11:35
  • @HåkonHægland it doesn't for me with the updates out of 2 or 3. So my guess is you tried it on the original code. There yes, not only , but also < > and a few other chars would show this behavior. If not you might want to mail me your code. – Frank Mittelbach May 15 '13 at 14:49
  • Thank you for the last update! Seems to work also :). Thank you for putting comments in the source code.. I will have a look at it.. as for now it looks very complicated though:) – Håkon Hægland May 17 '13 at 18:51
  • I am trying to figure out how I can reuse the environment. I would like to have one version that produces, say, slanted text in, e.g. red color, and another version that produces text in black in the upright font.. I do not want to have two copies of the (rather long) code for each case, where the only difference is a single line... I know I can deal with this by adding a parameter (#1) to the environment where I supply the color.. This is fine, but I would rather not like to write \begin{myverbatim}[red] and \begin{myverbatim}[black] each time.. – Håkon Hægland May 17 '13 at 20:27
  • (contd.) Is it possible to define two new environments, say myverbA and myverbB that produces the same effect? It seems it is not possible to define new environments that extends this verbatim environment, see, e.g. Defining a new environment extending a verbatim environment.. – Håkon Hægland May 17 '13 at 20:31
  • It is not a big issue, but it seems that the code will not work together with \usepackage{verbatim}.. (I included the verbatim package by a mistake and got a strange error :) ) – Håkon Hægland May 18 '13 at 12:27

You just want the definition of verbatim without the \obeylines part:

[image: output of the code below]

\documentclass{article}

\usepackage[T1]{fontenc}
\makeatletter
\newenvironment{exB}{%
  \trivlist
  \item\relax
   \let\do\@makeother \dospecials
    \verbatim@font \@noligs
  \hyphenchar\font\m@ne
  \catcode`\ \active
  \catcode`\^^M\active
  \catcode`\\\active
  \lccode`\~`\\%
  \lowercase{\let~\getendxB}%
  \lccode`\~`\^^M%
  \lowercase{\let~\space}%
}
  {\endtrivlist}

\def\getendxB#1#2#3#4#5#6#7#8{%
 \def\a{#1#2#3#4#5#6#7#8}%
 \ifx\a\endxB
  \end{exB}%
 \else
  \textbackslash
  \expandafter\a
 \fi
 }

\edef\endxB{end\string{exB\string}}

\makeatother

\begin{document}
\begin{exB}
Long line Long line \ # ^ & Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line % ^ & $ \ ---
Short Line 
Short Line 
Short Line 
\end{exB}

\end{document}
David Carlisle
  • How can I change the name of the environment? If replace exB with exBB the code will not compile anymore.. – Håkon Hægland May 12 '13 at 21:32
  • @HåkonHægland using the technique used there the env name can be at most 4 characters; if you want 4, add #9 to both lists ...#7#8 and then change exB to exBB everywhere. If you want longer names you need a more complicated end environment check (as \ doesn't have its usual meaning, \end{..} doesn't execute automatically, you have to look for it by hand). – David Carlisle May 12 '13 at 21:37
  • It would be nice to have the possibility of choosing a name with more than 4 characters.. – Håkon Hægland May 12 '13 at 21:44
  • Frank used a more conventional test – David Carlisle May 12 '13 at 21:46
  • @DavidCarlisle guess your code will have the same issues with lig chars as mine did – Frank Mittelbach May 14 '13 at 17:02
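
The renaming described in David Carlisle's comment above (a four-character name needs #9 in both parameter lists) could look as follows. This is a sketch that copies the answer's code and only changes the name to the hypothetical exBB; nine parameters is the maximum TeX allows, which is why this technique stops at four characters.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\makeatletter
\newenvironment{exBB}{%
  \trivlist
  \item\relax
  \let\do\@makeother \dospecials
  \verbatim@font \@noligs
  \hyphenchar\font\m@ne
  \catcode`\ \active
  \catcode`\^^M\active
  \catcode`\\\active
  \lccode`\~`\\%
  \lowercase{\let~\getendxBB}% active backslash checks for the end of the environment
  \lccode`\~`\^^M%
  \lowercase{\let~\space}%     active end of line acts as a space
}
  {\endtrivlist}

% "end{exBB}" has nine characters, so nine arguments are collected.
\def\getendxBB#1#2#3#4#5#6#7#8#9{%
 \def\a{#1#2#3#4#5#6#7#8#9}%
 \ifx\a\endxBB
  \end{exBB}%
 \else
  \textbackslash
  \expandafter\a
 \fi
 }

\edef\endxBB{end\string{exBB\string}}
\makeatother

\begin{document}
\begin{exBB}
Long line Long line \ # ^ & Long line Long line Long line Long line Long line Long line Long line Long line % ^ & $ \ ---
Short Line
Short Line
\end{exBB}
\end{document}
```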