
When I used the listings package to import a C# code snippet into my LaTeX file, I got the following output.

[image: typeset output of the listing]

On the first line there is an unwanted space caused by the BOM (byte order mark) that Microsoft Visual Studio intentionally adds.

I want to remove it from the output rather than from the C# file. How can I do that?

Here is my LaTeX code:

\documentclass{article}
\usepackage{listings,xcolor}
\lstset{%
language={[Sharp]C},
backgroundcolor=\color{yellow!20},
basicstyle=\tiny,
keywordstyle=\color{blue},
identifierstyle=\color{magenta},
breaklines=true}


\begin{document}
\lstinputlisting{Program.cs}
\end{document}
Display Name

3 Answers


The space appears only because you are using an OT1-encoded font. With T1 encoding you would see the three characters encoded by the three octets of the BOM. Assuming that none of them is used in the listing and that your main document is 8-bit encoded, you can replace them with nothing using the literate option:

 \documentclass{article}
 \usepackage[T1]{fontenc}
 \usepackage{listings}
 \usepackage{xcolor}
 \lstset{%
 language={[Sharp]C},
 backgroundcolor=\color{yellow!20},
 basicstyle=\tiny,
 keywordstyle=\color{blue},
 identifierstyle=\color{magenta},
 breaklines=true}

 \lstset{
   literate={ï}{}0
            {»}{}0
            {¿}{}0
 }
 \begin{document}
 \lstinputlisting{test-bom.txt}
 \end{document}
Ulrike Fischer
  • Thanks for the solution. However, I think the BOM preceding the keyword "using" prevents listings.sty from recognizing it as a C# keyword. In addition, the result still contains unwanted space. Please see EDIT 2 in my post above. – Display Name Nov 26 '10 at 01:20
  • The problem is the second set of braces in the literate definition, which I forgot to delete. I have now deleted them in the code. – Ulrike Fischer Nov 26 '10 at 08:35
  • @Ulrike, @xport: is this actually safe? I assume that the C# source code is utf-8 throughout? – Taco Hoekwater Nov 26 '10 at 10:39
  • Well, as I said: assuming that none of them is used in the file. If the listing is ASCII there should be no problem. But thinking about it, literate={ï»¿}{}0 is probably better. – Ulrike Fischer Nov 26 '10 at 10:59
  • As I recall, the BOM, U+FEFF (which is stupid for UTF-8, since there is no byte order), is not a valid Unicode character, so there should be no danger of it appearing elsewhere. (Okay, that's not quite true. If it appears in the middle of the stream, it's supposed to be treated as a zero-width non-breaking space.) That said, Ulrike's comment just above this one is probably the way to go. – TH. Nov 26 '10 at 14:08
  • The current version of listings doesn't handle UTF-8, and the OP doesn't use listingsutf8. So the code probably doesn't contain non-ASCII characters, which makes this more of an academic discussion. The literate={ï»¿}{}0 variant looks cleaner, but the split variant should work fine too. – Ulrike Fischer Nov 26 '10 at 15:43
  • @Ulrike: You mean that if I use listingsutf8 then I no longer need literate={ï»¿}{}0? – Display Name Jun 19 '11 at 00:08
  • @xport: Why don't you try it yourself? And if it doesn't work, ask whether and how the BOM can be removed. – Ulrike Fischer Jun 19 '11 at 10:07
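
For reference, the single-entry form mentioned in the comments above could be written roughly as follows. This is an untested sketch under the same assumptions as the answer: pdflatex, a main .tex file saved in an 8-bit encoding, and a listing that is otherwise plain ASCII. The file name test-bom.txt is taken from the answer.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{listings}
\lstset{%
  language={[Sharp]C},
  % Match the three BOM bytes (EF BB BF) as one sequence and print nothing.
  % Assumes this .tex file is saved in an 8-bit encoding such as Latin-1,
  % so that each of the three characters below is a single byte.
  literate={ï»¿}{}0
}
\begin{document}
\lstinputlisting{test-bom.txt}
\end{document}
```

Matching the whole sequence, rather than each character separately, should leave an isolated ï, » or ¿ elsewhere in the listing untouched.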

Perhaps this is not a usable solution for you, but using either lualatex or xelatex instead of pdflatex fixes it.
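
A minimal sketch of this approach, reusing the document from the question unchanged. The only difference is the engine used to compile; the file name main.tex in the first comment is a placeholder, not something from the question.

```latex
% Compile with:  lualatex main.tex   (or: xelatex main.tex)
\documentclass{article}
\usepackage{listings,xcolor}
\lstset{%
  language={[Sharp]C},
  backgroundcolor=\color{yellow!20},
  basicstyle=\tiny,
  keywordstyle=\color{blue},
  identifierstyle=\color{magenta},
  breaklines=true}

\begin{document}
% Program.cs is the BOM-prefixed file from the question, left unmodified.
\lstinputlisting{Program.cs}
\end{document}
```

Both engines read UTF-8 input natively, so the BOM is no longer seen as three separate 8-bit characters.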

Taco Hoekwater
  • With all due respect to @UlrikeFischer, this is a much better solution, if you can go with the XeTeX or LuaTeX engines (it's the future knocking on your door!!). If your code, one day, ever needs some French quotes, or Spanish questions, you're going to be mystified. As in ¿« Maïz » en francés quiere decir "Maíz"? – Brent.Longborough Mar 13 '12 at 09:01

I had the same problem, and found a simple solution.

\usepackage[utf8x]{inputenc}

\lstset{ 
  extendedchars=\true
}

gives output with correct colouring and no extra space.

Vaulty
  • As noted in the comments to http://tex.stackexchange.com/questions/19210/how-to-prevent-extendedchars-true-from-producing-a-blank-line, this only works by accident and is not a valid option for listings. – Caramdir May 27 '11 at 19:14
  • I believe [utf8] is preferred over [utf8x], which is obsolescent. – Brent.Longborough Mar 13 '12 at 09:02