Curly quotes in lstlistings get moved to the start of words

Question

I have a code snippet that uses the Unicode character ’ (U+2019, right single quotation mark) in the middle of a word. Unfortunately, the lstlisting environment makes this character behave oddly, along with the other curly quotation marks (U+2018, U+201C, U+201D): each one is moved to the beginning of whatever word it is in. The source code for the image below is

\documentclass[
]{article}
\usepackage{listings}
\begin{document}
\begin{lstlisting}[language=Python]
"It’s a be”auti“ful day in the nei‘borhood"
\end{lstlisting}
It’s a be”auti“ful day in the nei‘borhood
\end{document}

and I would have expected roughly the same output inside and outside the listing, but for whatever reason lstlisting moves each quotation mark to the beginning of the word it interrupts. I am using xelatex.

(It's a duplicate of "unicode - Having problems with listings and UTF-8. Can it be fixed?" if the engine is pdfLaTeX, and a duplicate of "xetex - The 'listings' package and UTF-8" otherwise.) — user202729, Aug 11 '22 at 10:07
@user202729 Thank you! I looked for a duplicate, but I didn't assume it was because of UTF-8 encoding. Glad to get this cleared up — pouli, Aug 11 '22 at 18:25

pouli · Accepted Answer · 2022-08-15T21:00:40.247


As @user202729 pointed out, this is a duplicate question. However, I figured I'd also post the solution that worked for me, just in case anyone in the future is trying to make this work. Adding

\makeatletter
\lst@InputCatcodes
\def\lst@DefEC{%
 \lst@CCECUse \lst@ProcessLetter
  ^^^^2018^^^^2019^^^^201c^^^^201d% punctuation
  ^^00}
\lst@RestoreCatcodes
\makeatother

to the preamble (before \begin{document}) tells the listings package to add those Unicode characters to the list of characters it processes. Also make sure that you aren't setting extendedchars=false: it seems to be on by default, and it needs to be on for this to work.

As @user202729 mentions in the comments, it's also possible to just use the character and not the full character codes; i.e. replace ^^^^2018^^^^2019^^^^201c^^^^201d with ‘’“”.
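For reference, here is a minimal sketch of a complete document that puts the pieces together: the test line from the question, with the patch in the preamble and the four quotation marks typed in directly. It assumes xelatex and a UTF-8 encoded source file. Nothing here is new API; the comments inside the block are only annotations.

```latex
% Compile with xelatex; the file must be saved as UTF-8.
\documentclass{article}
\usepackage{listings}

\makeatletter
\lst@InputCatcodes
% Redefine the list of extended characters that listings processes.
\def\lst@DefEC{%
  \lst@CCECUse \lst@ProcessLetter
  ‘’“”% curly quotes typed directly (U+2018, U+2019, U+201C, U+201D)
  ^^00}
\lst@RestoreCatcodes
\makeatother

\begin{document}
\begin{lstlisting}[language=Python]
"It’s a be”auti“ful day in the nei‘borhood"
\end{lstlisting}
\end{document}
```

Because this redefines \lst@DefEC rather than extending it, only the characters listed in the definition are covered. Any other non-ASCII characters that appear in your listings would need to be added to the same list.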

edited Aug 15 '22 at 21:00

answered Aug 11 '22 at 18:50

pouli


side note, as I mentioned in a comment there replacing ^^^^2018^^^^2019^^^^201c^^^^201d with typing ‘’“” in directly also works, saves the need of looking up the code if you can type the characters in directly – user202729 Aug 12 '22 at 00:17
your code doesn't extend the handling of non-ascii, it replaces the existing definition. That means that other non-ascii chars will now break, try e.g. Grüße. – Ulrike Fischer Dec 08 '22 at 09:03
