
I have a code snippet that uses the Unicode character U+2019 (right single quotation mark) in the middle of a word. Unfortunately, the lstlisting environment makes this character behave oddly, along with the other curly quotation marks (U+2018, U+201C, U+201D): each one is moved to the beginning of whatever word it is in. The source code that reproduces the problem is

\documentclass[
]{article}
\usepackage{listings}

\begin{document}

\begin{lstlisting}[language=Python]
"It’s a be”auti“ful day in the nei‘borhood"
\end{lstlisting}

It’s a be”auti“ful day in the nei‘borhood

\end{document}

and I would have expected it to produce roughly the same output inside and outside the code snippet. Instead, lstlisting moves each quotation mark to the beginning of the word it interrupts. I am using xelatex.

pouli
  • 31

1 Answer


As @user202729 pointed out, this is a duplicate question. However, I figured I'd also post the solution that worked for me, just in case anyone in the future is trying to make this work. Adding

\makeatletter
\lst@InputCatcodes
\def\lst@DefEC{%
 \lst@CCECUse \lst@ProcessLetter
  ^^^^2018^^^^2019^^^^201c^^^^201d% punctuation
  ^^00}
\lst@RestoreCatcodes
\makeatother

to the preamble (before \begin{document}) tells the listings package to add those Unicode characters to the list of characters it processes itself. Additionally, make sure that you aren't setting extendedchars=false: the option seems to be on by default, and it needs to stay on for this to work.

As @user202729 mentions in the comments, it's also possible to just use the character and not the full character codes; i.e. replace ^^^^2018^^^^2019^^^^201c^^^^201d with ‘’“”.
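
For anyone who wants to test this quickly, here is a sketch of a complete file that combines the example from the question with the patch above. Nothing in it is new apart from the comments and the empty optional argument to \documentclass being dropped. It assumes compilation with xelatex, as in the question. Note the caveat raised in the comments: the patch redefines \lst@DefEC instead of extending it, so other non-ASCII characters may stop working inside listings.

```latex
% Compile with xelatex.
% Caution: the patch below replaces the definition of \lst@DefEC,
% so characters covered by the original definition are no longer listed.
\documentclass{article}
\usepackage{listings}

\makeatletter
\lst@InputCatcodes
\def\lst@DefEC{%
 \lst@CCECUse \lst@ProcessLetter
  ^^^^2018^^^^2019^^^^201c^^^^201d% curly quotation marks
  ^^00}
\lst@RestoreCatcodes
\makeatother

\begin{document}

\begin{lstlisting}[language=Python]
"It’s a be”auti“ful day in the nei‘borhood"
\end{lstlisting}

It’s a be”auti“ful day in the nei‘borhood

\end{document}
```

With the patch in place, the quotation marks inside the listing should stay where they were typed, matching the plain-text line below it.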

pouli
  • 31
  • Side note: as I mentioned in a comment there, replacing ^^^^2018^^^^2019^^^^201c^^^^201d with ‘’“” typed in directly also works, which saves looking up the codes if you can type the characters. – user202729 Aug 12 '22 at 00:17
  • Your code doesn't extend the handling of non-ASCII characters; it replaces the existing definition. That means other non-ASCII characters will now break. Try e.g. Grüße. – Ulrike Fischer Dec 08 '22 at 09:03