Extended Latin characters appear out of place in \listings

Question

I am trying to generate a PDF containing my XML file, which is UTF-8 encoded and contains extended Latin characters. However, Polish characters appear out of place with both the listings and listingsutf8 packages. I don't have any Unicode problems elsewhere.

This is what XeLaTeX produces:

This is what the original XML file looks like in Notepad++:

Here is my code:

\documentclass[12pt, a4paper, twoside]{article}
\usepackage[margin=2.5cm, bindingoffset=1cm, headheight=15pt]{geometry}
\usepackage[dvipsnames]{xcolor}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\definecolor{Maroon}{rgb}{0.5,0,0}
\definecolor{darkgreen}{rgb}{0,0.5,0}
\lstdefinelanguage{XML_SYNTAX}{%
    morekeywords={id},
    alsoletter=-,
    morestring=[b]",
    stringstyle=\color[rgb]{0,0,1},
    morecomment=[s]{<?}{?>},
    morecomment=[s]{<!--}{-->},
    morecomment=[s]{<!}{>},
    commentstyle=\color{darkgreen},
    moredelim=[s][\color{black}]{![}{]]},
    moredelim=*[s][\color{Maroon}]{<}{>},
    keywordstyle=\color{red}
}
\lstset{
    % Basic design
    backgroundcolor=\color[rgb]{0.8,0.8,0.8},
    basicstyle={\small\ttfamily},
    breaklines=true,
    frame=l,
    tabsize=2,
    % Line numbers
    xleftmargin={1.25cm},
    numbers=left,
    stepnumber=1,
    firstnumber=1,
    numberfirstline=true,
    % XML formatting
    language=XML_SYNTAX,
    inputencoding=utf8,
    extendedchars=true,
}
\begin{document}
\begin{lstlisting}[%
    language=XML_SYNTAX]
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<raw-text>
    spółgłoska wargowo-zębowa bezdźwięczna, czyli cicha (mocna), powiewna, przeciągła. W piśmie występuje przeważnie w wyrazach przyswojonych, w polskich zaś rzadko i głównie w wyrazach dźwiękonaśladowczych, jak fruwać, fiukać i t. p.; w wymawianiu zaś ukazuje się często, chociaż się pisze w, mianowicie na końcu wyrazów i w środku po innych cichych, np. krew, łów, … kwiat, twój, trwały, … które brzmią: kref, łuf, … kfiat, tfuj, trfały…
    </raw-text>
\end{lstlisting}
\end{document}

Maybe https://tex.stackexchange.com/questions/24528/having-problems-with-listings-and-utf-8-can-it-be-fixed can help? Unfortunately, I do not think that the characters you need are there, but it should be possible to add them... — Rmano, Jan 31 '21 at 17:25
@Rmano I tried, but haven't been able to successfully implement that solution, unfortunately. — MrVocabulary, Jan 31 '21 at 17:29
I do not know which plain TeX commands produce the problematic chars, so I am sorry I am unable to help. Also, I noticed that you use xelatex; you should tag that. And btw, you could try to reduce the example: just one listing line with <test=" ł þ"> ł þ </test> listing all your non-ascii chars should suffice, I think. — Rmano, Jan 31 '21 at 17:32
I can confirm that the literate trick does not seem to work, and that with a recent TeX Live 2020 I get an even stranger output: https://i.stack.imgur.com/Rix6S.png — Rmano, Jan 31 '21 at 17:42
@Rmano thanks for checking. I am not very experienced with LaTeX, but I haven't even been able to find a similar bug anywhere. It is weird. — MrVocabulary, Jan 31 '21 at 17:50
I tried removing inputenc (which you should not use with xelatex) and adding babel too. Strange. Let's see if some encoding expert sees this. — Rmano, Jan 31 '21 at 17:52
But please reduce the example. Leaving only the excerpt between <raw-text>...</raw-text> will suffice and make things simpler for everybody. — Rmano, Jan 31 '21 at 17:56
@Rmano this is great! I wonder why I haven't come across that thread. Now I just have to understand that… — MrVocabulary, Jan 31 '21 at 18:09
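
Following the suggestions in the comments, a reduced test case might look like the sketch below. It assumes compilation with XeLaTeX, drops inputenc as advised above, and loads fontspec (an assumption, not part of the original code) so that the default fonts cover the Polish glyphs. The sample line is the one Rmano proposed; whether it reproduces the misplaced characters has not been verified here.

% Reduced test case (sketch); compile with xelatex.
% Assumption: fontspec is loaded so the default fonts contain the non-ASCII glyphs.
\documentclass{article}
\usepackage{fontspec}
\usepackage{listings}
\lstset{
    basicstyle={\small\ttfamily},
    breaklines=true,
}
\begin{document}
\begin{lstlisting}
<test=" ł þ"> ł þ </test>
\end{lstlisting}
\end{document}

If the characters still appear out of place here, the custom XML_SYNTAX language definition and the colour settings can probably be ruled out as the cause.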

0 Answers