I am trying to generate a PDF that includes my XML file, which is UTF-8 encoded and contains extended Latin characters. However, Polish characters appear out of place in the listing, with both the listings and listingsutf8 packages. I don't have any Unicode problems elsewhere.
This is what XeLaTeX produces:

This is what the original XML file looks like in Notepad++:

Here is my code:
\documentclass[12pt, a4paper, twoside]{article}
\usepackage[margin=2.5cm, bindingoffset=1cm, headheight=15pt]{geometry}
\usepackage[dvipsnames]{xcolor}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\definecolor{Maroon}{rgb}{0.5,0,0}
\definecolor{darkgreen}{rgb}{0,0.5,0}
\lstdefinelanguage{XML_SYNTAX}{%
morekeywords={id},
alsoletter=-,
morestring=[b]",
stringstyle=\color[rgb]{0,0,1},
morecomment=[s]{<?}{?>},
morecomment=[s]{<!--}{-->},
morecomment=[s]{<!}{>},
commentstyle=\color{darkgreen},
moredelim=[s][\color{black}]{![}{]]},
moredelim=*[s][\color{Maroon}]{<}{>},
keywordstyle=\color{red}
}
\lstset{
% Basic design
backgroundcolor=\color[rgb]{0.8,0.8,0.8},
basicstyle={\small\ttfamily},
breaklines=true,
frame=l,
tabsize=2,
% Line numbers
xleftmargin={1.25cm},
numbers=left,
stepnumber=1,
firstnumber=1,
numberfirstline=true,
% XML formatting
language=XML_SYNTAX,
inputencoding=utf8,
extendedchars=true,
}
\begin{document}
\begin{lstlisting}[%
language=XML_SYNTAX]
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<raw-text>
spółgłoska wargowo-zębowa bezdźwięczna, czyli cicha (mocna), powiewna, przeciągła. W piśmie występuje przeważnie w wyrazach przyswojonych, w polskich zaś rzadko i głównie w wyrazach dźwiękonaśladowczych, jak fruwać, fiukać i t. p.; w wymawianiu zaś ukazuje się często, chociaż się pisze w, mianowicie na końcu wyrazów i w środku po innych cichych, np. krew, łów, … kwiat, twój, trwały, … które brzmią: kref, łuf, … kfiat, tfuj, trfały…
</raw-text>
\end{lstlisting}
\end{document}
xelatex, you should tag that. And btw, you could try to reduce the example: just one listing line with <test=" ł þ"> ł þ </test> listing all your non-ascii chars should suffice... I think... – Rmano Jan 31 '21 at 17:32
inputenc (which you should not use with xelatex) and adding babel too. Strange. Let's see if some encoding expert sees this. – Rmano Jan 31 '21 at 17:52
<raw-tex>...</raw-text> will suffice and make things simpler for everybody – Rmano Jan 31 '21 at 17:56
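For reference, a reduced test file along the lines suggested in the comments might look like the sketch below. It is only meant for reproducing and narrowing down the problem, not as a fix. It assumes compilation with XeLaTeX, leaves out inputenc and the inputencoding key as the comments advise, and loads fontspec with its default font, which is my assumption and not part of the original code. The listing body is shortened to one line that still contains several of the Polish characters.

```latex
% Reduced test file (a sketch; assumes compilation with XeLaTeX)
\documentclass{article}
\usepackage{fontspec}   % assumption: default font, not in the original code
\usepackage{listings}
\lstset{
  basicstyle={\small\ttfamily},
  breaklines=true,
}
\begin{document}
\begin{lstlisting}
<raw-text>
spółgłoska wargowo-zębowa bezdźwięczna: krew, łów, twój, trwały
</raw-text>
\end{lstlisting}
\end{document}
```

If the characters are still misplaced with this file, the custom XML_SYNTAX language definition and the remaining \lstset options can probably be ruled out as the cause.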