
I am trying to generate a PDF that includes my XML file, which is UTF-8 encoded and contains extended Latin characters. However, Polish characters appear out of place in the listing, with both the listings and listingsutf8 packages. I don't have any Unicode problems elsewhere.

This is what XeLaTeX produces: [image: XeLaTeX output]

This is what the original XML file looks like in Notepad++: [image: Notepad++ view]

Here is my code:

\documentclass[12pt, a4paper, twoside]{article}

\usepackage[margin=2.5cm, bindingoffset=1cm, headheight=15pt]{geometry}

\usepackage[dvipsnames]{xcolor}
\usepackage[utf8]{inputenc}
\usepackage{listings}

\definecolor{Maroon}{rgb}{0.5,0,0}
\definecolor{darkgreen}{rgb}{0,0.5,0}

\lstdefinelanguage{XML_SYNTAX}{%
  morekeywords={id},
  alsoletter=-,
  morestring=[b]",
  stringstyle=\color[rgb]{0,0,1},
  morecomment=[s]{<?}{?>},
  morecomment=[s]{<!--}{-->},
  morecomment=[s]{<!}{>},
  commentstyle=\color{darkgreen},
  moredelim=[s][\color{black}]{![}{]]},
  moredelim=*[s][\color{Maroon}]{<}{>},
  keywordstyle=\color{red}
}

\lstset{
  % Basic design
  backgroundcolor=\color[rgb]{0.8,0.8,0.8},
  basicstyle={\small\ttfamily},
  breaklines=true,
  frame=l,
  tabsize=2,
  % Line numbers
  xleftmargin={1.25cm},
  numbers=left,
  stepnumber=1,
  firstnumber=1,
  numberfirstline=true,
  % HTML formatting
  language=XML_SYNTAX,
  inputencoding=utf8,
  extendedchars=true,
}

\begin{document}
\begin{lstlisting}[%
  language=XML_SYNTAX]
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<raw-text>
spółgłoska wargowo-zębowa bezdźwięczna, czyli cicha (mocna), powiewna, przeciągła. W piśmie występuje przeważnie w wyrazach przyswojonych, w polskich zaś rzadko i głównie w wyrazach dźwiękonaśladowczych, jak fruwać, fiukać i t. p.; w wymawianiu zaś ukazuje się często, chociaż się pisze w, mianowicie na końcu wyrazów i w środku po innych cichych, np. krew, łów, … kwiat, twój, trwały, … które brzmią: kref, łuf, … kfiat, tfuj, trfały…
</raw-text>
\end{lstlisting}
\end{document}

  • Maybe https://tex.stackexchange.com/questions/24528/having-problems-with-listings-and-utf-8-can-it-be-fixed can help? Unfortunately, I do not think the characters you need are listed there, but it should be possible to add them... – Rmano Jan 31 '21 at 17:25
  • @Rmano I tried, but haven't been able to successfully implement that solution, unfortunately. – MrVocabulary Jan 31 '21 at 17:29
  • I do not know the plain TeX commands for the problematic characters, so I am sorry, I am unable to help. Also, I noticed that you use xelatex; you should tag that. And by the way, you could try to reduce the example: just one listing line such as <test=" ł þ"> ł þ </test>, covering all your non-ASCII characters, should suffice, I think. – Rmano Jan 31 '21 at 17:32
  • I can confirm that the literate trick does not seem to work, and with a recent TeX Live 2020 I get even stranger output: https://i.stack.imgur.com/Rix6S.png – Rmano Jan 31 '21 at 17:42
  • @Rmano thanks for checking. I am not very experienced with LaTeX, but I haven't even been able to find a similar bug anywhere. It is weird. – MrVocabulary Jan 31 '21 at 17:50
  • I tried removing inputenc (which you should not use with xelatex) and adding babel too. Strange. Let's see if some encoding expert sees this. – Rmano Jan 31 '21 at 17:52
  • But please reduce the example. Leaving only the excerpt between <raw-text>...</raw-text> will suffice and make things simpler for everybody. – Rmano Jan 31 '21 at 17:56
  • See https://tex.stackexchange.com/a/25396/2388 – Ulrike Fischer Jan 31 '21 at 17:58
  • See? The expert answered. – Rmano Jan 31 '21 at 18:08
  • @Rmano this is great! I wonder why I haven't come across that thread. Now I just have to understand that… – MrVocabulary Jan 31 '21 at 18:09
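
A reduced test case along the lines suggested in the comments might look like the sketch below. It is an assumption, not code posted in the thread: it drops inputenc (as advised for xelatex), keeps only the listings package, and puts the lowercase Polish letters on a single listing line. The element name test and the attribute name attr are made up for illustration. It is meant to reproduce the problem when compiled with XeLaTeX, not to fix it.

```latex
% Reduced test case (illustrative sketch); compile with XeLaTeX.
% No inputenc: XeLaTeX reads UTF-8 input natively.
\documentclass{article}
\usepackage{listings}

% Monospaced listing font, as in the original example.
\lstset{basicstyle=\ttfamily}

\begin{document}
\begin{lstlisting}
<test attr="ą ć ę ł ń ó ś ź ż"> ą ć ę ł ń ó ś ź ż </test>
\end{lstlisting}
\end{document}
```

If the characters still come out misplaced with this file, the colour and language definitions in the full example can be ruled out as the cause.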
