
I'm trying to use the pdfcomment package with XeLaTeX, but I can't get Unicode characters such as Czech ň or ě to work properly in \pdfmarkupcomment. Trailing characters of this kind usually disappear from the output. Here's a minimal example:

\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setdefaultlanguage{czech}
\usepackage{pdfcomment}
\hypersetup{unicode}
\begin{document}
Běžně. Běžně.

\pdfmarkupcomment{Běžně}{}. Běžně. % This would be typeset as "Běžn. Běžně" leaving out "ě"
\end{document}

The pdfcomment package documentation says:

Internally, the argument ⟨comment⟩ needs to be converted to PDFDocEncoding/PDFUnicode.

and then it just says \hypersetup{unicode} in a footnote to this, which is what I put in my LaTeX document, but it does not seem to solve the problem. (I have no idea what "needs to be converted to PDFDocEncoding/PDFUnicode" means.)

Alternatively, if there is a different way to highlight arbitrary parts of text (i.e. without creating a box around them) that works with Unicode, I would be happy to use it.

  • Note that \pdfmarkupcomment wants two mandatory arguments. The problem seems to be a bug in soulpos. – egreg Mar 08 '15 at 14:50
  • Thank you for pointing this out, this was the cause of the immediately following character disappearing. I have edited the question to reflect this. – Adam Nohejl Mar 08 '15 at 15:18

1 Answer


Following egreg's hint in the comments, I found this.

It seems that soul analyses your string by typesetting a prototype in the font \SOUL@tt. If the last few characters are missing from that font, soul finds that it has just built a zero-width box. It then concludes that the current string is finished and stops processing it.

\documentclass{article}
\usepackage{fontspec,pdfcomment}
\begin{document}

\ulposdef{\uline}{\rule[-.8ex]{\ulwidth}{.5pt}}

\uline{Běžně}
\pdfmarkupcomment{Běžně}{}. Běžně. % typeset as "Běžn. Běžně": the trailing "ě" is dropped

\makeatletter
\let\SOUL@tt\ttfamily
\makeatother

\uline{Běžně}
\pdfmarkupcomment{Běžně}{}. Běžně. % with \SOUL@tt redefined, the trailing "ě" should now be kept

\end{document}
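
To check whether the missing-glyph explanation applies to a given character, something like the following sketch might help. It selects whatever font \SOUL@tt currently stands for and uses the e-TeX primitive \iffontchar to ask whether that font has the character, writing the answer to the log. The helper name \checksoulglyph is made up for this illustration and is not part of soul or pdfcomment. I have not verified the output; I would expect the first call to report the glyph as missing and the second to find it, provided the fontspec typewriter font contains "ě".

```latex
\documentclass{article}
\usepackage{fontspec,pdfcomment}
\begin{document}
\makeatletter
% \checksoulglyph is a hypothetical helper, not part of soul or pdfcomment.
% It reports in the log whether the font selected by \SOUL@tt has a glyph for #1.
\newcommand*\checksoulglyph[1]{%
  \begingroup
    \SOUL@tt % select the font soul uses for its measurements
    \iffontchar\font`#1 %
      \typeout{SOUL@tt: glyph for #1 found}%
    \else
      \typeout{SOUL@tt: glyph for #1 MISSING}%
    \fi
  \endgroup}

\checksoulglyph{ě}      % with soul's default \SOUL@tt
\let\SOUL@tt\ttfamily   % the workaround from above
\checksoulglyph{ě}      % with \SOUL@tt pointing at the fontspec typewriter font
\makeatother

Běžně.
\end{document}
```

The test selects the font inside a group and queries \font rather than passing \SOUL@tt to \iffontchar directly, because after the \let assignment \SOUL@tt is a font-switching command and no longer a font identifier.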
Symbol 1
  • Oh, yes! I forgot about this one! Probably soul should check for fontspec and then do that assignment itself. – egreg Mar 09 '15 at 10:21