I recently moved from LyX to TeXstudio and am using XeLaTeX to generate PDFs. I cannot use pdfLaTeX because I need the fontspec package. I rewrote my resume and went to apply for some positions. To my surprise, CV parsers did not pick up hyphen characters ("-"). So, I began investigating...
Here is a minimal reproducible example:
\documentclass[letterpaper]{article}
\usepackage[left=0.4in,top=0.4in,right=0.4in,bottom=0.4in]{geometry}
\usepackage{enumitem}
\usepackage{fontspec}
\usepackage{ulem}
\usepackage{xstring}
\usepackage{ifthen}
\usepackage[none]{hyphenat}
\pagenumbering{gobble}
\setmainfont{Times New Roman}
\setlength\parindent{0pt}
\begin{document}
\end{document}
In Adobe Acrobat DC, Chrome, and Xournal++, the PDF looks fine.
I tried copying/pasting a hyphen from the generated PDF, but nothing seemed to be copied to the clipboard. So I quickly wrote an application using Apache PDFBox to list all Unicode characters in the extracted text. To my amazement, the hyphens in the XeLaTeX output are not extracted as Unicode Hyphen-Minus (U+002D) but as Soft Hyphen (U+00AD).
Also, if I copy a block of text in the PDF, sometimes spaces are pasted as newlines.
Note: I am pasting into plaintext areas.
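To make the check above easier to reproduce, here is a minimal sketch of the same idea in Python. It is not the PDFBox application mentioned above. It assumes the text has already been extracted or copied out of the PDF by some other means, and the `list_codepoints` helper and the `sample` string are hypothetical names and data used only for illustration.

```python
import unicodedata
from collections import Counter


def list_codepoints(text):
    """Return (code point, Unicode name, count) for every non-alphanumeric character."""
    counts = Counter(ch for ch in text if not ch.isalnum())
    rows = []
    for ch, n in sorted(counts.items()):
        # Control characters such as newline have no Unicode name, so use a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", name, n))
    return rows


# Hypothetical sample standing in for text pasted from the PDF:
# soft hyphens where hyphens were expected, and a newline where a space was expected.
sample = "state\u00adof\u00adthe\u00adart design\nand a-b"

for label, name, n in list_codepoints(sample):
    print(label, name, n)
```

Run on real pasted text, a listing like this makes it easy to see whether a hyphen arrived as U+002D, U+00AD, or something else, and whether a space arrived as U+0020 or U+000A.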
So, my questions are:
- Why is XeLaTeX using Soft Hyphens?
- How do I configure XeLaTeX to use Hyphen-Minus?
- Why do some spaces act like newlines when copied/pasted into plaintext?
Thanks to all.
a-b inserts a U+002D into the PDF and it copies fine. You will have to show a small example that demonstrates your issue. – Ulrike Fischer Oct 10 '20 at 08:34
U+2010 for hyphens. It bugs me too, since I cannot search terms in my PDF viewer. – Lei Zhao Dec 11 '20 at 17:09