When I paste text from pdflatex output, there are two things about hyphens that I would like to change:
- Hyphens paste as "hyphen-minus" (U+002D, "-"), but I would prefer the newer Unicode character "hyphen" (U+2010, "‐"), used for textual hyphens.
- Line-breaking hyphens always disappear. This is good if there was no hyphen in the input, but when a word such as "ice-cream" breaks over a line, it would be nice to keep the hyphen.
Note that line-final en-dashes paste correctly. Here is some code illustrating the issue:
\documentclass{article}
\input glyphtounicode
\pdfgentounicode=1
\usepackage[T1]{fontenc}
\begin{document}
Hello World, this is dummy text intended to cause a line break: I love line-breaking. Now a long word without an original hyphen: antidisestablishmentarianism. Some dummy text for getting the desired line breaks; some dummy text for getting the desired line breaks.
Compound adjective (with en-dash): World\ War\ II--related, World\ War\ II--related, World\ War\ II--related.
\end{document}
With this code, the pasting behavior is as follows: the hyphen in "line-breaking" disappears (not desired), the hyphen inserted where "antidisestablishmentarianism" breaks across lines disappears (desired), and the en-dash remains (desired). (Just in case it matters, I'm using Adobe Reader X (version 10.1.4) on Windows 7.)
Is there an easy way to address these two points? An ideal solution would not rely on new commands (say, via accsupp, as great as this package is) but would instead change how (La)TeX handles - in its input source code. Ideally, a solution would also apply the specialized hyphen character (U+2010) conservatively: for example, hyphens in URLs should ordinarily remain simple hyphen-minuses. (Yep, I know that all this might be quite tricky to implement.)
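For the first point, one direction that might be worth trying is to override the glyph-to-Unicode mapping that glyphtounicode sets up. The following is only a sketch, not a verified solution: it assumes the font's hyphen glyph is named "hyphen" (as in the usual Type 1 fonts) and that a later \pdfglyphtounicode entry replaces the default one. It is also not conservative, since it would affect every such glyph, including hyphens in URLs, and it does nothing about the second point.

```latex
\documentclass{article}
\input glyphtounicode
% Assumption: the hyphen glyph is named "hyphen" in the font's encoding;
% this later entry is meant to replace the default mapping to U+002D.
\pdfglyphtounicode{hyphen}{2010}
\pdfgentounicode=1
\usepackage[T1]{fontenc}
\begin{document}
I love line-breaking and ice-cream.
\end{document}
```

Whether a PDF viewer then keeps a line-final U+2010 when copying presumably depends on the viewer's own heuristics.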
See also my related question here.
hyphen-minus(u+002D) and the en-dash is copied as it is (u+2013) – henrique Sep 22 '12 at 01:09
-an active character, then you can redefine its behavior to whatever you want. It might require some TeX wizardry to code all of this correctly. http://tex.stackexchange.com/questions/5737/changing-to-textendash – krlmlr Oct 18 '12 at 21:22