
I have the following minimal working example:

\documentclass{book}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

\begin{document}

é canção

\end{document}

I need the utf8 option so that I can type special characters directly in the source, and I need the T1 option so that those characters are copied correctly from the PDF.

The problem is that when I copy the text from the PDF reader (I am using Foxit Reader), the space is not copied along with the text: I get écanção, with no space between the two words.

How can I solve this?

Weslei
  • This is down to the PDF viewer: TeX does not use 'space' characters for spacing, so they have to be interpreted as such by the program showing the PDF. – Joseph Wright Sep 10 '11 at 20:51
  • @Joseph so does that make TeX created documents problematic w.r.t. accessibility for the blind, for example? – Alan Munn Sep 10 '11 at 20:55
  • @Joseph You are right: I tried another example (without loading inputenc) and Foxit Reader still dropped the spaces when copying the text. However, even if this is a problem with the PDF viewer, isn't there a workaround? – Weslei Sep 10 '11 at 20:59
  • @Alan: The 'space' character has a fixed width - not just an issue for TeX, but for anything that is typesetting as opposed to writing text. Many viewers (most obviously Adobe Reader) handle this quite well. There is the 'ActualText' feature in the PDF spec, but I'm not sure how easy it would be to set up all spaces to use it. – Joseph Wright Sep 10 '11 at 21:06
  • @Weslei There's a similar question in SuperUser that might help you: http://superuser.com/questions/50496/copying-from-foxit-reader-makes-spaces-disappear – Paulo Cereda Sep 10 '11 at 21:07
  • @Paulo Thanks for the question link, but my problem is that the document is intended for a large audience. Depending on their PDF reader, it would be difficult for them to extract parts of the document. – Weslei Sep 10 '11 at 21:13
  • Maybe you can try CAJviewer! I used it when I ran into the same problem and found that it handles this issue better than Adobe Reader. –  Apr 05 '13 at 02:47

1 Answer


The original purpose of PDF was to represent printed documents, and there was no explicit way of representing a space character. With more recent developments around PDF, people are interested in things like automatic reflow for small screens, structural information for document processing, and interfacing to screen readers for people with visual impairment. Because of this, it is now possible to represent the spaces explicitly (I believe it is even a requirement for some grade of PDF/A compliance). There is a patch for pdftex here, after which I believe you are supposed to add the following to your tex file:

\pdfmapline{+dummy-space <dummy-space.pfb}
\pdfgeninterwordspace

I don't know whether the patch still applies (the bug tracker claims it has been replaced by a branch, probably this one), and I haven't tested whether it actually solves the problem.
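
For concreteness, here is a sketch of where those two lines might go in the example from the question. It is untested and rests on two assumptions: that the document is compiled with a pdftex binary built with the patch (an unpatched pdftex would report \pdfgeninterwordspace as an undefined control sequence), and that a font file named dummy-space.pfb is installed where pdftex can find it.

```latex
\documentclass{book}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

% Assumption: dummy-space.pfb is available to pdftex; this map line
% registers it as the font used for the inserted space characters.
\pdfmapline{+dummy-space <dummy-space.pfb}
% Assumption: this primitive only exists in a pdftex built with the patch.
% It asks pdftex to write an explicit space between words into the PDF.
\pdfgeninterwordspace

\begin{document}

é canção

\end{document}
```

If it works as intended, the typeset page should look the same, and copying the line from the PDF should give the two words with a space between them, whichever viewer the reader uses.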

Lev Bishop