
I'm typesetting a book in French, and I noticed that diacritics break the search function of Adobe Reader. For example, if I write "Numérotation" in my LaTeX document, Adobe Reader won't find any matches for either "Numerotation" or "Numérotation", since the accent is written to the PDF separately from the "e".

Similarly, if I write "Affect", Adobe Reader won't find any matches for "Affect": the two "f"s are combined into a single "ff" ligature character.

How can I ensure that the PDFs generated by pdflatex are fully searchable?

EDIT: Some clarifications:

Thanks for the answers. I currently have two packages loaded:

\usepackage[T1]{fontenc}
\usepackage{ae,aecompl,aeguill}

Adding \usepackage{cmap} doesn't help. Removing ae does help, but then the fonts are bitmap ones and look terrible.

Final note: removing ae doesn't solve the double-f (ff) problem.

Clément

2 Answers


Fonts in OT1 encoding build accented characters by combining two glyphs (letter plus accent), so the search function won't recognize them as single accented characters. If you have the CM-Super fonts, then

\usepackage[T1]{fontenc}

will suffice. If you don't have the CM-Super fonts and are stuck with the bitmaps generated by Metafont, then

\usepackage{cmap}

loaded before the \usepackage[T1]{fontenc} declaration will do. The Latin Modern fonts also have the correct CMap entries in the font files, so the glyphs will be recognized by the PDF reader (and they are Type1 fonts). Most font families should give no problems, provided they have Type1 font files and T1 encoding is used.
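
For example, a minimal document along these lines (a sketch, assuming pdflatex and that the lmodern package is installed; Latin Modern is one possible Type1 replacement for the ae fonts) should give a fully searchable PDF:

\documentclass{article}
%\usepackage{cmap}          % only needed if stuck with bitmap CM fonts; load before fontenc
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{lmodern}        % Type1 fonts with correct CMap entries
\usepackage[french]{babel}
\begin{document}
Numérotation and Affect should be searchable as typed.
\end{document}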

Other encodings such as T2x (x can be A, B, or C) can suffer from the same problem and the cure is the same: declaring the encoding and ensuring that the font files are Type1. No problem (at least in the majority of cases) when OpenType fonts are used with XeLaTeX or LuaLaTeX.
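
For the T2x case the recipe looks the same; here is a minimal sketch for a Russian document, assuming pdflatex with the CM-Super fonts and Russian babel support installed:

\documentclass{article}
\usepackage[T2A]{fontenc}   % Cyrillic encoding; CM-Super supplies the Type1 glyphs
\usepackage[utf8]{inputenc}
\usepackage[russian]{babel}
\begin{document}
Пример: searching for Cyrillic words in the PDF should work.
\end{document}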

egreg

Since nobody has mentioned it yet: using XeLaTeX or LuaLaTeX properly would also solve this, since you would then be using Unicode throughout. I sincerely recommend this.
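
A minimal sketch, assuming a reasonably recent TeX distribution with the fontspec package; compile with lualatex or xelatex and the resulting PDF uses Unicode, so no extra packages are needed for searching:

\documentclass{article}
\usepackage{fontspec}   % loads Latin Modern (OpenType) by default
\begin{document}
Numérotation and Affect are searchable as typed.
\end{document}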