
I'm typesetting a book in French, and I noticed that diacritics break the search function of Adobe Reader. For example, if I write "Numérotation" in my LaTeX document, Adobe Reader won't find any matches for either "Numerotation" or "Numérotation", since the accent is written to the PDF separately from the "e".

Similarly, if I write "Affect", Adobe Reader won't find any matches for "Affect": the two "f"s are combined into a single "ff" ligature character.

How can I ensure that the PDFs generated by pdflatex are fully searchable?

EDIT: Some clarifications:

Thanks for the answers. I currently have two packages loaded:

\usepackage[T1]{fontenc}
\usepackage{ae,aecompl,aeguill}

Adding \usepackage{cmap} doesn't help. Removing ae does help, but then the fonts are bitmap ones and look terrible.

Final note: removing ae doesn't solve the double-f (ff) problem.

Clément

2 Answers


Fonts in OT1 encoding build accented characters by combining two glyphs (letter plus accent), so the search function won't recognize them as single accented characters. If you have the CM-Super fonts, then

\usepackage[T1]{fontenc}

will suffice. If you don't have the CM-Super fonts and are stuck with the bitmaps generated by Metafont, then

\usepackage{cmap}

loaded before the \usepackage[T1]{fontenc} declaration will do. The Latin Modern fonts also have the correct CMap entries in the font files, so the glyphs will be recognized by the PDF reader (and they are Type1 fonts). Most font families should give no problems, provided they have Type1 font files and T1 encoding is used.
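
For example, a minimal document along these lines (a sketch, assuming pdflatex and that the lmodern package is installed; Latin Modern is one possible Type1 replacement for the ae fonts) should give a fully searchable PDF:

\documentclass{article}
%\usepackage{cmap}          % only needed if stuck with bitmap CM fonts; load before fontenc
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{lmodern}        % Type1 fonts with correct CMap entries
\usepackage[french]{babel}
\begin{document}
Numérotation and Affect should be searchable as typed.
\end{document}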

Other encodings such as T2x (x can be A, B, or C) can suffer from the same problem and the cure is the same: declaring the encoding and ensuring that the font files are Type1. No problem (at least in the majority of cases) when OpenType fonts are used with XeLaTeX or LuaLaTeX.
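
For the T2x case the recipe looks the same; here is a minimal sketch for a Russian document, assuming pdflatex with the CM-Super fonts and Russian babel support installed:

\documentclass{article}
\usepackage[T2A]{fontenc}   % Cyrillic encoding; CM-Super supplies the Type1 glyphs
\usepackage[utf8]{inputenc}
\usepackage[russian]{babel}
\begin{document}
Пример: searching for Cyrillic words in the PDF should work.
\end{document}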

egreg

Since nobody has mentioned it yet: using XeLaTeX or LuaLaTeX properly would also solve this, since you would then be using Unicode throughout. I sincerely recommend this.
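
A minimal sketch, assuming a reasonably recent TeX distribution with the fontspec package; compile with lualatex or xelatex and the resulting PDF uses Unicode, so no extra packages are needed for searching:

\documentclass{article}
\usepackage{fontspec}   % loads Latin Modern (OpenType) by default
\begin{document}
Numérotation and Affect are searchable as typed.
\end{document}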