11

When I run pdflatex on text containing "fi", e.g. "infinite", the resulting PDF file looks correct, i.e. "infinite" is shown in the text. But when I search the PDF file for "infinite", 0 occurrences are found. When I copy "infinite" and paste it, the result is "innite". Searching for "innite" finds 0 occurrences as well.

What causes this behavior? Is my font unable to handle ligatures? How can I fix it, so that "fi" is treated the same as any other letter combination?

doncherry
DaveBall aka user750378
  • What font are you using in which encoding? Try \usepackage[T1]{fontenc}\usepackage{lmodern}. – Martin Scharrer May 30 '12 at 11:41
  • I'm using komascript and the default font (I don't load any package other than komascript that messes with font settings). – DaveBall aka user750378 May 30 '12 at 11:43
  • @Martin: thanks, the result looks identical (i.e. I don't see any change in the font), but the "fi" problem is gone. Could you write your comment as an answer? Then you could maybe say what the lmodern package is, and why \usepackage[T1]{fontenc} alone is not sufficient. And I can accept your answer, to help later readers of this question. – DaveBall aka user750378 May 30 '12 at 11:47

2 Answers

11

LaTeX uses ligatures in the text, and PDF readers treat each one as a single, unknown character. As far as I remember, I solved the same problem by adding the following two lines to the preamble:

\input{glyphtounicode}
\pdfgentounicode=1

I found this solution in the MinionPro manual, page 7.

I use utf8 encoding in my document, i.e.

\usepackage[utf8]{inputenx}
\usepackage[T1]{fontenc}

Also have a look at Ulrike Fischer’s answer to a similar question regarding Linux Libertine.

glyphtounicode was included in the MiKTeX distribution I use, but if it is not included in yours, you can find it at Sarovar.
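
For later readers, here is a minimal sketch of where the two lines go in a complete document. It assumes the file is compiled with pdflatex, since \pdfgentounicode is a pdfTeX primitive. The class scrartcl is only chosen because the question mentions KOMA-Script, and the sample sentence is made up to contain a few ligatures.

```latex
\documentclass{scrartcl}

% Load the table that maps glyph names (such as "fi") to Unicode characters
\input{glyphtounicode}
% Ask pdfTeX to write ToUnicode maps into the PDF, so viewers can
% translate ligature glyphs back into plain letters for search and copy
\pdfgentounicode=1

\begin{document}
An infinite number of office files.
\end{document}
```

After compiling, searching the PDF for "infinite" or copying the sentence into a text editor should show whether the ligatures are mapped back to "fi", "ffi" and so on.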

Sveinung
  • 3
    Using only \input{glyphtounicode}\pdfgentounicode=1, without \usepackage[utf8]{inputenx}\usepackage[T1]{fontenc}, is sufficient to solve the problem. What's the advantage of using glyphtounicode compared to fontenc (and lmodern as mentioned by Martin)? – DaveBall aka user750378 May 30 '12 at 11:54
  • 1
    I don't know if there is any advantage. My TeX skills are limited; I am at the level of ‘find someone else’s solution to a problem and copy it’. – Sveinung May 30 '12 at 12:05
  • 2
    @Sveinung, perhaps you should add the link to the glyphtounicode file here as well? – henrique May 30 '12 at 13:15
5

This problem can AFAIK be solved by using the Latin Modern font in T1 encoding:

\usepackage[T1]{fontenc}
\usepackage{lmodern}

I guess this lets the PDF tell the viewer which real letters the ligatures represent.
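
As a minimal sketch, the two lines above could sit in a complete test document like the one below. The class scrartcl is an assumption based on the question mentioning KOMA-Script, and the sample sentence is made up. The order of the two packages should not matter here.

```latex
\documentclass{scrartcl}

% Select the T1 font encoding instead of the default OT1
\usepackage[T1]{fontenc}
% Use the Latin Modern fonts, which look very close to the default font
\usepackage{lmodern}

\begin{document}
An infinite number of office files.
\end{document}
```

Compile with pdflatex and then search the PDF for "infinite" to check that the word is found.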

Martin Scharrer