
As a native English speaker, I've mostly been allowed the luxury of pretending that ASCII is enough, and have been able to treat font encodings as not my problem. I've seen lots of advice that I ought to include \usepackage[T1]{fontenc} in my preamble. (See, for instance, p. 337 of The LaTeX Companion.) What isn't really adequately explained is why I ought to do so.

So, if I am writing in English and have to make only occasional use of things like G\"odel, what does following the advice really get me?

lockstep
vanden

2 Answers


Note that this is font encoding (determines what kind of font is used), not input encoding.

The default font encoding (OT1) of TeX is 7-bit and uses fonts that have 128 glyphs, and so do not include the accented characters as individual glyphs. So a letter ö is made by adding an accent to the existing 'o' glyph.

The T1 font encoding is an 8-bit encoding and uses fonts that have 256 glyphs. So an 'ö' is an actual single glyph in the font. Many of the older fonts have had T1 variants devised for them as well, and many newer fonts are available only in T1. I think "Computer Modern" was originally OT1, and "Latin Modern" is T1. (Look at OT1 font encoding and T1 font encoding.)
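As a minimal sketch of what selecting T1 looks like in practice (the `lmodern` line is optional and assumes that package is installed; any other font package with T1 support would serve the same purpose):

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % select the 8-bit T1 font encoding instead of the default OT1
\usepackage{lmodern}     % optional: Latin Modern, which is available in T1
\begin{document}
% Under T1 the \"o below is typeset as the single glyph ö;
% under OT1 it is built from an o plus a separate accent.
G\"odel
\end{document}
```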

If you don't use \usepackage[T1]{fontenc},

praseodym
ShreevatsaR
  • 11
    Yes, hyphenation for languages with accented characters is the main reason that requires T1 font encoding. – Jukka Suomela Jul 30 '10 at 18:58
  • 8
    Yep, definitely load lmodern (or some other font package) when you use T1. – Will Robertson Jul 31 '10 at 02:37
  • 19
    The bigger problem is that without T1 you cannot copy/paste a name with a non-ascii glyph without getting them split up into their components. – Christopher Oezbek Aug 13 '10 at 16:43
  • @Christopher: You will have the same problem even if you use T1 in combination with something like ae. – Jukka Suomela Aug 15 '10 at 19:18
  • @Jukka: No, T1 does contain the \ae (æ) character, and I just checked that I can copy it from PDF if I use T1. Of course, the set of glyphs in T1 is finite too, and if you encounter a glyph not covered, you would have the problem. – ShreevatsaR Aug 15 '10 at 20:05
  • @ShreevatsaR: I was referring to the ae package, not \ae character. T1 fontenc + ae package + accented characters: hyphenation works ok, but you will have copy-paste problems. – Jukka Suomela Aug 15 '10 at 21:09
  • 5
    @Jukka: I am not sure why you would still use the ae package. – Christopher Oezbek Aug 22 '10 at 10:04
  • @Christopher: Looks good on screen (proper hinting), unlike the alternatives. – Jukka Suomela Aug 22 '10 at 15:41
  • 8
    So, what about fonts having >256 glyphs? There must exist some, right? – letmaik Oct 09 '12 at 18:50
  • 2
    @letmaik you need truetype or opentype fonts for that. They support 16 bit (65000+). – jiggunjer Jul 25 '16 at 11:51
  • 2
    Thanks so much for pointing out that without \usepackage[T1]{fontenc}, Umlauts and accented characters cannot be copied properly from the output! This has been bothering me for ages! – Janosh Sep 17 '17 at 10:33
  • 3
    @letmaik (Answering 5 years later…) In non-Unicode-aware TeX engines (Knuth TeX and pdfTeX) there are no fonts with >256 glyphs. The output can have any number of glyphs of course, but that's achieved via a combination of fonts, at most 256 glyphs to a font. T1 is merely an encoding for Latin script plus some accents, which happens to cover many European languages. Math fonts will use their own different encoding (not T1) of where mathematical symbols should go, for example. The Unicode-aware engines are XeTeX and LuaTeX, and with them you can use Unicode (OpenType) fonts with 1000s of glyphs. – ShreevatsaR Sep 17 '17 at 19:02
  • 4
    You should also choose the font encoding based on the language of your text. As pointed out here: "The T1 encoding contains letters and punctuation characters for most of the European languages using Latin script. For languages using Cyrillic script you can use T2A, T2B, T2C, or X2 font encodings." [Overleaf] – LEo May 12 '20 at 17:59

In addition to the reasons listed by @ShreevatsaR why the T1 font encoding is advisable even when writing (primarily) in the English language, there are two more reasons that were missing from his list:

  • TeX is only able to apply ligatures and kerning between characters when these characters are real glyphs from the same font. In OT1 (with 128 glyphs) you have more or less only the ASCII characters, and accented letters are not available as single glyphs.

  • Searching the output does not work whenever an accented character is used, because that character ends up as a complicated box structure in the output rather than as a single character.

So, to stay with your example of the occasional G\"odel: if the font designer has decided that the shape of the G calls for some kerning against a following o or ö, then this can be specified in T1 but not in OT1 (not for the ö, at least, as that is not a single glyph in that font encoding). And fonts contain a lot of kerning adjustments between characters.

The second point means that if somebody searches through your papers (put up on the web as PDFs, for example) for the name Gödel, the name won't be found.

So, in short, T1 fonts simply give slightly better output whenever even a single accented character is used: kerning is better, hyphenation still works, cut-and-paste from the output still works, and searching in the output works properly as well.
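One rough way to check these points on your own installation is to compile a small test file twice, once as written and once with the `fontenc` line commented out, and then compare the results. The word "G\"odelization" is only a made-up test word for illustration, and what you see will depend on the fonts and hyphenation patterns you have installed:

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % comment this line out to fall back to OT1
\begin{document}
% Reports the hyphenation points TeX finds for this word in the log file.
\showhyphens{G\"odelization}

% Search the resulting PDF for this name, or copy and paste it elsewhere.
G\"odel
\end{document}
```

Comparing the two log files and the two PDFs should show whether hyphenation, searching, and copy-and-paste behave differently between the encodings on your setup.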

  • 1
    Do you agree that we have to load lmodern when using T1? – Sigur Nov 24 '18 at 23:58
  • 5
    @Sigur you don't have to; depending on your installation Computer Modern has T1 encoded fonts too. But lmodern is a good replacement so you can do that for sure. – Frank Mittelbach Nov 25 '18 at 07:56