Which is the better way to do accents in LaTeX?

Question

When you need to use accents in LaTeX, for example to write the word "Cálculo" (which means calculus in Portuguese), you can do it in the following ways:

write c\'alculo
use the package \usepackage[utf8]{inputenc}, and just write "cálculo"

I was just using the second way, but my professor said that the better way to write accents is the first way. I couldn't find a good reason for it on Google. Is there any good reason for that?

I have been using your second way for many years and I never ran into a problem because of that. — José Carlos Santos, May 19 '20 at 18:48
You don't even have to load [utf8]{inputenc} if your LaTeX version is no older than April 2018, since UTF-8 is now the default encoding. But you still have to load the T1 fontenc. — Bernard, May 19 '20 at 18:53
Anyone using an older editor will prefer the first option. Some are so used to the shortcuts of old editors such that they do not want to switch. I am one of those. So if you are asked to use the old-fashioned way in a joint project, please consider doing that. One may also have concerns with regards to the upload to the arXiv if one relies on very recent changes of what gets loaded automatically. — , May 19 '20 at 18:54
If you’re not submitting to a grader or reviewer who requires you to use legacy fonts, I’d say the better way is to use luatex or xetex and type ś like you normally would. — Davislor, May 19 '20 at 23:22

Mico · Answer 1 · 2020-05-22T10:50:43.017

The only sane reason I can think of for giving in to your professor's preference concerns bibliographies and, specifically, bibliographies created with BibTeX and a bibliography style that sorts entries alphabetically by authors' surnames. (Aside: The issues raised in the remainder of this answer do not pertain to bibliographies generated with biblatex+biber.)

For deeply historical reasons, BibTeX does not sort "accented" characters such as á, ä, à, and â that may occur in the author and editor fields together with the letter A; instead, they are treated (for sorting purposes only) as coming after the letter Z. This affects how pieces authored by, say, Rädermacher, Ràdon, Rámos, and Râmuz [with some deliberate misspellings -- sorry!!] are sorted relative to pieces written by, say, Randall and Rybczynski. Do you want these entries to be placed before Randall or after Rybczynski? You may expect the former to happen, but BibTeX delivers the latter outcome.

Here's another example: Suppose your bibliography contains single-authored entries by Hasbrouck, Haščič, Hase, and Hayworth. Would you expect Haščič's publication to be listed before Hase's -- or after Hayworth's? If you enter the author's name as Haščič instead of as Ha{\v s}{\v c}i{\v c}, BibTeX delivers the second option.

For more information on how to enter accented characters in BibTeX entries while avoiding the sorting-related issues noted above, please see the posting How to write “ä” and other umlauts and accented letters in bibliography? [Shameless self-citation alert!]

Well, I can think of a second reason: If your computer keyboard does not provide a straightforward method for entering certain accented characters -- not just á, ä, and â but also, say, angstroms, ogoneks, and thorns [!] -- it's obviously very nice to know that you can enter them as \'a, \"a, \^a, etc. as well.
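
As a rough sketch of what such keyboard-independent input can look like, the following small document uses only ASCII accent commands. The sample words are purely illustrative, and loading T1 is my assumption for pdfLaTeX, since \k, \th and \TH are not available in the default OT1 encoding.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % \k, \th and \TH need the T1 font encoding under pdfLaTeX
\begin{document}
c\'alculo, R\"adermacher, R\`adon, R\^amuz \par % acute, umlaut, grave, circumflex
\aa ngstr\"om, \AA \par                          % a with ring, lower and upper case
\k{a}, \k{e} \par                                % ogonek
\th, \TH \par                                    % thorn, lower and upper case
Ha\v{s}\v{c}i\v{c}                               % caron
\end{document}
```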

A separate comment: If you compile your document with pdfLaTeX and if you input accented characters directly into your document, I will assume that you also load the fontenc package with the T1 option. If you compile your document with XeLaTeX or LuaLaTeX, there's no need to load the fontenc package.

Addendum: The following MWE, compiled with pdfLaTeX on a MacTeX2020 system, demonstrates that BibTeX places entries by Rädermacher, Ràdon, Rámos, and Râmuz (note the accented characters) after Rybczynski rather than before Randall. Ouch!! Hence, in order to obtain what most people would think is the "correct" sorting outcome, it is necessary to enter these names as R{\"a}dermacher, R{\`a}don, R{\'a}mos, and R{\^a}muz if and when they occur in the author or editor fields of BibTeX entries.

    \documentclass{article}
    \begin{filecontents}[overwrite]{mybib.bib}
    @misc{r1,author="Randall",year=3000,title="Thoughts"}
    @misc{r2,author="Rädermacher",year=3000,title="Thoughts"}
    @misc{r3,author="Ràdon",year=3000,title="Thoughts"}
    @misc{r4,author="Rámos",year=3000,title="Thoughts"}
    @misc{r5,author="Râmuz",year=3000,title="Thoughts"}
    @misc{r6,author="Rybczynski",year=3000,title="Thoughts"}
    \end{filecontents}
    \usepackage[T1]{fontenc} % useful under pdfLaTeX
    \usepackage[authoryear]{natbib}
    \bibliographystyle{plainnat} % use a bib style that sorts entries alphabetically
    \setlength\bibsep{0pt} % optional 
    \begin{document}
    \nocite{*}
    \bibliography{mybib}
    \end{document}
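
For comparison, here is a sketch of how the filecontents block of the MWE might look with the brace-escaped forms recommended in the addendum. Switching the field delimiters from quotes to braces is my own choice, made so that \" cannot be mistaken for the end of a field. The rest of the MWE would stay unchanged, and the four accented names should then be sorted before Randall.

```latex
\begin{filecontents}[overwrite]{mybib.bib}
% Accented letters written as {\<accent> <letter>} so that BibTeX
% ignores the accent when building the sort key.
@misc{r1,author={Randall},year=3000,title={Thoughts}}
@misc{r2,author={R{\"a}dermacher},year=3000,title={Thoughts}}
@misc{r3,author={R{\`a}don},year=3000,title={Thoughts}}
@misc{r4,author={R{\'a}mos},year=3000,title={Thoughts}}
@misc{r5,author={R{\^a}muz},year=3000,title={Thoughts}}
@misc{r6,author={Rybczynski},year=3000,title={Thoughts}}
\end{filecontents}
```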

A small note on sorting: Having umlauts or accented characters come last can be correct, depending on the language. Different languages have different rules for collation which are implemented correctly in unicode-native engines. Similarly, pdfLaTeX might have similar issues as bibtex has, when doing things like a sorted glossary. — ljrk, May 21 '20 at 11:37
@larkey - I have zero experience creating glossaries in a LaTeX document. If pdfLaTeX gets tripped up by accented characters in glossary entries, it would be very useful to mention that in a separate answer. — Mico, May 21 '20 at 12:48

Fran · Answer 2 · 2020-05-20T01:20:22.750

Actually, you need neither to type \'a nor to load \usepackage[utf8]{inputenc}, since UTF-8 is now the default input encoding, so simply:

\documentclass{article}
\begin{document}
Cálculo
\end{document}

Some time ago, when people used a myriad of encodings depending on the OS and the language, a good reason to type \'a was that it is rendered as "á" also with the latin1 or cp437 encodings, for example, whereas á produces "Ã¡" with the latin1 encoding or an error with the cp437 encoding. Today, when almost everybody uses only utf8, this is no longer an advantage for sharing LaTeX sources.

As Mico rightly pointed out, when using bibtex, composed characters in authors' names will be wrongly sorted in some languages.

Having the wrong keyboard is also a good reason, and there are some others. There are also limitations in math mode: you cannot use $á$ (nor $\'{a}$) but must use $\acute{a}$, and sometimes I have had to avoid composed characters in other situations, when using this or that command of the package <whatever>. That I cannot remember a concrete scenario shows that this is a rare situation (or that I have a very bad memory).
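
To make the math-mode point concrete, here is a minimal sketch contrasting text accents with their math counterparts. The formula and the subscript label are only illustrative, and using \text from amsmath for accented words inside a formula is my assumption about a convenient workaround, not something the limitation itself dictates.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \text for text material inside math
\begin{document}
Text mode: \'a, \`a, \^a, \"a, \~a.

% Math mode has its own accent commands.
Math mode: $\acute{a}$, $\grave{a}$, $\hat{a}$, $\ddot{a}$, $\tilde{a}$.

% Accented words inside a formula can be wrapped in \text.
Text inside math: $v_{\text{média}} = \Delta s / \Delta t$.
\end{document}
```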

But none of this obscures the fact that writing, reading, or spell-checking a long text with escaped accents is a pain. So, whenever you can, type á and leave \'{a} as plan B.

answered May 19 '20 at 18:46

\usepackage[T1]{fontenc} is still necessary, as far as I know. – Bernard May 19 '20 at 18:55
@Bernard In many cases, of course, like many other packages depending on the contents, but not for this MWE. – Fran May 19 '20 at 18:59
Even if à exists in OT1 encoding, I suspect other accents might have problems, and anyway, it won't be good for hyphenation… – Bernard May 19 '20 at 19:02
T1 would be required if you had more than one word (otherwise you'll get incorrect hyphenation), but you'd also need to specify the language via babel, of course, to get correct hyphenation. The question is about input, so OK, but you should probably mention that UTF-8 is only the default in recent LaTeX, so not, for example, at arXiv. On this keyboard it is much easier to type \'a than á, so saying that using the command forms is torture is overstating the case, though. – David Carlisle May 19 '20 at 19:04
@Bernard Worth the precision, but by the same rule I should have shown the example with microtype, amsmath and so on. The MWE was meant to show just the opposite: that today nothing is necessary in the preamble to type the accents, and that it works with à, á, â, ä and other symbols such as ñ, ç, µ, €, æ ... In fact, if you write in Portuguese, Spanish or French without loading T1, it could take time to notice that some characters are not available, such as the Spanish/French guillemets («»), for example. – Fran May 19 '20 at 19:34
@Mico: The BibTeX problem is solved with biblatex+biber… – Bernard May 19 '20 at 19:38
@Bernard - Are you serious or just flippant? As far as I can tell, your comment is a classic example of simply assuming away the problem. There are, in fact, some very good reasons for why people must use BibTeX and cannot use biblatex. For instance, if a journal is set up to process submissions that use BibTeX but not biblatex -- and you and I know that there are many such journals... -- then telling the OP that the problem is solved with biblatex+biber is the very opposite of being helpful. – Mico May 19 '20 at 19:51
@Mico, bibtex today also allows utf8 encoding, and you can type authors such as João de Santarém without problems. In fact, sometimes databases try to solve this old problem with entries like `Jo~{a}o de Satar{'e}` but do it wrongly, causing more problems than using composed characters. – Fran May 19 '20 at 19:54
@Mico Yeah, I did not remember the sorting problem with bibtex, and the funny thing is that years ago I learned this the hard way with an important document. :( – Fran May 19 '20 at 20:53
@Mico: I was just assuming the O.P. has these problems for a thesis, and in this case, you can use the bibliography engine you please, as far as I know. – Bernard May 19 '20 at 21:10
-1: the first paragraph is not necessarily true. Standard updated Windows 10, latest version of pdflatex. – Martin Argerami May 20 '20 at 16:29
UTF input encoding is now default in TeXLive 2020, but there may be people still using TeXLive 2017 that comes with Ubuntu 18.04, which does not use UTF by default. – Max Xiong May 20 '20 at 22:01
@MartinArgerami I can also produce this kind of output on my Linux system by saving the MWE with the wrong encoding. As you can see in your screenshot, the error message comes from inputenc, so the package was loaded anyway. It seems that you saved the file accent.txt with a Windows console that, afaik, does not use UTF-8 by default. The default \usepackage[utf8]{inputenc} does not convert a file to UTF-8 format; it only informs LaTeX that the encoding is (supposedly) UTF-8. It is up to you to ensure that this is true, or otherwise to tell LaTeX what the encoding really is. – Fran May 21 '20 at 02:13
@Fran: reality is that non-utf8 encoding is still pervasive (not that I like it), so what you wrote in your first paragraph does not apply in general. Using the latin1 option with inputenc solves the problem. And the "old fashioned" way with escaped accents will work with any encoding. So, the (sad, if you want) reality is that both options proposed by the OP work in any situation, while yours doesn't. – Martin Argerami May 21 '20 at 02:32
@MartinArgerami My answer assumes the obvious fact that the OP is already using the UTF-8 encoding. Your problem (using another encoding) was not the issue. And since we are being picky, it is not true that "both options proposed by the OP work in any situation" (\usepackage[utf8]{inputenc} obviously does not work in any situation). The other option does, I admit, but I had already explained that, right? – Fran May 21 '20 at 03:19
@MaxXiong I hope that the long-term support (ten years) for Ubuntu 18.04 does not mean maintaining TeXLive 2017 until 2028. – Fran May 21 '20 at 03:32

ljrk · Answer 3 · 2020-05-21T17:28:27.360

Unicode-Native (La)TeX Engines

Some questions here assume LaTeX = pdfLaTeX; however, there are other LaTeX engines out there with native Unicode support, namely LuaLaTeX and XeLaTeX. If you are free to choose, you can simply use these, which removes many headaches, especially when using accents in listings and the like: they simply work.
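
A minimal sketch of what this looks like in practice, assuming the file is saved as UTF-8 and compiled with lualatex or xelatex (the sample words are just illustrations, and loading fontspec is optional here):

```latex
\documentclass{article}
\usepackage{fontspec} % XeLaTeX or LuaLaTeX only; the default font is Latin Modern
\begin{document}
% Accented characters are typed directly, with no inputenc or fontenc.
Cálculo, Haščič, Rädermacher, Ràdon, Râmuz
\end{document}
```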

Bibliography

The bibliography is not handled by the (La)TeX engine but by a separate tool, classically bibtex. If you are free to choose, you don't need to use bibtex anymore: you can use an alternative bibliography backend, biber, together with a different frontend (package) called biblatex.

\documentclass{article}
\usepackage[backend=biber]{biblatex}
%\addbibresource{bibliography.bib}

\begin{document}
•
\end{document}
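
A slightly fuller sketch, in case it helps: the file name mybib.bib, the entries, and the authoryear style are placeholders of my own, not requirements. Compiling with pdflatex (or lualatex/xelatex), then biber, then the engine again should list Rädermacher next to Randall rather than after Rybczynski, since biber sorts with Unicode-aware collation.

```latex
\documentclass{article}
\begin{filecontents}[overwrite]{mybib.bib}
% Names entered directly in UTF-8; no brace-escaped accents needed for biber.
@misc{r1,author={Randall},year=3000,title={Thoughts}}
@misc{r2,author={Rädermacher},year=3000,title={Thoughts}}
@misc{r6,author={Rybczynski},year=3000,title={Thoughts}}
\end{filecontents}
\usepackage[T1]{fontenc} % only relevant under pdfLaTeX
\usepackage[backend=biber,style=authoryear]{biblatex}
\addbibresource{mybib.bib}
\begin{document}
\nocite{*}
\printbibliography
\end{document}
```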

Glossaries and the like (makeindex)

If you use anything that relies on makeindex, you might want to switch to xindy, at least if you want non-Latin characters in the collated list. It doesn't really matter which TeX engine you use, e.g.:

\documentclass{article}
% xindy option, s.t. a .xdy file is generated
\usepackage[xindy]{glossaries}
\usepackage{iftex}
\ifPDFTeX
  \usepackage[T1]{fontenc}
\else
  \usepackage{fontspec}
\fi

%\makenoidxglossaries
\makeglossaries

\newglossaryentry{Radermacher}
{
  name=Rädermacher,
  description={foo}
}

\newglossaryentry{Radon}
{
  name=Ràdon,
  description={bar}
}
\newglossaryentry{Ramos}
{
  name=Rámos,
  description={baz},
}
\newglossaryentry{Ramuz}
{
  name=Râmuz,
  description={buzz},
}
\newglossaryentry{Rybczinski}
{
  name=Rybczinski,
  description={basse},
}

\begin{document}

\glsaddall
\printglossaries

\end{document}

Running latex mwe, texindy mwe and latex mwe will output the entries in the correct order. Actually, the sorting is language-specific: some languages sort "a, ä, á, ...", while others sort "a, ..., z, ä, á, ...". xindy (and LuaTeX/XeTeX) uses -L to specify the language.

Again, you can circumvent this issue and use {\'a} or the sort=Ramuz key to make it somewhat work, but this will hardcode one sorting and not allow for a flexible change in language as xindy allows.
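
For illustration, the sort key variant would look roughly like this for one of the entries above. The value Ramuz is simply the unaccented spelling, chosen by hand, which is exactly the hardcoding mentioned:

```latex
\newglossaryentry{Ramuz}
{
  name=Râmuz,         % what is printed
  sort=Ramuz,         % what the indexing tool sorts by (fixed, language-independent)
  description={buzz}
}
```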

In Journals

Some journals force you to use pdfLaTeX, which, by now, at least uses utf8 as the default input encoding. However, you still need the correct mapping of input characters to font glyphs, and thus the fontenc package, and you might have issues with other packages such as listings when using Unicode symbols. I usually simply use iftex in my preamble and check for pdfTeX to load these helper packages. That way I can mostly just write in Unicode and only have to think a bit harder about the problem when submitting to journals.
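
A sketch of such a preamble, under some assumptions of mine: the literate mapping for listings is a commonly used workaround rather than the only option, and it has to name every accented character that may appear in a listing (only three are shown here as examples).

```latex
\documentclass{article}
\usepackage{iftex}
\usepackage{listings}
\ifPDFTeX
  \usepackage[T1]{fontenc}
  % Under pdfLaTeX, tell listings how to typeset UTF-8 accented characters.
  \lstset{literate={á}{{\'a}}1 {ä}{{\"a}}1 {ç}{{\c{c}}}1}
\else
  \usepackage{fontspec} % XeLaTeX or LuaLaTeX: Unicode is handled natively
\fi
\begin{document}
Cálculo, Rädermacher.
\begin{lstlisting}
x = "cálculo"
\end{lstlisting}
\end{document}
```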

ConTeXt

This is not LaTeX but an altogether different TeX macro package, which uses LuaTeX in the backend (thus UTF-8 native) and doesn't need any external tools like bibtex/biber or makeindex/xindy to work. Basically it is the "no frills, no need to worry" variant:

\definesynonyms[glossary][glossaries]
\glossary[Rädermacher]{Rädermacher}{foo}
\glossary[Ràdon]{Ràdon}{bar}
\glossary[Rámos]{Rámos}{baz}
\glossary[Râmuz]{Râmuz}{buzz}
\glossary[Rybczinski]{Rybczinski}{basse}

\starttext
\completelistofglossaries[criterium=all]
\stoptext

Switching to ConTeXt isn't all that easy though, even if many things are arguably better "designed".

You seem to make the assumption that the OP uses pdfLaTeX -- and not either XeLaTeX or LuaLaTeX. I must confess I did not pick up on that. I thought the question was about whether to write cálculo or c\'alculo (and, by extension, about whether to input accented characters directly or via some TeX-provided work-around). To me, that's separate from which engine is in use. — Mico, May 20 '20 at 22:19
I assumed that the OP was using pdflatex because otherwise the "to be or not to be" dilemma of \usepackage[utf8]{inputenc} makes little sense, even if today you can have this line without iftex and survive a xelatex compilation. — Fran, May 21 '20 at 11:24
@Mico I assumed that because of the use of the inputenc package, which is not needed with those engines (although modern pdfLaTeX doesn't need it either, but it still needs fontenc). However, bibtex7 still has some issues with non 7-bit ASCII clean input and bibtex8 has problems with collations. — ljrk, May 21 '20 at 11:33