
I'm using TeX with the plain format. My keyboard has the 'è' character (as well as é, ò, à, ù, ì). I'd like to set things up so that, when I put è in the input file, TeX transforms it into \`e, i.e. the letter e with an accent. I think I have to change the character code of è or do something like that, but I don't know exactly what to do. Can you help me?

jarnosc
User
  • It depends on what encoding you use for saving the file. – egreg Mar 05 '14 at 17:22
  • @egreg My text editor allows me (or, at least, I think so) only UTF-8. Is it the right encoding? – User Mar 05 '14 at 17:26
  • There is no "right" or "wrong" encoding. The problem is to know which one. I have some code for this, but you'll have to wait till I can get my hands over it. – egreg Mar 05 '14 at 17:28
  • @egreg Ah, sorry, I misunderstood you. I realize only now that what you meant is that it is the solution that depends on the encoding. I can wait, thanks :) – User Mar 05 '14 at 17:31
  • You can also have it so that you just input é, and you get é with a Unicode-aware engine such as XeTeX or LuaTeX and an OpenType font. – morbusg Mar 05 '14 at 21:05
  • @egreg What did you mean by "which one"? Should we use the font encoding for the file? – blmayer Jan 15 '16 at 04:46
  • @BrianMayer I meant “which one you used for saving your file”. – egreg Jan 15 '16 at 09:12
  • Thanks @egreg. This is very nice and simple, so it is a matter of using the same encoding for the file as for the font. I tried it and it worked exquisitely. Moreover, there's a post related to that here. – blmayer Jan 15 '16 at 15:58

2 Answers


If you want to use Knuth TeX you'll have a hard time. With pdftex it's easier, because there are some useful features coming from e-TeX extensions.

Here's a seemingly working setup (I include only a reduced version of the first file, because of the character limit here).

utfplainmac.tex

% -*- coding: utf-8 -*-
% We set a safe catcode for ^ and ^^^; XeTeX uses the ^^^^ convention for
% specifying arbitrary 16 bit code points. So if XeTeX is used, \gobble
% eats up ^^^^0021, while with an 8 bit engine only ^^^ is
% swallowed and \next is not \empty. In the end, \ifunicodeengine is \iftrue if
% the engine is Unicode aware, it is \iffalse if the engine is 8 bit.

\catcode`\^=7 \catcode`~=\active

\newif\ifunicodeengine
\begingroup
\catcode30=12 % just in case: 30 is `^^^
\def\gobble#1{}
\edef\next{\gobble^^^^0021}
\expandafter\endgroup
\ifx\next\empty\unicodeenginetrue\else\unicodeenginefalse\fi

\message{Engine is \ifunicodeengine Unicode aware\else 8 bit\fi, loading UTF-8 combinations}

\ifunicodeengine
%%% Make the first argument active and define it as the fourth
%%% The trick avoids a global definition: the \lowercase changes
%%% ~ into #1 as active character; then \endgroup\def#1 is put
%%% back into the token stream (here #1 stands for the actual
%%% character given as argument); the same trick is used for
%%% \UseUnicodeCharacter, which must have an argument expressed
%%% as a four digit hexadecimal number (with uppercase A..F).
\def\DoUTFCombination#1#2#3#4{\catcode"#1\active
  \begingroup\lccode`~="#1\lowercase{\endgroup\def~}{#4}}
\def\UseUnicodeCharacter#1{\begingroup\lccode`~="#1\lowercase{\endgroup~}}
\else
%%% The UTF-8 prefixes are made active; they just look at the
%%% following token, which is a category 12 character unless something
%%% strange has happened, and forms with it a control sequence that
%%% will be defined later
\catcode`\^^c2=\active \def^^c2#1{\csname UTFprefix-c2#1\endcsname}
\catcode`\^^c3=\active \def^^c3#1{\csname UTFprefix-c3#1\endcsname}
\catcode`\^^c4=\active \def^^c4#1{\csname UTFprefix-c4#1\endcsname}
\catcode`\^^c5=\active \def^^c5#1{\csname UTFprefix-c5#1\endcsname}
\catcode`\^^c6=\active \def^^c6#1{\csname UTFprefix-c6#1\endcsname}
\catcode`\^^c7=\active \def^^c7#1{\csname UTFprefix-c7#1\endcsname}
\catcode`\^^c8=\active \def^^c8#1{\csname UTFprefix-c8#1\endcsname}
\catcode`\^^cb=\active \def^^cb#1{\csname UTFprefix-cb#1\endcsname}

%%% If the file is input by a UTF-8 unaware engine, we define the main
%%% command that associates the UTF-8 character (actually a two byte
%%% combination) to a list of tokens; we define also
%%% \UseUnicodeCharacter to access the same replacement text via an
%%% auxiliary macro \UTFCodePoint-xxxx, where xxxx stands for the
%%% argument to \UseUnicodeCharacter, a four digit hexadecimal number
%%% (uppercase A..F).
\def\DoUTFCombination#1#2#3#4{%
  \expandafter\def\csname UTFprefix-#2#3\endcsname{#4}%
  \expandafter\def\csname UTFCodePoint-#1\endcsname{#4}%
}
\def\UseUnicodeCharacter#1{\csname UTFCodePoint-#1\endcsname}
\fi

%%% Some (actually many) UTF-8 characters cannot be printed with T1
%%% or TS1 encoded fonts
\newif\ifUTFwarning \UTFwarningtrue
\def\BadUTF#1{%
  \ifUTFwarning
    \global\UTFwarningfalse
    \errhelp{Look in the log file for unsupported characters}%
    \errmessage{Unsupported UTF character}%
  \fi
  \wlog{Character #1 not currently supported on line \the\inputlineno}%
}

%%% A shorthand for choosing the text companion font
\def\tcsym#1{{\tcfont\char#1}}

%%% The list of characters: Unicode code point, prefix and second
%%% byte, then the definition.
\DoUTFCombination{00A0}{c2}{^^a0}{~} % NO-BREAK SPACE
\DoUTFCombination{00A1}{c2}{^^a1}{!`} % INVERTED EXCLAMATION MARK
\DoUTFCombination{00A2}{c2}{^^a2}{\tcsym{"8B}} % CENT SIGN
\DoUTFCombination{00A3}{c2}{^^a3}{\pound} % POUND SIGN
\DoUTFCombination{00A4}{c2}{^^a4}{\tcsym{"A4}} % CURRENCY SIGN
\DoUTFCombination{00A5}{c2}{^^a5}{\tcsym{"A5}} % YEN SIGN
\DoUTFCombination{00A6}{c2}{^^a6}{\tcsym{"A6}} % BROKEN BAR
\DoUTFCombination{00A7}{c2}{^^a7}{\tcsym{"A7}} % SECTION SIGN
\DoUTFCombination{00A8}{c2}{^^a8}{\"{}} % DIAERESIS
\DoUTFCombination{00A9}{c2}{^^a9}{\tcsym{"A9}} % COPYRIGHT SIGN
\DoUTFCombination{00AA}{c2}{^^aa}{\tcsym{"AA}} % FEMININE ORDINAL INDICATOR
\DoUTFCombination{00AB}{c2}{^^ab}{<<} % LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
%%% 00AC to 00BA omitted
\DoUTFCombination{00BB}{c2}{^^bb}{>>} % RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
\DoUTFCombination{00BC}{c2}{^^bc}{\tcsym{"BC}} % VULGAR FRACTION ONE QUARTER
\DoUTFCombination{00BD}{c2}{^^bd}{\tcsym{"BD}} % VULGAR FRACTION ONE HALF
\DoUTFCombination{00BE}{c2}{^^be}{\tcsym{"BE}} % VULGAR FRACTION THREE QUARTERS
\DoUTFCombination{00BF}{c2}{^^bf}{?`} % INVERTED QUESTION MARK

\DoUTFCombination{00C0}{c3}{^^80}{\`A} % LATIN CAPITAL LETTER A WITH GRAVE
\DoUTFCombination{00C1}{c3}{^^81}{\'A} % LATIN CAPITAL LETTER A WITH ACUTE
\DoUTFCombination{00C2}{c3}{^^82}{\^A} % LATIN CAPITAL LETTER A WITH CIRCUMFLEX
\DoUTFCombination{00C3}{c3}{^^83}{\~A} % LATIN CAPITAL LETTER A WITH TILDE
\DoUTFCombination{00C4}{c3}{^^84}{\"A} % LATIN CAPITAL LETTER A WITH DIAERESIS
\DoUTFCombination{00C5}{c3}{^^85}{\AA} % LATIN CAPITAL LETTER A WITH RING ABOVE
\DoUTFCombination{00C6}{c3}{^^86}{\AE} % LATIN CAPITAL LETTER AE
\DoUTFCombination{00C7}{c3}{^^87}{\c{C}} % LATIN CAPITAL LETTER C WITH CEDILLA
\DoUTFCombination{00C8}{c3}{^^88}{\`E} % LATIN CAPITAL LETTER E WITH GRAVE
\DoUTFCombination{00C9}{c3}{^^89}{\'E} % LATIN CAPITAL LETTER E WITH ACUTE
\DoUTFCombination{00CA}{c3}{^^8a}{\^E} % LATIN CAPITAL LETTER E WITH CIRCUMFLEX
\DoUTFCombination{00CB}{c3}{^^8b}{\"E} % LATIN CAPITAL LETTER E WITH DIAERESIS
\DoUTFCombination{00CC}{c3}{^^8c}{\`I} % LATIN CAPITAL LETTER I WITH GRAVE
\DoUTFCombination{00CD}{c3}{^^8d}{\'I} % LATIN CAPITAL LETTER I WITH ACUTE
\DoUTFCombination{00CE}{c3}{^^8e}{\^I} % LATIN CAPITAL LETTER I WITH CIRCUMFLEX
\DoUTFCombination{00CF}{c3}{^^8f}{\"I} % LATIN CAPITAL LETTER I WITH DIAERESIS
\DoUTFCombination{00D0}{c3}{^^90}{\DH} % LATIN CAPITAL LETTER ETH
\DoUTFCombination{00D1}{c3}{^^91}{\~N} % LATIN CAPITAL LETTER N WITH TILDE
\DoUTFCombination{00D2}{c3}{^^92}{\`O} % LATIN CAPITAL LETTER O WITH GRAVE
\DoUTFCombination{00D3}{c3}{^^93}{\'O} % LATIN CAPITAL LETTER O WITH ACUTE
\DoUTFCombination{00D4}{c3}{^^94}{\^O} % LATIN CAPITAL LETTER O WITH CIRCUMFLEX
\DoUTFCombination{00D5}{c3}{^^95}{\~O} % LATIN CAPITAL LETTER O WITH TILDE
\DoUTFCombination{00D6}{c3}{^^96}{\"O} % LATIN CAPITAL LETTER O WITH DIAERESIS
\DoUTFCombination{00D7}{c3}{^^97}{\tcsym{"D6}} % MULTIPLICATION SIGN
\DoUTFCombination{00D8}{c3}{^^98}{\O} % LATIN CAPITAL LETTER O WITH STROKE
\DoUTFCombination{00D9}{c3}{^^99}{\`U} % LATIN CAPITAL LETTER U WITH GRAVE
\DoUTFCombination{00DA}{c3}{^^9a}{\'U} % LATIN CAPITAL LETTER U WITH ACUTE
\DoUTFCombination{00DB}{c3}{^^9b}{\^U} % LATIN CAPITAL LETTER U WITH CIRCUMFLEX
\DoUTFCombination{00DC}{c3}{^^9c}{\"U} % LATIN CAPITAL LETTER U WITH DIAERESIS
\DoUTFCombination{00DD}{c3}{^^9d}{\'Y} % LATIN CAPITAL LETTER Y WITH ACUTE
\DoUTFCombination{00DE}{c3}{^^9e}{\TH} % LATIN CAPITAL LETTER THORN
\DoUTFCombination{00DF}{c3}{^^9f}{\ss} % LATIN SMALL LETTER SHARP S
\DoUTFCombination{00E0}{c3}{^^a0}{\`a} % LATIN SMALL LETTER A WITH GRAVE
\DoUTFCombination{00E1}{c3}{^^a1}{\'a} % LATIN SMALL LETTER A WITH ACUTE
\DoUTFCombination{00E2}{c3}{^^a2}{\^a} % LATIN SMALL LETTER A WITH CIRCUMFLEX
\DoUTFCombination{00E3}{c3}{^^a3}{\~a} % LATIN SMALL LETTER A WITH TILDE
\DoUTFCombination{00E4}{c3}{^^a4}{\"a} % LATIN SMALL LETTER A WITH DIAERESIS
\DoUTFCombination{00E5}{c3}{^^a5}{\aa} % LATIN SMALL LETTER A WITH RING ABOVE
\DoUTFCombination{00E6}{c3}{^^a6}{\ae} % LATIN SMALL LETTER AE
\DoUTFCombination{00E7}{c3}{^^a7}{\c{c}} % LATIN SMALL LETTER C WITH CEDILLA
\DoUTFCombination{00E8}{c3}{^^a8}{\`e} % LATIN SMALL LETTER E WITH GRAVE
\DoUTFCombination{00E9}{c3}{^^a9}{\'e} % LATIN SMALL LETTER E WITH ACUTE
\DoUTFCombination{00EA}{c3}{^^aa}{\^e} % LATIN SMALL LETTER E WITH CIRCUMFLEX
\DoUTFCombination{00EB}{c3}{^^ab}{\"e} % LATIN SMALL LETTER E WITH DIAERESIS
\DoUTFCombination{00EC}{c3}{^^ac}{\`\i} % LATIN SMALL LETTER I WITH GRAVE
\DoUTFCombination{00ED}{c3}{^^ad}{\'\i} % LATIN SMALL LETTER I WITH ACUTE
\DoUTFCombination{00EE}{c3}{^^ae}{\^\i} % LATIN SMALL LETTER I WITH CIRCUMFLEX
\DoUTFCombination{00EF}{c3}{^^af}{\"\i} % LATIN SMALL LETTER I WITH DIAERESIS
\DoUTFCombination{00F0}{c3}{^^b0}{\dh} % LATIN SMALL LETTER ETH
\DoUTFCombination{00F1}{c3}{^^b1}{\~n} % LATIN SMALL LETTER N WITH TILDE
\DoUTFCombination{00F2}{c3}{^^b2}{\`o} % LATIN SMALL LETTER O WITH GRAVE
\DoUTFCombination{00F3}{c3}{^^b3}{\'o} % LATIN SMALL LETTER O WITH ACUTE
\DoUTFCombination{00F4}{c3}{^^b4}{\^o} % LATIN SMALL LETTER O WITH CIRCUMFLEX
\DoUTFCombination{00F5}{c3}{^^b5}{\~o} % LATIN SMALL LETTER O WITH TILDE
\DoUTFCombination{00F6}{c3}{^^b6}{\"o} % LATIN SMALL LETTER O WITH DIAERESIS
\DoUTFCombination{00F7}{c3}{^^b7}{\tcsym{"F6}} % DIVISION SIGN
\DoUTFCombination{00F8}{c3}{^^b8}{\o} % LATIN SMALL LETTER O WITH STROKE
\DoUTFCombination{00F9}{c3}{^^b9}{\`u} % LATIN SMALL LETTER U WITH GRAVE
\DoUTFCombination{00FA}{c3}{^^ba}{\'u} % LATIN SMALL LETTER U WITH ACUTE
\DoUTFCombination{00FB}{c3}{^^bb}{\^u} % LATIN SMALL LETTER U WITH CIRCUMFLEX
\DoUTFCombination{00FC}{c3}{^^bc}{\"u} % LATIN SMALL LETTER U WITH DIAERESIS
\DoUTFCombination{00FD}{c3}{^^bd}{\'y} % LATIN SMALL LETTER Y WITH ACUTE
\DoUTFCombination{00FE}{c3}{^^be}{\th} % LATIN SMALL LETTER THORN
\DoUTFCombination{00FF}{c3}{^^bf}{\"y} % LATIN SMALL LETTER Y WITH DIAERESIS

%%%% Other characters omitted

\endinput

plain-t1.tex

\catcode`@=11

\input utfplainmac

\message{Loading EC fonts}

\font\tenrm=ecrm1000 % roman text
\font\tctenrm=tcrm1000
\font\sevenrm=ecrm0700
\font\fiverm=ecrm0500

\font\tenbf=ecbx1000 % boldface extended
\font\tctenbf=tcbx1000
\font\sevenbf=ecbx0700
\font\fivebf=ecbx0500

\font\tentt=ectt1000 % typewriter
\font\tctentt=tctt1000

\font\tensl=ecsl1000 % slanted roman
\font\tctensl=tcsl1000

\font\tenit=ecti1000 % text italic
\font\tctenit=tcti1000

% \font\tenrm=ptmr8t % roman text
% \font\sevenrm=ptmr8t at 7pt
% \font\fiverm=ptmr8t at 5pt

% \font\tenbf=ptmb8t % boldface extended
% \font\sevenbf=ptmb8t at 7pt
% \font\fivebf=ptmb8t at 5pt

% \font\tentt=pcrr8t % typewriter

% \font\tensl=ptmro8t % slanted roman

% \font\tenit=ptmri8t % text italic

% \textfont0=\tenrm \scriptfont0=\sevenrm \scriptscriptfont0=\fiverm
% \textfont1=\teni \scriptfont1=\seveni \scriptscriptfont1=\fivei
% \textfont2=\tensy \scriptfont2=\sevensy \scriptscriptfont2=\fivesy
% \textfont3=\tenex \scriptfont3=\tenex \scriptscriptfont3=\tenex
% \textfont\itfam=\tenit
% \textfont\slfam=\tensl
% \textfont\bffam=\tenbf \scriptfont\bffam=\sevenbf
% \scriptscriptfont\bffam=\fivebf
% \textfont\ttfam=\tentt

\def\rm{\fam\z@\let\tcfont\tctenrm\tenrm}
\def\it{\fam\itfam\let\tcfont\tctenit\tenit}
\def\sl{\fam\slfam\let\tcfont\tctensl\tensl}
\def\bf{\fam\bffam\let\tcfont\tctenbf\tenbf}
\def\tt{\fam\ttfam\let\tcfont\tctentt\tentt}

% set the font
\rm

\catcode`@=11

% special characters
\chardef\pound="BF
\chardef\IJ="9C
\chardef\ij="BC
\chardef\L="8A
\chardef\l="AA
\chardef\DH="D0
\chardef\dh="F0
\chardef\TH="DE
\chardef\th="FE
\chardef\NG="8D
\chardef\ng="AD
\chardef\AA="C5
\chardef\aa="E5
\chardef\AE="C6
\chardef\ae="E6
\chardef\OE="D7
\chardef\oe="F7
\chardef\O="D8
\chardef\o="F8
\chardef\SS="DF
\chardef\ss="FF
\chardef\i="19
\chardef\j="1A
\let\DJ=\DH
\chardef\dj="9E

\def\@firstoftwo#1#2{#1}
\def\@secondoftwo#1#2{#2}
\def\@ifundefined#1{\expandafter\ifx\csname#1\endcsname\relax
  \expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi}

%%% \make@ec@accent is syntactic sugar; for example
%%% \make@ec@accent\x{abc} is equivalent to
%%%
%%% \def\x#1{\@ifundefined{ec@abc@\detokenize{#1}}
%%%   {\csname ec@abc\endcsname{#1}}{\csname ec@abc@#1\endcsname}}
%%%
%%% Thus a call like \x{y} looks whether \ec@abc@y is defined; if it
%%% is, then use it, otherwise resort to \ec@abc{y}, where \ec@abc is
%%% the general accent command. In this way we can define \x{y} to
%%% print a single character, for hyphenation purposes, for example.
\def\make@ec@accent#1#2{%
  \def#1##1{\@ifundefined{ec@#2@\detokenize{##1}}
    {\csname ec@#2\endcsname{##1}}{\csname ec@#2@##1\endcsname}}}
\make@ec@accent\`{grave}
\make@ec@accent\'{acute}
\make@ec@accent\^{circumflex}
\make@ec@accent\~{tilde}
\make@ec@accent\"{dieresis}
\make@ec@accent\H{doubleacute}
\make@ec@accent\r{ring}
\make@ec@accent\v{caron}
\make@ec@accent\u{breve}
\make@ec@accent\={macron}
\make@ec@accent\.{dotabove}
\make@ec@accent\c{cedilla}
\make@ec@accent\k{ogonek}

%%% Now we define the accents; for example \ec@grave is defined as
%%% \` is in Plain TeX, except for the code point of the accent. But
%%% we define also \ec@grave@A to print just a character which will
%%% then participate in hyphenation and kerning. The same for all
%%% other characters which are available in T1 encoded fonts. In some
%%% special cases we provide also some complicated definition, to
%%% cover peculiar situations (like \c{g}, where the cedilla should go
%%% over the g).

% grave accent
\def\ec@grave#1{{\accent"0 #1}}
\chardef\ec@grave@A="C0
\chardef\ec@grave@a="E0
\chardef\ec@grave@E="C8
\chardef\ec@grave@e="E8
\chardef\ec@grave@I="CC
\chardef\ec@grave@i="EC
\chardef\ec@grave@O="D2
\chardef\ec@grave@o="F2
\chardef\ec@grave@U="D9
\chardef\ec@grave@u="F9

% acute accent
\def\ec@acute#1{{\accent"1 #1}}
\chardef\ec@acute@A="C1
\chardef\ec@acute@a="E1
\chardef\ec@acute@E="C9
\chardef\ec@acute@e="E9
\chardef\ec@acute@I="CD
\chardef\ec@acute@i="ED
\chardef\ec@acute@C="82
\chardef\ec@acute@c="A2
\chardef\ec@acute@L="88
\chardef\ec@acute@l="A8
\chardef\ec@acute@N="8B
\chardef\ec@acute@n="AB
\chardef\ec@acute@O="D3
\chardef\ec@acute@o="F3
\chardef\ec@acute@R="8F
\chardef\ec@acute@r="AF
\chardef\ec@acute@S="91
\chardef\ec@acute@s="B1
\chardef\ec@acute@U="DA
\chardef\ec@acute@u="FA
\chardef\ec@acute@Z="99
\chardef\ec@acute@z="B9

% circumflex accent
\def\ec@circumflex#1{{\accent"2 #1}}
\chardef\ec@circumflex@A="C2
\chardef\ec@circumflex@a="E2
\chardef\ec@circumflex@E="CA
\chardef\ec@circumflex@e="EA
\chardef\ec@circumflex@I="CE
\chardef\ec@circumflex@i="EE
\chardef\ec@circumflex@O="D4
\chardef\ec@circumflex@o="F4
\chardef\ec@circumflex@U="DB
\chardef\ec@circumflex@u="FB

% tilde accent
\def\ec@tilde#1{{\accent"3 #1}}
\chardef\ec@tilde@A="C3
\chardef\ec@tilde@a="E3
\chardef\ec@tilde@N="D1
\chardef\ec@tilde@n="F1
\chardef\ec@tilde@O="D5
\chardef\ec@tilde@o="F5

% dieresis
\def\ec@dieresis#1{{\accent"4 #1}}
\chardef\ec@dieresis@A="C4
\chardef\ec@dieresis@a="E4
\chardef\ec@dieresis@E="CB
\chardef\ec@dieresis@e="EB
\chardef\ec@dieresis@I="CF
\chardef\ec@dieresis@i="EF
\chardef\ec@dieresis@O="D6
\chardef\ec@dieresis@o="F6
\chardef\ec@dieresis@U="DC
\chardef\ec@dieresis@u="FC
\chardef\ec@dieresis@Y="98
\chardef\ec@dieresis@y="B8 % T1 slot of ydieresis ("A8 is lacute)

% double acute (hungarian umlaut)
\def\ec@doubleacute#1{{\accent"5 #1}}
\chardef\ec@doubleacute@O="8E
\chardef\ec@doubleacute@o="AE
\chardef\ec@doubleacute@U="96 % T1 slot of Uhungarumlaut ("97 is Uring)
\chardef\ec@doubleacute@u="B6 % T1 slot of uhungarumlaut ("B7 is uring)

% ring
\def\ec@ring#1{{\accent"6 #1}}
% \chardef\ec@ring@A="C5
% \chardef\ec@ring@a="E5
\chardef\ec@ring@U="97
\chardef\ec@ring@u="B7

% caron
\def\ec@caron#1{{\accent"7 #1}}
\chardef\ec@caron@C="83
\chardef\ec@caron@c="A3
\chardef\ec@caron@D="84
\chardef\ec@caron@d="A4
\chardef\ec@caron@E="85
\chardef\ec@caron@e="A5
\chardef\ec@caron@L="89
\chardef\ec@caron@l="A9
\chardef\ec@caron@N="8C
\chardef\ec@caron@n="AC
\chardef\ec@caron@R="90
\chardef\ec@caron@r="B0
\chardef\ec@caron@S="92
\chardef\ec@caron@s="B2
\chardef\ec@caron@T="94
\chardef\ec@caron@t="B4
\chardef\ec@caron@Z="9A
\chardef\ec@caron@z="BA

% breve
\def\ec@breve#1{{\accent"8 #1}}
\chardef\ec@breve@G="87
\chardef\ec@breve@g="A7

% macron
\def\ec@macron#1{{\accent"9 #1}}

% dot above
\def\ec@dotabove#1{{\accent"A #1}}
\chardef\ec@dotabove@Z="9B
\chardef\ec@dotabove@z="BB

% cedilla
\def\ec@cedilla#1{{\setbox\z@\hbox{#1}\ifdim\ht\z@=1ex\accent"0B #1%
  \else\ooalign{\unhbox\z@\crcr\hidewidth\char"0B\hidewidth}\fi}}
\chardef\ec@cedilla@C="C7
\chardef\ec@cedilla@c="E7
\chardef\ec@cedilla@S="93
\chardef\ec@cedilla@s="B3
\chardef\ec@cedilla@T="95
\chardef\ec@cedilla@t="B5
\def\ec@cedilla@g{\accent`\`g}

% ogonek
\def\ec@ogonek#1{{\ooalign{\null#1\crcr\hidewidth\char"0C\hidewidth}}}
\chardef\ec@ogonek@A="81
\chardef\ec@ogonek@a="A1
\chardef\ec@ogonek@E="86
\chardef\ec@ogonek@e="A6
%%% lowercase u is special
\def\ec@ogonek@u{{\ooalign{\null u\crcr\hidewidth\char"0C}}}

% bar under
\def\b#1{{\o@lign{\relax#1\crcr\hidewidth\sh@ft{-3ex}%
  \vbox to.2ex{\hbox{\char"09}\vss}\hidewidth}}}

%%% A special purpose macro
% catalan dot
\def\c@talandot#1{\kern#1em\llap{$\m@th\cdot$}\kern-#1em}
\def\Lmiddledot{L\c@talandot{-.1}}
\def\lmiddledot{l\c@talandot{.15}}

\catcode`@=12

\endinput

test.tex

% -*- coding: utf-8 -*-
\input plain-t1

Here are some characters: $\alpha \Gamma$ æ Æ \'x

Ǎ Ǹ ă ŭ ā ē ü ł Ł ý ß Ş Ģ \c{g} ø ǽ Ǽ Ů ů ¡ ¿ ę Ą Ǫ ǫ Ų ų Į į

ĿL Ŀl ŀl

Český Krumlov (německy Böhmisch Krumau, popřípadě Krummau) je okresní město v Jihočeském kraji, zhruba 25 km jižně od Českých Budějovic. Rozkládá se pod hřebenem Blanského lesa a protéká jím řeka Vltava. Jedná se o významné turistické centrum Jižních Čech. Středověké centrum města, které obklopuje meandry Vltavy, je od roku 1963 městskou památkovou rezervací a od roku 1992 je zapsáno na seznamu světového dědictví UNESCO. V roce 2003 bylo městskou památkovou zónou vyhlášeno předměstí Plešivec (jižně od historického jádra).

Český Krumlov é uma cidade República Checa na lista da UNESCO como Patrimônio da Humanidade. Se encontra na Boêmia do Sul (região), é a capital antiga da região de Rosenberg, a nobreza mais rica e influente do país. A construção da cidade e seu Castelo começou no Século XIII. A população da cidade em 2005 era de 13942 habitantes, e a área de uns 22 km².

Český Krumlov (13.942 abitanti) è una città della Boemia meridionale, in Repubblica Ceca, molto conosciuta per la raffinata architettura del centro storico e per il Castello. Era conosciuta come Krumau fino alla Seconda guerra mondiale quando alla fine furono espulsi gli abitanti di lingua tedesca. Český Krumlov letteralmente significa ``Krumlov Ceca (Boema)''; ne esiste infatti anche una morava.

\UseUnicodeCharacter{00C8}

\bye

The omitted list (messages here are limited to 30000 characters) shouldn't cause the test example to go wrong.
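
For orientation, entries in the omitted part presumably follow the same four-argument pattern as the ones shown. The lines below are only a sketch under that assumption: they are not copied from the full file, and the choice of `\v{...}` as replacement text is a guess based on the caron definitions in plain-t1.tex. The byte pairs are the standard UTF-8 encodings of the listed code points.

```tex
% -*- coding: utf-8 -*-
% Hypothetical additions, modelled on the entries in utfplainmac.tex.
% Arguments: code point, UTF-8 prefix byte, second byte, replacement text.
\input plain-t1

\DoUTFCombination{010C}{c4}{^^8c}{\v{C}} % LATIN CAPITAL LETTER C WITH CARON
\DoUTFCombination{010D}{c4}{^^8d}{\v{c}} % LATIN SMALL LETTER C WITH CARON
\DoUTFCombination{011B}{c4}{^^9b}{\v{e}} % LATIN SMALL LETTER E WITH CARON
\DoUTFCombination{0159}{c5}{^^99}{\v{r}} % LATIN SMALL LETTER R WITH CARON
\DoUTFCombination{017E}{c5}{^^be}{\v{z}} % LATIN SMALL LETTER Z WITH CARON

Český, řeč, věž

\bye
```

With an 8-bit engine this only works for prefix bytes that utfplainmac.tex has made active (c2 to c8 and cb in the reduced version above).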

Compile test.tex with pdftex.

I also have a plain-cmu.tex file that sets up fonts to use the CMUnicode OpenType fonts.


Hope this is useful.

Tássio
egreg
  • @Matteo Maybe I'll try polishing and uploading it. – egreg Mar 06 '14 at 10:28
  • I've only now noticed something: why are the accents in test.tex different from the accents I get in another document (one that does not input plain-t1) by writing \` or \'? Is there a way to "solve" it? I prefer the appearance of the latter (less tilted). – User Apr 14 '14 at 16:57
  • @Matteo This is too vague, sorry. – egreg Apr 14 '14 at 17:43
  • Why too vague? Check the picture, or try yourself to see the difference: http://it.tinypic.com/view.php?pic=24do9ib&s=8#.U0wsgKaLe1E The first is made with the backslash command; the second with your macro. – User Apr 14 '14 at 18:45
  • Sorry, but I don't understand. You're saying you're not using plain-t1 (which, by the way, produces the shape you like): use plain-t1. – egreg Apr 14 '14 at 21:06
  • I would like to use plain-t1, because my typing would be much simpler. But if I use plain-t1 I get the character on the right (in the picture), which is different in shape from the character on the left (in the same picture), the one I used to get before using plain-t1. I prefer the one on the left. – User Apr 15 '14 at 11:44
  • @Matteo I continue not understanding; you *can* load plain-t1.tex and type \'e for getting “é”. – egreg Apr 15 '14 at 12:07
  • If I load plain-t1.tex, then é and \'e produce the same output (the one I don't like); if instead I don't load plain-t1.tex, then \'e produces the output I like. I'd like to get the second output by typing é. – User Apr 15 '14 at 12:42
  • I've solved the problem in another way, without using plain-t1.tex. I wrote a program that searches in my .tex file all occurrences of è, é, ò,... and substitutes them with the right control sequences \`e, \'e, \`o,... Then I wrote a script that executes both my substitution program and tex. So, before my .tex file gets processed by the tex engine, all substitutions are done. – User Apr 16 '14 at 17:59
  • Check the new "answer" – User Jan 04 '16 at 21:34
  • @egreg Does the header % -*- coding: utf-8 -*- do anything? I know it from Python and there it is actually respected by the interpreter, but I don’t think that’s the case for TeX. – Henri Menke Dec 26 '16 at 18:56
  • @HenriMenke That's for Emacs – egreg Dec 26 '16 at 19:00
  • Is it OK to redistribute modified versions of "utfplainmac.tex" and "plain-t1.tex"? Is there a licence for these files? – jochen Jul 23 '18 at 07:14
  • @jochen Do whatever you want with them. – egreg Jul 23 '18 at 07:16
  • Hi, @egreg. I tried using your code in this question, but I ran into trouble. Would you be so kind as to consider answering the question? I'd appreciate that! – Joep Awinita Feb 22 '19 at 19:25
  • With éléphant \uppercase{éléphant} \bye I get éléphant éLéPHANT. How can I fix this? – touhami Dec 24 '19 at 11:24
  • @touhami Not supported, see what LaTeX does with \MakeUppercase. – egreg Dec 24 '19 at 11:56
  • thank you, i'll do :-) – touhami Dec 24 '19 at 11:58
  • how can one extend this to arabic letters? – touhami May 18 '21 at 08:13
  • @touhami Sorry, but I know nothing about Arabic and I wouldn’t dare to say anything.. – egreg May 18 '21 at 08:38
  • OK, understood. Thank you. – touhami May 18 '21 at 09:31
  • This is amazing! I think I spotted a minor blip: I believe that LATIN SMALL LETTER I WITH ACUTE should be \'\i rather than \'i (and the same for the other lowercase i's). Otherwise, compiling on my machine gives me the accents over the i with a dot. – Tássio May 04 '23 at 01:31
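
For the wish expressed in the comments above (typing é directly while keeping the Computer Modern accents that \'e gives without plain-t1.tex), the prefix-byte idea from utfplainmac.tex can be cut down to a few lines. This is a minimal sketch, not the answer's code: it assumes an 8-bit engine such as pdftex and a file saved as UTF-8, and `\utfpair` and the `utf-c3-` prefix are names made up for this illustration.

```tex
% -*- coding: utf-8 -*-
% Minimal sketch for an 8-bit engine (pdftex) and a UTF-8 encoded file.
% \utfpair and the "utf-c3-" prefix are hypothetical names.

% The first byte of è, é, ò, ... is C3: make it active so that it reads
% the second byte and calls a control sequence named after it.
\catcode`\^^c3=\active
\def^^c3#1{\csname utf-c3-#1\endcsname}

% \utfpair{<second byte>}{<replacement>} defines that control sequence.
\def\utfpair#1#2{\expandafter\def\csname utf-c3-#1\endcsname{#2}}

\utfpair{^^a0}{\`a}% à
\utfpair{^^a8}{\`e}% è
\utfpair{^^a9}{\'e}% é
\utfpair{^^ac}{\`\i}% ì
\utfpair{^^b2}{\`o}% ò
\utfpair{^^b9}{\`u}% ù

Perché è così? Più giù, lì.

\bye
```

Since nothing else is redefined, \`e and \'e keep their plain.tex meaning, so è and \`e should look the same. A second byte that has no `\utfpair` entry silently prints nothing, and under XeTeX or LuaTeX the sketch does not apply, because there è is read as a single character.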

For comparison, here is an example with XeTeX:

\font\bodyfont="Baskerville" % or whatever you have
\bodyfont

La vérité vaut bien qu'on passe quelques années sans la trouver.

\bye
morbusg
  • How to use also the slanted version of a font, like in our case of Baskerville? If I just type \sl, then I get the slanted Computer Modern font. – User Mar 06 '14 at 09:51
  • @Matteo: XeTeX-plain uses the normal plain.tex as base, which defines \font\tensl=cmsl10 for use within \sl. IMHO having the prefix ten in font names can be a little iffy, so I suggest something like \font\slantedbodyfont="Baskerville:slant=.15" and if you want to use \sl for activation, just redefine it with, for example, \def\sl{\slantedbodyfont}. Note that for maths, you'd need to go through some lengthy redefinitions. I have something written for that which I can share if you want to. – morbusg Mar 06 '14 at 12:28
  • @Matteo: Here is a link for my math definitions (and a little other stuff in there, too) – morbusg Mar 06 '14 at 12:38
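
Put together, the suggestion from the comments above could look like the following sketch. It assumes XeTeX and an installed font that the system knows as "Baskerville"; the font name and `\slantedbodyfont` are taken from the comments, and the slant is synthesized by the engine rather than taken from a real italic or oblique face.

```tex
% XeTeX only: compile with xetex.
\font\bodyfont="Baskerville" % or whatever you have
\font\slantedbodyfont="Baskerville:slant=.15" % artificially slanted variant

% Make \sl select the slanted variant instead of plain.tex's cmsl10.
\def\sl{\slantedbodyfont}

\bodyfont
La vérité vaut bien qu'on passe {\sl quelques années} sans la trouver.

\bye
```

As noted in the comments, this only covers text; math fonts are untouched by the redefinition.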