How does LaTeX implement UTF-8?
The Unicode character é is encoded as two bytes in UTF-8, namely <C3><A9> (I'll use this <XX> notation throughout to denote bytes, even when they are character tokens for TeX). When \usepackage[utf8]{inputenc} is loaded, the byte <C3> is made active and defined to look for the following byte, because <C3> in UTF-8 introduces a two-byte character.
So LaTeX gathers <A9> and forms the control sequence
\csname u8:\string<C3>\string<A9>\endcsname
which is defined to expand to
\IeC {\@tabacckludge 'e}
One can see it from
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
\expandafter\show\csname u8:\string^^c3\string^^a9\endcsname
where ^^c3 is TeX's way to express what I denote by <C3>. On the terminal we get
> \u8:é=macro:
->\IeC {\@tabacckludge 'e}.
<recently read> \u8:é
l.4 ...r\show\csname u8:\string?\string?\endcsname
(the é in the first line is because my terminal is set up for UTF-8).
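As a side check on the two bytes used above, here is how the UTF-8 encoding of é (code point U+00E9) can be worked out by hand. This is only a sketch of the standard two-byte pattern 110xxxxx 10xxxxxx; the layout (an aligned environment, which assumes amsmath) is my own choice:

```latex
% Requires \usepackage{amsmath} for aligned and \text
\[
\begin{aligned}
\text{U+00E9} &= 233_{10} = \texttt{00011\,101001}_2 \quad\text{(padded to 11 bits)}\\
\text{byte 1} &= \texttt{110}\,\texttt{00011}_2 = \texttt{C3}_{16}\\
\text{byte 2} &= \texttt{10}\,\texttt{101001}_2 = \texttt{A9}_{16}
\end{aligned}
\]
```

The leading bits 110 of the first byte are what tell a UTF-8 reader, and hence the active <C3>, that exactly one more byte follows.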
What does \write do?
The operation \write takes a first argument denoting the output stream and a braced second argument, which is fully expanded when the write operation is actually performed. So we need to know what \IeC and \@tabacckludge do.
Adding \show\IeC and \makeatletter\show\@tabacckludge to the above example shows, on the terminal, first
> \IeC=macro:
->\ifx \protect \@typeset@protect \expandafter \@firstofone \else \noexpand \IeC \fi .
and then
> \@tabacckludge=macro:
#1->\expandafter \@changed@cmd \csname \string #1\endcsname \relax .
We would also need \@changed@cmd, but in essence \@tabacckludge 'e simply does the equivalent of \'e, since we're not in a tabbing environment.
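For convenience, the three \show commands discussed so far can be put into one file, as in the sketch below. The extra \show\@changed@cmd is my addition, not part of the original example. The definitions reported depend on the LaTeX release: on kernels much newer than the one this answer was written for, they may differ or the macros may be undefined.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
% The control sequence built from the two bytes <C3><A9>
\expandafter\show\csname u8:\string^^c3\string^^a9\endcsname
% The wrapper that checks the current meaning of \protect
\show\IeC
\makeatletter
% The accent helper and the macro it calls (the latter is an addition)
\show\@tabacckludge
\show\@changed@cmd
\makeatother
\end{document}
```

TeX stops at each \show; pressing Return continues to the next one.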
In your case, \protect is \@typeset@protect, as it is normally; so when we do
\write\outtmp{é}
we first get
\IeC{\@tabacckludge 'e}
and, since the conditional is true, this becomes
\@firstofone{\@tabacckludge 'e}
which in turn becomes
\@tabacckludge 'e
and then
\'e
This triggers a complex chain of expansions, which eventually ends in
\char233
because of the declaration
\DeclareTextComposite{\'}{T1}{e}{233}
in t1enc.def that has been loaded by saying \usepackage[T1]{fontenc}. Only now TeX actually writes something, precisely byte number 233 (in decimal), that is, byte <E9>.
It's not really a coincidence that <E9> in Latin-1 is exactly é, because the T1 encoding has many slots (though not all) in common with Latin-1.
How do we write UTF-8 with LaTeX (as opposed to (Xe|Lua)LaTeX)?
You don't want the expansion to take place:
\write\outtmp{\unexpanded{Résumé}}
or, without using \unexpanded,
\toks0={Résumé}
\write\outtmp{\the\toks0}
Example
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
\newwrite\outtmp
\immediate\openout\outtmp=\jobname.tmp
\immediate\write\outtmp{Résumé}
\immediate\write\outtmp{\unexpanded{Résumé}}
\toks0={Résumé}
\immediate\write\outtmp{\the\toks0 }
\stop
Viewing the written file with less gives
R<E9>sum<E9>
Résumé
Résumé
(again because the terminal is UTF-8). Without interpretation of the bytes I get
R<E9>sum<E9>
R<C3><A9>sum<C3><A9>
R<C3><A9>sum<C3><A9>
So the first line is the wrong one, while the other two are as expected.
Hiding Résumé in a macro, say \foo, makes things slightly harder, because the macro has to be expanded once while its contents must not be expanded any further. So
\write\outtmp{\unexpanded\expandafter{\foo}}
will do.
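A complete file for the macro case might look like the sketch below. The definition of \foo is an assumption made for illustration, and the rest mirrors the earlier example:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
\newwrite\outtmp
\immediate\openout\outtmp=\jobname.tmp
% Hypothetical macro holding the text to be written
\newcommand{\foo}{Résumé}
% \unexpanded expands tokens while looking for its brace, so the
% \expandafter fires and \foo is expanded exactly once; the resulting
% tokens are then protected from further expansion
\immediate\write\outtmp{\unexpanded\expandafter{\foo}}
\immediate\closeout\outtmp
\end{document}
```

If \foo itself contained further macros, they would be written out unexpanded as well, since only one expansion step is performed.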
What else?
If you use \protected@write, then things are different: with
\protected@write\outtmp{}{Résumé}
you get written
R\IeC {\'e}sum\IeC {\'e}
because in this case \protect is not \@typeset@protect, so the false branch is followed. The complex transformation of \@tabacckludge 'e ends up with \'e for the same reason regarding \protect. This may or may not be what you want. Either way, that token list prints as “Résumé” when it is read back and typeset.
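To try this out, something like the following sketch should do. Note that \protected@write performs a non-immediate \write, so the line is only written when a page is shipped out. That is why the example typesets some text and does not close the stream by hand. The body text is an assumption of mine.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\newwrite\outtmp
\begin{document}
\immediate\openout\outtmp=\jobname.tmp
Some text, so that a page is shipped out.
\makeatletter
% Deferred write: executed at shipout, with \protect redefined
\protected@write\outtmp{}{Résumé}
\makeatother
% No \immediate\closeout here: it would close the file before the
% deferred write is performed; the file is closed at the end of the run
\end{document}
```

With the setup described in this answer the file should then contain R\IeC {\'e}sum\IeC {\'e}; newer LaTeX releases may produce something different.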