Include files with different input encoding

Question

How can I include a .tex file that uses a different input encoding? My main file is encoded in UTF-8, but I have to include a file encoded in ISO-8859-1 (sink output from RStudio; see also the related question: https://stackoverflow.com/questions/38955337/use-sink-with-utf-8-encoding)

\documentclass[12pt]{article}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

\begin{document}

\include{Umlaute}       % ISO-8859-1 encoding

\end{document}

(to be processed by luatex)

Heiko Oberdiek · Accepted Answer · 2016-08-16T17:49:41.093

8-bit TeX engines (pdfTeX, TeX)

\inputencoding from package inputenc can also be used inside the document:

\documentclass[12pt]{article}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

\usepackage[utf8]{inputenc}

\begin{document}

\inputencoding{latin1}
\include{Umlaute}       % ISO-8859-1 encoding
\inputencoding{utf8} % back to UTF-8

\end{document}
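
If several Latin-1 files have to be included, the switch and the switch back can be wrapped in a small macro. This is only a sketch built on the example above: the name \includelatin is made up for illustration, and it assumes the rest of the document is UTF-8, as in the question.

```latex
% Preamble, after \usepackage[utf8]{inputenc}.
% \includelatin is a hypothetical helper name, not part of inputenc.
\newcommand{\includelatin}[1]{%
  \inputencoding{latin1}% read the following file as ISO-8859-1
  \include{#1}%
  \inputencoding{utf8}%   back to the encoding of the main file
}

% In the document body:
% \includelatin{Umlaute}
```

The included file is read completely before TeX returns to the remaining tokens of the macro, so the switch back to UTF-8 only takes effect after the file has ended. The same pattern should work with \input in place of \include.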

XeTeX

The input encoding can be specified with \XeTeXinputencoding inside the file that it applies to. Thus, Umlaute.tex starts with:

% Umlaute.tex
\XeTeXinputencoding ISO-8859-1

The syntax of \XeTeXinputencoding is quite obscure (missing documentation). From the source code:

The argument is scanned like a file name.
It can be surrounded by single or double quotes.
Then the name (without quotes) is compared case-insensitively against the strings auto, utf8, utf16, utf16be, utf16le, bytes. (Source: XeTeX_ext.c, function getencodingmodeandinfo.)

If the name is not one of these predefined names, it is passed to ucnv_open (ICU converter). From its documentation:

The actual name will be resolved with the alias file using a case-insensitive string comparison that ignores leading zeroes and all non-alphanumeric characters. E.g., the names UTF8, utf-8, u*T@f08 and Utf 8 are all equivalent.

The previous version of this answer used curly braces around the name. It worked only because the name was not a predefined one and was therefore passed to ucnv_open, which filtered the curly braces out. Even the odd-looking \XeTeXinputencoding}ISO-88;591{ would have worked.
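
If the included file cannot be edited, for example because it is regenerated on every run, the encoding can possibly be set from the main file instead. The sketch below assumes that the related primitive \XeTeXdefaultencoding, which sets the encoding for files opened afterwards, accepts the same argument syntax as \XeTeXinputencoding. That assumption is not confirmed by the answer above.

```latex
% Main file (UTF-8), to be processed by XeLaTeX.
\documentclass[12pt]{article}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

\begin{document}

% Assumption: files opened from here on are read as ISO-8859-1.
\XeTeXdefaultencoding "ISO-8859-1"
\include{Umlaute}       % ISO-8859-1 encoding
% Back to UTF-8 for any file opened later.
\XeTeXdefaultencoding "utf8"

\end{document}
```

Any other file that happens to be opened between the two switches (such as a font definition file) is also read as ISO-8859-1, which is harmless as long as that file is pure ASCII.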

Unicode-Engines (LuaTeX, XeTeX)

I would recode the non-UTF-8 files to UTF-8, e.g. (bash/Linux):

recode latin1..utf8 Umlaute.tex
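
If recoding is not practical, for example because the file is regenerated on every run, LuaTeX can also convert the bytes while reading. The following is a minimal sketch and not part of the answer above. It assumes LuaLaTeX with a LaTeX kernel that provides luatexbase (2015 or later) and the luacode package. The names latin1_to_utf8, \latinon and \latinoff are made up for illustration.

```latex
% Main file (UTF-8), to be processed by LuaLaTeX.
\documentclass[12pt]{article}
\usepackage{lmodern}
\usepackage[ngerman]{babel}
\usepackage{luacode}

\begin{luacode*}
-- Use whichever UTF-8 encoder this LuaTeX version provides.
local utf8char = (utf8 and utf8.char) or unicode.utf8.char

-- Latin-1 bytes 128..255 map to the Unicode code points with the same numbers.
function latin1_to_utf8(line)
  return (line:gsub("[\128-\255]", function(c)
    return utf8char(c:byte())
  end))
end
\end{luacode*}

% Hypothetical switches: register and remove the line filter.
\newcommand{\latinon}{\directlua{luatexbase.add_to_callback(
  "process_input_buffer", latin1_to_utf8, "latin1_to_utf8")}}
\newcommand{\latinoff}{\directlua{luatexbase.remove_from_callback(
  "process_input_buffer", "latin1_to_utf8")}}

\begin{document}

\latinon
\include{Umlaute}       % ISO-8859-1 encoding
\latinoff

\end{document}
```

While the filter is active, every input line is converted, so the lines of the main file between \latinon and \latinoff should contain ASCII characters only.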

@egreg \XeTeXinputencoding removes single and double quotes around the name. If the name is not a predefined name (see updated answer), then non-alphanumeric characters are filtered out, including curly braces. — Heiko Oberdiek, Aug 16 '16 at 17:51
@daleif recode latin1..utf8 is shorter than iconv -f latin1 -t utf8. — Heiko Oberdiek, Aug 16 '16 at 17:58

score 1 · Answer 2 · answered Dec 06 '20 at 21:40

I never got the \inputencoding method to work for my external text file. The question linked below shows the code I used to add support for Swedish and German characters (åäö üß) as well as Portuguese characters read from an external text file.

Link: Having problems with listings and UTF-8. Can it be fixed?
