\immediate\write\SomeStream{x} writes x to the file open in \SomeStream. I would like to write non-printable ASCII characters, such as the byte 0x01 (^^A in TeX's notation), to a file. My naive guess is:

\begingroup         % to keep the catcode change local
\catcode`\^^A=11    % ^^A stands for the literal byte 0x01
\immediate\write\SomeStream{^^A}
\endgroup

But this writes the three characters ^^A instead of the single byte 0x01. Is there a way to prevent TeX from sanitizing its output?

MWE:

\documentclass{minimal}
\newwrite\SomeStream
\immediate\openout\SomeStream NonPrintableASCII.test

\begingroup         % to keep the catcode change local
\catcode`\^^A=11    % ^^A stands for the literal byte 0x01
\immediate\write\SomeStream{^^A}
\endgroup

\immediate\closeout\SomeStream
\begin{document}
\end{document}

EDIT: My goal was to write a file and reread it with potentially crazy catcode changes (and I am using literal non-printable characters rather than the ^^A notation, to be more robust). \scantokens does the job (see below).

3 Answers

9

With some TeX engines there is no way to do this; it depends on the engine you use. Some engines can use a translation file (foo.tcx) for this purpose. For example, with pdfTeX:

pdflatex -translate-file=natural.tcx foo.tex
Leo Liu
  • thank you, I have found a different solution to my particular problem, but your answer is good! – Bruno Le Floch Jan 12 '11 at 09:22
  • This works great with \write\outputfile{^^A} etc., but is it possible to avoid the line break added by the \write? – Martin Scharrer Jun 28 '11 at 19:33
  • @Martin: In TeX every \write writes a text line to the file. It automatically adds a 0x0A at the end of the line. It seems that TeX cannot write a binary file. Even input is only possible with pdfTeX's \pdffiledump. Maybe you can store the bytes in a macro and write them all at once, but it's not convenient. For binary files, LuaTeX or \write18 may help when necessary. – Leo Liu Jun 29 '11 at 04:56
  • @LeoLiu: this does not work for the byte 127 (^^?), which is missing from natural.tcx. A better solution, pointed out to me by Hans Hagen, is to use the flag -8bit in pdftex. – Bruno Le Floch Jan 01 '12 at 19:22
4

I know that the following pattern solves the task with every available TeX engine:


\begingroup
\count0=0
\countdef\counter=0
\catcode`\^^00=11   \expandafter\xdef\csname pgfp@bin@\the\counter \endcsname{^^00}\advance\counter by1
\catcode`\^^01=11   \expandafter\xdef\csname pgfp@bin@\the\counter \endcsname{^^01}\advance\counter by1
\catcode`\^^02=11   \expandafter\xdef\csname pgfp@bin@\the\counter \endcsname{^^02}\advance\counter by1
\endgroup

Then use \csname pgfp@bin@0\endcsname to get the binary character 0, \csname pgfp@bin@1\endcsname for the binary character 1, and so on. It becomes a real mess with those characters that have a meaning in TeX, though (but I don't see another way around that problem). I am sure you can adapt it to your application.

The code above is actually an extract of pgfplotsbinary.code.tex -- the pgfplots package uses it to generate low level shadings. If needed, you can copy-paste the special handling for TeX characters from that file.

It also has a "public" interface which is ready to use. I copy-paste its API here:


% Returns a single character, which has the
% binary ASCII code '#1', with catcode 11.
%
% #1 (expands to) a number between 0 and 255 (inclusive).
%
% @see \pgfplotsgetchar Note that \pgfplotsgetchar is more powerful,
% but can't be used inside of \edef (it is not expandable) whereas
% \pgfplotscharno is.
\def\pgfplotscharno#1{\csname pgfp@bin@#1\endcsname}%

% Defines \pgfplotsretval to be the ASCII character for #1, with
% catcode 11.
%
% #1: either a number between 0 and 255 (inclusive) or a description
% of the character.
%
% Examples:
% \pgfplotsgetchar{35}
% \pgfplotsgetchar{`\#}   % code for '#'
% \pgfplotsgetchar{`\^^M} % Newline
% \pgfplotsgetchar{`\^^ff}% 255
%
% @see \pgfplotscharno
\def\pgfplotsgetchar#1{...}
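
A minimal sketch of how a similar table could be generated in a loop instead of one line per byte, assuming an 8-bit engine such as pdfTeX. It swaps in a different technique from the pgfplots code, the \lccode/\lowercase trick, so the resulting characters have catcode 12 rather than 11. The prefix mybin@ is a hypothetical name chosen for illustration, and byte 0 is left out because \lowercase leaves a character unchanged when its \lccode is 0.

\begingroup
\count0=1            % byte 0 is skipped: \lowercase ignores \lccode 0
\loop
  \lccode`\*=\count0 % map * to the character with code \count0
  % \lowercase turns the * below into that character (catcode 12);
  % mybin@ is a hypothetical prefix, not part of pgfplots
  \lowercase{\expandafter\gdef\csname mybin@\the\count0\endcsname{*}}%
  \advance\count0 by 1
\ifnum\count0<256 \repeat
\endgroup

After this, \csname mybin@1\endcsname expands to the single character with code 1. Because every character is produced from a catcode-12 *, characters that normally have a special meaning in TeX need no separate handling. As the comments below point out, a \write of such a character is still escaped unless the engine is told otherwise.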

I hope this helps.

  • This works fine within a single TeX run, but try writing to a file with \newwrite\mywrite\immediate\openout\mywrite\immediate\write\mywrite{\csname pgfp@bin@1\endcsname}\immediate\closeout\mywrite, and you will see that it produces 3 bytes. – Bruno Le Floch Apr 19 '11 at 10:57
  • OK... perhaps I misunderstood the problem then. The code above worked fine when I wrote the output directly into a PDF stream (for example with \immediate\pdfobj stream in the pdftex driver). I have just tried

    \pgfplotsgetchar{1} Here is it:`\pgfplotsretval' \newwrite\mywrite \immediate\openout\mywrite=P.dat \immediate\write\mywrite{\pgfplotsretval} \immediate\closeout\mywrite

    and it resulted in 9.746 529.281 Td [(Here)-333(is)-334(it:)-333(`\001')]TJ in the PDF text part... but as you said, the \write generated escape characters.

    – Christian Feuersänger Apr 20 '11 at 17:54
  • This works in combination with Leo Liu's answer. It actually solves the issue of handling the otherwise forbidden or otherwise used "characters". However, I don't understand why \xdef and not simple \gdef is used. – Martin Scharrer Jun 28 '11 at 19:47
  • @Martin thinking about it, I do not understand it either. Perhaps I was unsure about it when I wrote the code and never revised it because it worked as expected... apparently, \gdef will do the job as well (perhaps slightly more efficiently). – Christian Feuersänger Jun 30 '11 at 17:43
  • Note that I finally had to realize that my approach does not work at all for lualatex: it does not accept binary content this way (only via Lua macros). – Christian Feuersänger Apr 03 '12 at 19:21
3

I asked for a way to write non-printable characters to a file, and Leo Liu's answer gives a way to do that :).

Since my goal was in fact to reread the file in TeX, protecting it against changes in the catcode of ^, there is another way. The eTeX primitive \scantokens rereads its argument (this is almost equivalent to writing to a file and rereading with possibly different catcodes).

To show how it works, we first make ^^A active, and define it to something.

\documentclass{minimal}
\begin{document}
\catcode`\^^A=13
\def^^A{text}

Then we use it in a definition, and check that \scantokens does what it should: the line expands to Some text.

\def\foo{Some ^^A}
\expandafter\scantokens\expandafter{\foo}

Finally, we change the catcode of ^ to invalid (15), and repeat the process. The same result, Some text, is typeset.

\catcode`\^=15\relax
\expandafter\scantokens\expandafter{\foo}
\end{document}

The reason TeX does not complain about the characters ^^ is that they have in fact disappeared: ^^A was converted to a single character when \foo was defined.
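
To make this last point concrete, here is a small sketch of the contrast, assuming pdfLaTeX (any engine with the eTeX extensions should behave the same). Instead of catcode 15 it gives ^ catcode 12 ("other"), a choice made here only so that no error is raised. A \foo defined before the change still works, whereas a ^^A typed after the change is read as three ordinary characters.

\documentclass{minimal}
\begin{document}
\catcode`\^^A=13
\def^^A{text}
\def\foo{Some ^^A}      % ^^A becomes a single active character here

\catcode`\^=12\relax    % from now on ^^ is no longer a notation
\expandafter\scantokens\expandafter{\foo}% \foo holds one character, not ^^A

\scantokens{Some ^^A}% typed after the change: three separate characters
\end{document}

The first \scantokens line should typeset Some text as before. The second should print the characters ^^A literally, since the ^^ notation is only recognised while ^ has catcode 7.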