
Suppose I've got a poem.tex source file.

Is there a way to include a hash of the source file as text in the final PDF? I usually use pdflatex to generate PDF files.

I am looking for something like the result of shasum poem.tex, looking like this:

c0c0fa9a776ea1891264c884e0e69dbde6142c4a,

being added somewhere to the body of the output file of the pdflatex poem.tex command.

Is there an easy way to do it?

Mateusz Piotrowski
  • Related thread with a lot of useful methods: http://tex.stackexchange.com/questions/252566/calculate-the-hash-md5-or-otherwise-of-a-string – Mateusz Piotrowski Sep 01 '16 at 10:22
  • For what purpose? A hash of a source file might have no relationship to the hash of text in a .PDF after the source is included in the .PDF and later extracted. It could depend on exactly how you expect the hash to be used, possibly even on the platform used to view the .PDF. – user2338816 Sep 02 '16 at 01:29
  • @user2338816, The file is a set of mathematical equations and theorems. It will be forked, modified and used by a lot of people. The changes might be hard to spot at first glance, so I wanted to have an indicator of which version of the file was used when it gets printed. I do not use any VCS yet, so Git or SVN is not an option. – Mateusz Piotrowski Sep 02 '16 at 09:23
  • That sounds more like a job for a simple version number than for a hash. You might consider using one that way. Using a hash implies that users can obtain the same value by running the same hash function, and that might often not work. – user2338816 Sep 02 '16 at 21:56
  • @user2338816 The point is that I don't want to bother with changing the version number every time I modify the file. – Mateusz Piotrowski Sep 02 '16 at 21:57

3 Answers


pdfTeX contains the primitive \pdfmdfivesum for this purpose, available in recent XeTeX as \mdfivesum and implemented in Lua for LuaTeX in the pdftexcmds package. Using the latter as a wrapper we might have

\documentclass{article}

\usepackage{blindtext}
\usepackage{fancyhdr}
\usepackage{pdftexcmds}
\makeatletter
\ifx\pdf@filemdfivesum\undefined\def\pdf@filemdfivesum#{\mdfivesum file}\fi
\let\filesum\pdf@filemdfivesum
\makeatother
\pagestyle{fancy}
\fancyhf{}

\cfoot{\filesum{\jobname}}

\begin{document}
\blindtext[5]
\end{document}

The primitive syntax (if we assume pdfTeX/XeTeX):

\documentclass{article}

\usepackage{blindtext}
\usepackage{fancyhdr}
\makeatletter
\ifx\pdfmdfivesum\undefined
  \let\pdfmdfivesum\mdfivesum
\fi
\edef\filesum{\pdfmdfivesum file {\jobname}}
\makeatother
\pagestyle{fancy}
\fancyhf{}

\cfoot{\filesum}

\begin{document}
\blindtext[5]
\end{document}
GuM
Joseph Wright
  • This does not work with \include or \input'ed files, though. :( – lindhe May 14 '18 at 07:26
  • @lindhe Hashing all of the files is of course more tricky, but the underlying idea is the same. You just have to arrange, e.g., for \input to hash the file as it opens it. – Joseph Wright May 14 '18 at 07:52
  • @lindhe: You can use the filehook package to hook into \input etc. to calculate a hash of the subfile. I would suggest accumulating all hashes and then calculate a final hash over them. – Martin Scharrer Jan 17 '19 at 07:58
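
The idea from the comments above (hash every input file, then hash the accumulated hashes) can be sketched without filehook by listing the files explicitly. This is only a minimal sketch and assumes pdfLaTeX; the macro names \addfilesum, \allsums and \combinedsum and the file name chapter1.tex are hypothetical and not part of any package. A filehook-based version would presumably call \addfilesum from a hook instead of by hand.

```latex
\documentclass{article}

% Assumes pdfLaTeX: \pdfmdfivesum is a pdfTeX primitive.
\newcommand*\allsums{}% accumulates the per-file MD5 sums (hypothetical name)
\newcommand*\addfilesum[1]{% hypothetical helper: append the MD5 sum of file #1
  \edef\allsums{\allsums\pdfmdfivesum file {#1}}%
}

\addfilesum{\jobname.tex}% the main file
\addfilesum{chapter1.tex}% hypothetical subfile, replace with your own

% MD5 sum of the concatenated per-file sums
\edef\combinedsum{\pdfmdfivesum{\allsums}}

\begin{document}
Combined hash: \texttt{\combinedsum}
\end{document}
```

Note that the combined value depends on the order of the \addfilesum calls, so the files should always be listed in the same order.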

This stores the checksum in the macro \shasum. Each \string\\\string\\ is needed to get \\\\ into the shell stream. The --shell-escape option is required, of course. The advantage of this solution is that the result is stored in a macro, which is more natural IMHO.

\documentclass{article}

\makeatletter
\begingroup
\catcode`\%=11
\immediate\write18{printf "\string\\\string\\edef\string\\\string\\shasum{%s}" `shasum \jobname.tex | awk '{print $1}'` > \jobname.sha}
\endgroup
\input{\jobname.sha}
\makeatother

\begin{document}

The checksum is \shasum.

\end{document}
yo'
  • This is only needed with Knuth's TeX; see the answer by Joseph. – Martin Schröder Sep 06 '16 at 20:38
  • @MartinSchröder And that's a reason for a downvote? :-O – yo' Sep 06 '16 at 20:54
  • @yo' See the same comment on my post ;-) Ridiculous! –  Sep 06 '16 at 20:55
  • @yo': The answer is of course o.k., but it completely ignores the command that has been built into pdfTeX and LuaTeX (and XeTeX) for about 10 years. As such I deem it not useful. – Martin Schröder Sep 07 '16 at 08:23
  • @MartinSchröder Well, unlike the other answer, this one allows for arbitrary hash functions as long as they're available as Linux shell tools, not only MD5. So while the other answer is better, this one has its value as well. – yo' Sep 07 '16 at 09:37
  • If I may add a consideration: MD5 is deprecated on many levels nowadays. If the checksum is included in the output for security reasons, an MD5 hash is completely useless. Using a SHA checksum can be recommended, even if pdfTeX implements MD5 internally. – Nicola Gigante Mar 18 '17 at 21:31

An alternative approach for Linux/Unix based on shell escape (although there is \pdfmdfivesum file {yourfilename}).

It writes the hash to a file and reads it back into the document. The document has to be compiled with --shell-escape.

\documentclass{article}

\usepackage{blindtext}
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}

\AtBeginDocument{%
  \immediate\write18{shasum \jobname.tex | awk '{print $1}'>  \jobname.hash}
}

\cfoot{\input{\jobname.hash}}

\begin{document}
\blindtext[5]
\end{document}


A variation with reading to a macro:

\documentclass{article}

\usepackage{blindtext}
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}

\newread\hashfile

\AtBeginDocument{%
  \immediate\write18{shasum \jobname.tex | awk '{print $1}'>  \jobname.hash}
  \openin\hashfile=\jobname.hash
  \read\hashfile to \filehash
  \closein\hashfile
}

\cfoot{\filehash}

\begin{document}

\blindtext[5]
\end{document}
  • This is only needed with Knuth's TeX; see the answer by Joseph. – Martin Schröder Sep 06 '16 at 20:39
  • @MartinSchröder: And that's why it deserves a downvote? That's rather harsh. It is an alternative, no more and no less. If I downvoted every 'unnecessary' alternative posted by users, I would have a long list to pursue! The time of the downvote and the comment is very clear ... –  Sep 06 '16 at 20:51
  • I think you could avoid the temporary .hash file by using the “piped input” feature. – GuM Sep 06 '16 at 23:51
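
The "piped input" idea from the last comment might look roughly like the following. This is a hedged sketch, not a tested drop-in replacement: it assumes a web2c-based TeX (e.g. TeX Live) run with --shell-escape, and it swaps the awk call for cut -c1-40 (the 40 hex digits of a SHA-1 sum) so that no braces or dollar signs end up inside the quoted command. The macro name \filehash is reused from the variation above.

```latex
\documentclass{article}

\makeatletter
\begingroup
  \endlinechar=-1 % do not turn the end of the command output into a space
  \everyeof{\noexpand}% lets \xdef stop cleanly at the end of the piped input
  % \@@input is LaTeX's name for the primitive \input; the space after the
  % closing quote ends the "file name", which here is a shell command.
  \xdef\filehash{\@@input|"shasum \jobname.tex | cut -c1-40" }%
\endgroup
\makeatother

\begin{document}
The checksum is \texttt{\filehash}.
\end{document}
```

Because the command output is read directly, no temporary .hash file is left behind, and \filehash is already defined in the preamble rather than only at \begin{document}.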