
I am discovering LuaLaTeX...

I want the PDF files produced from my TeX sources to be as small as possible.

If I compile the following code with LuaLaTeX, the size of the generated pdf is 9 kB.

\documentclass{article}
\usepackage{titlesec}
\usepackage{titling}
\usepackage{fontspec}
% Specify different font for section headings
\newfontfamily\headingfont[]{Gill Sans}
\titleformat*{\section}{\LARGE\headingfont}
\titleformat*{\subsection}{\Large\headingfont}
\titleformat*{\subsubsection}{\large\headingfont}
\renewcommand{\maketitlehooka}{\headingfont}
\author{An author}
\title{The title of the article}
\date{\today}
\begin{document}
\maketitle
\section{A section}
\subsection{A subsection}
\subsubsection{A subsubsection}
\end{document}

If I compile the same code with LaTeX, but with a different font, the resulting file is 49 kB. I guess the difference comes not from LaTeX versus LuaLaTeX but from the fonts, something like: "my PDF system knows more about the Gill Sans font than about the default LaTeX fonts".

\documentclass{article}
\usepackage{titlesec}
\usepackage{titling}
%\usepackage{fontspec}
%% Specify different font for section headings
%\newfontfamily\headingfont[]{Gill Sans}
%\titleformat*{\section}{\LARGE\headingfont}
%\titleformat*{\subsection}{\Large\headingfont}
%\titleformat*{\subsubsection}{\large\headingfont}
%\renewcommand{\maketitlehooka}{\headingfont}
\author{An author}
\title{The title of the article}
\date{\today}
\begin{document}
\maketitle
\section{A section}
\subsection{A subsection}
\subsubsection{A subsubsection}
\end{document}

Can someone confirm this? Is there a rule to learn from this if one wants to produce very light files?


PS: If I use the same font, I get the same size!

PPS: Strangely, if I run Ghostscript on the PDF file generated by LuaLaTeX, the size increases (from 9 kB to 12 kB!). The file generated by LaTeX goes from 49 kB to 13 kB.

PPPS: In my naive understanding, using a custom font such as Gill Sans should produce a bigger file than using the standard font.

Colas
  • Wouldn't you learn more about pdfTeX vs LuaLaTeX if you used the same fonts in both documents? PS: I'm getting 103 vs 12 KB when using lmodern in both. – Nils L Oct 11 '13 at 20:07
  • Just as a side note: while "kilooctets" is a more precise term than "kilobytes", I don't know of any non-francophone region where the abbreviation "ko" is even recognized by a significant number of people. So I would discourage you from using it in English texts. – Christian Oct 11 '13 at 20:15
  • PPS: there are several tools you can use to gain insight into the 'weight' distribution of a PDF. This is what Acrobat has to say about the two examples. – Nils L Oct 11 '13 at 20:15
  • @Christian Why is it more precise? – Colas Oct 11 '13 at 20:19
  • @Colas: no matter what font you're using, it will always be embedded in the pdf. This is part of what the pdf format is all about: making a document display identically everywhere. For the software displaying the pdf, there is no such distinction as 'default vs. custom font'. (For TeX, by the way, there isn't either: there's no "knowing better about Gill Sans than about..."). What matters is the absolute size of a font, which in turn is determined by various factors: size of character set, font format, number of points in the outlines, etc. – Nils L Oct 11 '13 at 20:43
  • Thanks to @Christian for the comment about ko. I just thought that was a typo. Now it's nice to know where that comes from. – A.Ellett Oct 11 '13 at 20:47
  • @NilsL Are you sure (http://tex.stackexchange.com/a/2207/8323) ? – Colas Oct 11 '13 at 20:53
  • @Colas: yes, I'm sure: all your fonts will always be embedded, unless you explicitly disable embedding. Why would any sane wo/man do that? I have no idea. But let's get back to your original question. Didn't the two threads I referred to get you further? To be honest, I think your Q could be considered a duplicate of those. – Nils L Oct 11 '13 at 21:08
  • @Colas Because a byte isn't necessarily 8 bits (aka an octet). In the early days of computers, bytes came in all kinds of sizes. These days, 8 bits is the de facto standard of course, but still, technically speaking, octet is more precise than byte. – Christian Oct 11 '13 at 22:11
  • You should use the same font, compress the resulting PDFs with pdfsizeopt and then compare them. Some TeX implementations compress fonts while others don't. The same goes for subsetting (including only the glyphs actually used). pdfsizeopt takes care of that.

    In a second step you can extract the embedded resources with mutool from the sumatrapdf package (mutool.exe extract).

    In a last step I would use mutool to convert a binary PDF to an ASCII PDF to have a deeper look inside (mutool.exe clean -d).

    Can you post two PDF-files for those who do not have PDFLaTeX and LuaLaTeX available right now?

    – Hanseat Oct 15 '13 at 11:57

1 Answer


I use Linux Libertine heavily, and each PDF made with pdfLaTeX was several hundred KB in size. pdfTeX (which, as far as I know, is what pdfLaTeX calls) embeds the fonts in the PDF but does not compress them.

There was a wonderful tool called pdfsizeopt, written by Peter Szabo and published at Google Code. It compressed the fonts inside the PDF heavily; mine usually by a factor of 10!

Well, sadly enough, the page http://pdfsizeopt.googlecode.com/ was taken down for alleged copyright issues. The new home is here: https://github.com/pts/pdfsizeopt (Thank you, giordano). There you will find a paper by Szabo that describes the compression in great detail.

In short: the difference between LuaLaTeX and pdfLaTeX seems to lie in how the fonts are embedded, namely in the compression used.
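
To test the compression explanation on the pdfLaTeX side, one could raise pdfTeX's compression settings explicitly and compare the resulting file sizes. The sketch below is not part of the original answer. It assumes the document is compiled with pdfLaTeX (these three primitives are pdfTeX-specific), and lmodern is used only as an example of a font family that was mentioned in the comments on the question. Many TeX distributions already set some of these values by default, in which case the size will not change.

```latex
\documentclass{article}
% pdfTeX primitives: set them in the preamble, before any output is written.
\pdfminorversion=5       % write PDF 1.5, which allows object streams
\pdfcompresslevel=9      % strongest zlib compression for streams (fonts, page content)
\pdfobjcompresslevel=2   % also pack non-stream objects into compressed object streams
\usepackage[T1]{fontenc}
\usepackage{lmodern}     % example font choice (assumption), not required for the test
\usepackage{titlesec}
\usepackage{titling}
\author{An author}
\title{The title of the article}
\date{\today}
\begin{document}
\maketitle
\section{A section}
\subsection{A subsection}
\subsubsection{A subsubsection}
\end{document}
```

Setting \pdfcompresslevel=0 and \pdfobjcompresslevel=0 instead produces an uncompressed file, which gives a baseline for judging how much of the size difference is due to compression rather than to the fonts themselves.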

So, as a result of those batshit crazy laws, you'd better not publish your code on Google Code.

Keks Dose
  • New home of the pdfsizeopt project: https://github.com/pts/pdfsizeopt – giordano Oct 15 '13 at 11:43
  • @giordano This is strange: Google search does not deliver the new address. Thank you! – Keks Dose Oct 15 '13 at 11:50
  • I didn't find the Windows Binary online but I have got a version from 2012-06-27 on my computer. I am not sure if I am allowed to share it, but here are some hashes of it to compare with files you find online. Name: pdfsizeopt_win32bin.zip Size: 18.025.567 bytes CRC32: 1156BE71 MD5: 9c3dc2197089dee5895ac8c594a92627 SHA1: 40a7bd6a6e08b5f80a4058d270096bd0492b14e7 – Hanseat Oct 15 '13 at 12:18