47

Are there any limitations in pdflatex regarding the number of pages of the input files?

The largest document I have compiled so far was 90 pages, without any trouble (in 2008).

Another document is currently being prepared which will be approximately 170 pages in total.

EDIT: Thanks for all the answers. They were helpful.

mnemonic
  • 1,313

4 Answers

100

pdfTeX

The pages are stored in a page tree, a balanced tree structure. The top node contains an integer with the number of pages. The maximal integer number in PDF and TeX is 2^31 − 1 (2,147,483,647). However, indirect objects are needed to store pages in the PDF format (PDF specification):

  • 1 Page node per page;
  • 1 Resources object per page (pdfTeX generates the object for each page, even if the resource objects are the same and could be shared);
  • 1 Contents object per page (in theory equal pages could share the same object, but this is not supported by pdfTeX and is unlikely for real documents);
  • overhead of the page tree structure with additional kid nodes;
  • and a few additional objects for the document as a whole (e.g. Catalog, Info).

Thus more than 3 indirect objects are needed per page. But the number of indirect objects (PDF objects that can be referenced by their object number and are recorded in the cross-reference section) is limited to 2^23 − 1 (8,388,607).
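
A rough estimate already gives the order of magnitude. Assuming about 3 indirect objects per empty page plus the page tree nodes (pdfTeX uses 6 kids per page tree node, see the LuaTeX section below):

objects per page  ≈ 3 + 1/6 + 1/36 + ... = 3 + 1/5 = 3.2
maximal pages     ≈ (2^23 − 1) / 3.2 ≈ 2,621,440

This is close to the measured limit below.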

The following test file explores the maximal number of pages with pdfTeX. It only generates minimal pages without fonts or annotations. The pages are completely empty (\shipout\hbox{}).

% pdftex --ini test.tex
\catcode`\{=1 % iniTeX: make the braces usable
\catcode`\}=2
\pdfoutput=1 % generate PDF output
\iffalse % change to \iftrue for compression
  \pdfobjcompresslevel=2
  \pdfcompresslevel=9
\else
  \pdfobjcompresslevel=0
  \pdfcompresslevel=0
\fi
\pdfminorversion=5
\countdef\pageno=0 % current page number
\chardef\one=1
\countdef\max=255 % number of pages to generate
\max=2621437
\def\x{% ship out empty pages until \max is reached
  \advance\pageno\one
  \shipout\hbox{}%
  \ifnum\max=\pageno
    \let\x\relax
  \fi
  \x
}
\x
\end

Tested with pdfTeX 3.1415925-2.4-1.40.13 (TeX Live 2012). Result:

Pages: 2,621,437
File size: 862,082,448 bytes
PDF without object stream compression

If the number of pages is increased by one, then pdfTeX complains with an error message:

! TeX capacity exceeded, sorry [indirect objects table size=8388607].

The object stream compression of PDF-1.5 decreases the file size, but costs indirect objects for storing the object streams, which in turn decreases the maximal number of pages. For testing, replace \iffalse by \iftrue in the example above and play with the setting of \max. Result:

Pages: 2,603,538
File size: 329,412,496 bytes
PDF with object stream compression

In practice, annotations in particular cost additional objects (and therefore pages), whereas fonts can be reused throughout the document.

Summary

The theoretical maximal number of pages for PDF files with pdfTeX is 2,621,437 (empty pages and without object stream compression of PDF-1.5).

LuaTeX 0.95.0

LuaTeX is derived from pdfTeX, thus an empty page uses the same number of objects as with pdfTeX. The page tree, however, is stored with fewer objects: pdfTeX uses 6 kids per page tree node, whereas LuaTeX has increased this to 10 kids per node.

Therefore, LuaTeX can create a few more pages. The hard limit is the number of indirect objects in the PDF file (2^23 − 1).
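
The same rough estimate as for pdfTeX, now assuming 10 kids per page tree node:

objects per page  ≈ 3 + 1/10 + 1/100 + ... = 3 + 1/9 ≈ 3.11
maximal pages     ≈ (2^23 − 1) / (3 + 1/9) ≈ 2,696,338

Again this is close to the measured result below.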

Test file for LuaTeX 0.95.0 with option --ini:

\catcode`\{=1
\catcode`\}=2
\directlua{tex.enableprimitives("",{"outputmode"})}
\outputmode=1
\directlua{
  \iffalse
     pdf.setobjcompresslevel(2)
     pdf.setcompresslevel(9)
  \else
    pdf.setobjcompresslevel(0)
    pdf.setcompresslevel(0)
  \fi
  pdf.setminorversion(5)
}
\countdef\pageno=0
\chardef\one=1
\countdef\max=255
\max=2696336
\def\x{%
  \advance\pageno\one
  \shipout\hbox{}%
  \ifnum\max=\pageno
    \let\x\relax
  \fi
  \x
}
\x
\end

Result:

Pages: 2,696,336
File size: 821,034,398 bytes
PDF without object stream compression

Summary

LuaTeX can generate a PDF document with 2,696,336 empty pages.

XeTeX

XeTeX uses the program xdvipdfmx as its output driver for PDF. It generates the PDF file with empty pages in a similar way to pdfTeX or LuaTeX. However, it uses more nodes (indirect objects) for the page tree than the other TeX engines. The LuaTeX run with 2,400,000 pages took nearly a minute, but the XeTeX run took about 18 minutes.

Test file:

\catcode`\{=1
\catcode`\}=2
\countdef\pageno=0
\chardef\one=1
\countdef\max=255
\max=2400000
\def\x{%
  \advance\pageno\one
  \shipout\hbox{}%
  \ifnum\max=\pageno
    \let\x\relax
  \fi
  \x
}
\x
\end

Command call with options to turn compression off:

xetex -ini -output-driver="xdvipdfmx -C 0x0040 -z 0" test.tex

2,400,001 pages generate the error message:

xdvipdfmx:fatal: Page number 2400002l too large!

File size with 2,400,000 empty pages: 901,335,935 bytes

Version of XeTeX is 3.14159265-2.6-0.99996 (TeX Live 2016) and the version of xdvipdfmx is 20160307.

Summary

XeTeX can generate a PDF document with 2,400,000 empty pages, but it is much slower than pdfTeX or LuaTeX.

Heiko Oberdiek
  • 271,626
  • 19
    Heiko, you are incredible. I thought the question here would not lead to anything useful, but I was wrong. – Keks Dose Feb 06 '13 at 21:39
  • 5
    I compiled Song That Never Ends with 2.500.000 pages with no problem. – Paul Gaborit Feb 06 '13 at 22:30
  • 2
    Which PDF reader is recommended for viewing such a large document? – Ari Brodsky Feb 07 '13 at 08:21
  • how long did compiling that document take? – long tom Feb 12 '13 at 19:10
  • 2
    @longtom ≈ 40 sec (without compression) up to ≈ 3½ min (with full compression), in both cases with --interaction=batchmode (console output takes another 30-40 sec). However, this is more a test for the hard drive (look at the file sizes) and the Flate compression algorithm than for pdfTeX; the TeX part is quite small. – Heiko Oberdiek Feb 12 '13 at 22:13
  • 3
    @AriBrodsky Acrobat Reader. Because of its support for linearized PDF it is able to view a page without reading the whole file/objects/pages. I have not found any other PDF viewer (Foxit Reader, PDF-XChange Viewer, SumatraPDF, evince, epdfview, okular) that is capable of showing at least the first page in reasonable time. – Heiko Oberdiek Feb 13 '13 at 02:37
    Hmm, with pdflatex, TeX's capacity seems to last somewhat longer, and I get as far as 2,632,407 pages. See http://tex.stackexchange.com/questions/95783/typesetting-the-entire-song-that-never-ends#comment231978_95788. – math Apr 02 '13 at 08:01
  • 3
    @math But you get ! ==> Fatal error occurred, no output PDF file produced!. In particular, you do not have a PDF file as a result. Writing the internal structure, a tree, for the pages requires additional objects, but in your example the objects are already exhausted. – Heiko Oberdiek Apr 02 '13 at 11:11
  • Adobe Reader takes forever to open even a regularly sized PDF for me. In fact, it is so slow that it took me a long time to realise that it was not freezing when I opened a file - it was just mediating. Presumably it works better on other platforms or nobody would use it. – cfr Jul 16 '15 at 00:38
  • Could this limit be raised with Luatex, XeTeX or ConteXt? – skan Dec 16 '16 at 12:01
  • 1
    @skan I have added the analysis for LuaTeX (without compression). ConTeXt, as a macro package, is not relevant here, because higher-level macros cannot change what the low-level commands are capable of. – Heiko Oberdiek Dec 16 '16 at 14:25
  • Now, the missing analysis for XeTeX is added. – Heiko Oberdiek Dec 18 '16 at 11:37
39

TeX is designed to have very limited memory requirements (basically, once a page is shipped out it is gone). I had people generating longtable documents with tens of thousands of pages back in the 1990s, so on modern machines I don't think 170 pages is going to stress the system too much.

What is more important than page count is page complexity: if you have a high-resolution plot done in TikZ or picture mode or some such, then that ends up being an awful lot of boxes on the same page. That's why it sometimes helps to generate such things as external graphics to be included. Unless they change on each run, that will typically speed things up, even if calculating them inline doesn't exceed the memory limits.
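
One common way to do this is TikZ's externalization; a minimal sketch, assuming the external library, an existing figures/ subdirectory (the prefix is arbitrary) and compilation with -shell-escape:

\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{external}
\tikzexternalize[prefix=figures/] % each picture is compiled once and cached as a PDF
\begin{document}
\begin{tikzpicture}
  \draw (0,0) -- (1,1); % stand-in for an expensive plot
\end{tikzpicture}
\end{document}

On later runs the cached PDF is simply included again, so the expensive picture is only rebuilt when it changes.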

David Carlisle
  • 757,742
23

Other than the architectural limits of PDF (the number of indirect objects in a PDF is limited to 8 Mi as per ISO-32000:2008, and numbers are limited to 2 Gi), there are no known limits. If memory allows and your disc is large enough, pdfTeX should be able to generate PDFs up to some TiBytes. It will fail eventually, since the offset size in the compressed xref table is currently limited to 2^40 (5 bytes, i.e. about 1 TiB), but extending that limit is easy.

14

A few years ago I compiled a Beamer document with 10'000 pages and 30'000 formulas without any issues. A colleague's PhD thesis compiled nicely as well: 1'500 pages, 1'300 of them full-page images with 5–10 lines of caption each, resulting in a 4.5 GB PDF file.

So within thinkable limits there is no limit.

However, you may encounter limits on the number of concurrent auxiliary files. Creating hundreds of different indices or lists of floats may run into those limits. IIRC, TeX can only write to around 16 file streams concurrently.
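
To make that limit concrete, here is a minimal sketch in LaTeX (stream and file names are placeholders): every additional index or list needs its own output stream, and once all streams are allocated, \newwrite stops with "No room for a new \write".

\documentclass{article}
\newwrite\myliststream % claims one of the roughly 16 available write streams
\begin{document}
\immediate\openout\myliststream=mylist.lst % open the auxiliary file
\immediate\write\myliststream{Some entry}  % write an entry to it
\immediate\closeout\myliststream           % close the stream again
\end{document}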

Uwe Ziegenhagen
  • 13,168