61

What are the typesetting limitations of LaTeX? In other words, what are the things that desktop publishing and word processing programs do better than LaTeX? I am not really interested in things that are better left to the editor (spell check, grammar check) or to some post-processing program (track changes, word count), but rather in actual typesetting issues. For example, LaTeX and I really struggle with suppressing page breaks and changing paragraph size. I think even MS Word can do these things better (although it is obviously much easier when you are ignoring the badness of the paragraph). Are there other things? Is there a list or a reference somewhere?

EDIT In regards to the comment by alfC, in the bigger picture the question is what functionality is currently missing from LaTeX (and packages) that exists in other programs. Things like rivers are still difficult from a conceptual standpoint, while resizing a paragraph and preventing page breaks are easy to conceptualize (but difficult to solve).

StrongBad
  • 20,495
  • 3
    Maybe not a proper answer and somewhat pedantic, but TeX is a Turing complete language (http://en.wikipedia.org/wiki/Turing_completeness), so any well defined "computable" typesetting algorithm is in principle doable within TeX/LaTeX. How difficult it is to implement, or whether it is already available in the form of a package, is a different question. Also, it does not apply to heuristic or vague typesetting requirements that may in effect be hidden in professional commercial software packages, although if they were known they could be implemented. – alfC Sep 10 '12 at 08:23
  • @alfC that is exactly what I am looking for: well defined typesetting problems that have not been implemented yet because they are difficult. – StrongBad Sep 10 '12 at 08:42
  • 1
    I see. Just out of curiosity, what difficult problems do you know of that are solved in other word processing programs? Are rivers solved (in Word)? – alfC Sep 10 '12 at 08:47
  • 6
    You might want to watch Frank Mittelbach's talk at TUG2012: he covers things that are hard or not possible at the engine level. – Joseph Wright Sep 10 '12 at 09:17
  • 2
    Somewhat related to http://tex.stackexchange.com/questions/58501/a-critique-of-tex – lhf Sep 10 '12 at 11:25
  • I feel your pain with the line breaks but I’m not convinced that this is actually easier in Word. I remember many a frustrating hour spent in Word trying to convince it to break pages in the way I wanted. – Konrad Rudolph Sep 10 '12 at 14:01
  • 1
    See also http://tex.stackexchange.com/questions/27440/what-cant-tex-do – Lev Bishop Sep 10 '12 at 14:02
  • 4
    @alfC, at the risk of being overly pedantic: As I've pointed out before, TeX being Turing complete doesn't mean that TeX can solve any typesetting task. One major problem is that it has hardly any information about the distribution of "ink" in the bounding boxes of characters. For example in this question of mine I don't see how Turing completeness would help. – Hendrik Vogt Sep 12 '12 at 07:50
  • @HendrikVogt, A great moment in pedantry, I would say: if everything else fails... TeX can be implemented in TeX, and in that implementation you can possibly add any feature you want, including the awareness of the ink distribution. So Turing does help to solve the theoretical question, but I agree it cannot solve a particular practical problem. The point is that the original is a tricky question, it is like asking what are the limitations of a random computer language. And yes, there are limitations to Turing machines, for example TeX cannot solve the so called "halting problem". – alfC Sep 12 '12 at 11:38
  • @alfC: I certainly did not allude to the halting problem. But you're right in a sense, if you have full file access, then in principle it's possible to write an algorithm that determines the ink distribution from the font files. – Hendrik Vogt Sep 12 '12 at 12:23
  • The main problem with (La)TeX is that it's too willing to stretch. See http://www.tug.org/TUGboat/tb28-1/tb88bazargan.pdf for a possible solution. (Their solution didn't work for me but it's a start.) –  Sep 29 '12 at 05:08
  • remarking that TeX is Turing-complete is like replying, to your average person in Chile looking for a way to travel to Canada, that it's possible to go on foot. yes, it's indeed (physically) possible, but that's completely irrelevant to the traveler, and adding irrelevant information is often misleading... – Daniel Diniz Aug 17 '22 at 17:46

5 Answers

53

The biggest limitations I can think of (strictly in comparison with programs like InDesign or Word) are as follows.

Mind you, I have no idea what consequences any changes would have wrt. computational complexity or output quality of the system as a whole. TeX does the things it does extremely fast and in near optimum quality, so it may well be in the "pareto set", so to say.

But anyway...

Paragraph formatting and column/page breaking are strictly detached from each other.

Consequently, TeX's paragraph optimization algorithm doesn't "know" where a certain line will be positioned on the page and can't take this into account.

This has some dire consequences. Off the top of my head:

  1. It's awfully hard to make text "flow around" things on the page, especially when page breaks or flush bottom typesetting are involved.
  2. I can avoid breaking the page at a line containing a hyphen or a widow/orphan, but I can't make a penalty for paragraph breaking to avoid it appearing on this line.
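
For reference, the page-level controls alluded to in item 2 look roughly like this. This is only a minimal sketch: the values are illustrative assumptions, not recommendations. All three penalties are placed between the finished lines of a paragraph and are only evaluated when the page is broken, which is why they cannot influence how the paragraph itself is broken into lines.

```latex
\documentclass{article}
% These penalties are inserted between the lines of an already broken
% paragraph. The page builder sees them; the line breaker does not.
\clubpenalty=10000   % discourage a page break after the first line of a paragraph
\widowpenalty=10000  % discourage a page break before the last line of a paragraph
\brokenpenalty=10000 % discourage a page break after a line ending in a hyphen
\begin{document}
Some long text spanning several pages \ldots
\end{document}
```

With values this high TeX will rather leave a page short (or stretch it, under `\flushbottom`) than re-break the paragraph so that the problem disappears.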

TeX is missing a lot of "meta information" on the page

Things like color, z coordinate, angle, writing direction etc. are somehow 'fiddled' into the boxes making up the page or are just implicit in the order content is output. Most of the info can't be inspected later, and communication between content items ("does this box collide with the other one when it's turned around?") becomes next to impossible.

TeX has a "waterfall" model of page building

Whenever content has left one part of TeX's digestive system for another one, there mostly is no way back. The best we can achieve is to undo everything until a certain stage (for instance, by throwing a box away instead of outputting it) and retrying with different parameters.
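
The "throw the box away and retry" approach can be sketched as follows. This is a hedged illustration, not an established package interface: the names `\trialbox`, `\triallines`, `\trialpar` and `\sampletext`, as well as the threshold of three lines, are made up for this example. The paragraph is set into a box, the number of lines is read from `\prevgraf`, and if the result is too long the box is overwritten by a second attempt with `\looseness=-1`.

```latex
\documentclass{article}
\newsavebox{\trialbox}   % hypothetical name: holds the trial paragraph
\newcount\triallines     % hypothetical name: line count of the last trial
% Typeset #2 into \trialbox with \looseness=#1 and record its line count.
\newcommand{\trialpar}[2]{%
  \setbox\trialbox=\vbox{%
    \looseness=#1\relax
    #2\par
    \global\triallines=\prevgraf
  }%
}
\newcommand{\sampletext}{%
  This paragraph is typeset into a box first so that the result can be
  inspected before anything is committed to the page. If the outcome is
  unsatisfactory, the box is simply overwritten and the text is set again
  with different parameters.}
\begin{document}
\trialpar{0}{\sampletext}% first attempt with default settings
\ifnum\triallines>3
  % discard the first attempt and ask TeX for one line fewer, if feasible
  \trialpar{-1}{\sampletext}%
\fi
\unvbox\trialbox % only now does the material reach the page
\end{document}
```

Note that the retry has to re-run the whole paragraph from its source tokens; nothing from the first attempt can be inspected or adjusted in place, which is the limitation described above.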

If TeX had an object oriented page model where every piece of information could be freely inspected, modified or restructured at any stage, and where "typesetting" and "page building" mainly meant to re-structure objects and enrich them with meta information ("where has this paragraph been broken, how much has glue been stretched") which can later be inspected or modified, things would be much much easier.

  • 6
    It should be possible to implement TeX in TeX. – morbusg Sep 10 '12 at 13:50
  • 1
    @morbusg, reimplementing tex in any language is pretty painful, see for example http://en.wikipedia.org/wiki/New_Typesetting_System for how difficult it is, even in a "nice" language like java. Implementing TeX in TeX, while possible in principle, is.... unlikely. – Lev Bishop Sep 10 '12 at 14:07
  • @morbusg Of course, but wrt. the "meta-TeX" the basic TeX will be just another implementation language; you could as well use Java or Python. If you want to enrich the typesetting capabilities in any way, then the fact that you're using TeX as the basis for re-implementation won't help you much. – Stephan Lehmke Sep 10 '12 at 15:51
  • @morbusg: ConTeXt does this partly by implementing alternative algorithms in Lua. But this is obviously slower. – Martin Schröder Sep 11 '12 at 18:46
  • Regarding paragraph breaking at line n: Have you tried widowpenalties? – Martin Schröder Sep 11 '12 at 18:58
  • @MartinSchröder Was your comment directed at me? If so, I'm not getting the reference. My point was, it's hard to reflow the paragraph to avoid page/column breaking troubles. – Stephan Lehmke Sep 11 '12 at 20:59
  • 3
    Stephan, your object oriented page model might be a start but it wouldn't help much as there aren't proper algorithms that do the logic in more than very simple cases. This is where the research community has failed in the last 20 years. – Frank Mittelbach Sep 12 '12 at 22:02
  • 2
    @FrankMittelbach That's why I released myself of algorithmic questions right in the beginning ;-) But even just applying all the "old" algorithms that we have in a clear and open way to a transparent object-oriented data structure would be an enormous win... – Stephan Lehmke Sep 12 '12 at 22:12
  • @FrankMittelbach Doesn't LuaTeX offer such an object-oriented model, at least to some extent? – Gaussler Jun 07 '15 at 17:10
43

A comparison of TeX's capabilities that can be used reasonably efficiently (ignoring the Turing argument as it doesn't really help much) with high-quality craft typography has been discussed by me in the article E-TeX: Guidelines to future TeX extensions. I recently reevaluated the state of affairs at the Boston TeX conference. The final paper is not yet finished (I hope it will be in the next TUGboat), but a video of the talk is available at the LaTeX project website. Both talks discuss TeX's capabilities and limitations based on the fact that TeX is a program that renders its output as a "composer" using algorithms.

TeX is not a graphical system where the composer is essentially sitting in front of the screen. So a comparison between TeX and, say, MS Word is a bit misleading, as TeX formats essentially do not attempt to cater for this kind of interface (though it would be more or less possible by dropping most of the composing functionality and leaving that to the users). But if you are interested in the typographic limitations then the above article(s) might be a good start. None of these limitations are resolved in other typesetting or desktop publishing systems (with a few exceptions due to internals of the box/glue/penalty model of TeX, e.g., changing parshapes based on position on the page is very difficult in TeX but less so in other systems that either work visually or do not care about the quality of linebreaks and thus can do the par shaping at a different stage).
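
To illustrate the parshape remark, here is a minimal sketch of the `\parshape` primitive (the widths and the sample text are arbitrary choices for the example). The shape is given per line number within the paragraph. It says nothing about where on the page those lines will end up, so a shape that is meant to follow an object at a fixed page position breaks down as soon as the paragraph moves or is split across pages.

```latex
\documentclass{article}
\begin{document}
% \parshape <n> <indent 1> <length 1> ... <indent n> <length n>
% The last pair is reused for all further lines of the paragraph.
\parshape 4
  0pt 0.6\textwidth
  0pt 0.6\textwidth
  0pt 0.6\textwidth
  0pt \textwidth
\noindent The first three lines of this paragraph are set to sixty
percent of the text width and all following lines use the full width.
The shape is attached to line numbers of the paragraph, not to a
position on the page, and it applies to this paragraph only, because
TeX resets the shape once the paragraph has ended.
\end{document}
```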

Update

As of March 2013 the TUGboat paper "E-TeX: Guidelines to future TeX extensions, revisited" is now also available on the project web site.

17

My answer is far from complete, but I thought I'd rather share my thoughts. Off hand I can think of these (later I'll edit my answer, if something new pops up).

  1. Limitations of the size of paper you can use and some memory limitations you may run into when doing computationally heavy TikZ (see for instance this example from TeXamples). Most of the time this can be overcome by optimizing your code, externalization of certain parts or (as a last resort) modifying the memory limits available to your TeX engine. Still, it's highly unlikely to run into such limitations. TeX is really very well engineered.

  2. There is one thing I miss: 'endless' paper length document class. :) (See my earlier question on this.) As you can see, workarounds exist in this case, too. (A limit of 10 m exists though, which should not be a problem for mundane applications.)

  3. I've never seen any sort of code folding in the PDF output file (would most likely require JavaScript).

  4. EDIT: MS Word has some 'advanced' grammar checking, e.g. it marks words that appear twice next to each other, marks sentences with unusual structure, etc. While this is probably something that should rather be implemented in an editor, I don't know of any package that could throw a warning if it finds one of these problems.
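
As an illustration of the externalization workaround mentioned in item 1, here is a minimal sketch using the `external` library of TikZ. The `figures/` prefix is an assumption for this example; the directory has to exist, and the document has to be compiled with shell escape enabled (e.g. `pdflatex -shell-escape`). Each picture is then compiled in a separate run and included as a PDF afterwards, so the main run no longer has to hold it in memory.

```latex
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{external}
% Compile every tikzpicture separately and store the result as
% figures/<jobname>-figure<N>.pdf (the prefix is an assumed choice).
\tikzexternalize[prefix=figures/]
\begin{document}
\begin{tikzpicture}
  \draw (0,0) circle (1cm);
\end{tikzpicture}
\end{document}
```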

Count Zero
  • 17,424
  • 2
    Code folding would be so cool! But I doubt it would be possible even with javascript, as you would need to recalculate pagebreaks, probably floats, etc. Maybe it would be possible to hide complete pages... – Juri Robl Sep 10 '12 at 09:20
  • If I am not mistaken, the preview package allows endless paper length. 3. Not sure if this qualifies as a 'typesetting' limitation (but definitely a limitation currently, that would be cool). – alfC Sep 10 '12 at 09:20
  • "It's highly unlikely to run into such limitations": that is not my experience. If you start using pgfplots, many reasonably-sized graphs go out of memory. – Federico Poloni Sep 10 '12 at 09:44
  • @FedericoPoloni: There are workarounds. The pgfplots manual has a section (6.2 Memory Limitations) dedicated to solving this problem. I have to admit I only used pgfplots for (less than) reasonably sized plots, but as I said, there are workarounds. (I can't vouch for the effectiveness of these methods, as I've never had to try them.) – Count Zero Sep 10 '12 at 10:00
  • Although this is the kind of feature that Daniel was trying to exclude in the question, I think it is possible to implement a package for spell or grammar checking, one that marks errors (in red) in the output (e.g. in draft mode), which could be helpful (it could also be inefficient --spellcheck each time-- and inflexible --how can one add custom dictionary words? well maybe it can be implemented using ispell or other as backend--). [EDIT: this seems to be a partial solution http://tex.stackexchange.com/questions/42843/is-there-a-spell-check-package-for-latex for the spelling part] – alfC Sep 10 '12 at 10:02
  • TeXstudio can also do these checks, I think it uses LanguageTool for it in the background. At least it doesn't like it if I use the same words near to each other, or some "bad words". – Juri Robl Sep 10 '12 at 10:09