31

For a considerable time, I have been asking myself how the TeX language is classified. I have searched the Internet for useful information a couple of times without success (not even a hint). In this list (here), nothing is mentioned about TeX. I don't know of a language similar to TeX with which to attempt a comparison. All I know is that it is Turing complete, and nothing more.

How would the TeX programming language be classified (or how do you classify it)? What is its programming paradigm?

Thanks in advance.

  • 45
    TeX is using the famous, celebrated, and well-established “hack your way out of everything, possibly by exploiting innocent commands like \lowercase, until for some unknown reason it kinda compiles, most of the time” paradigm. This is another reason to support LuaTeX, which finally gives TeX a proper, predictable, consistent programming language. – Gaussler Jun 26 '21 at 15:50
  • 6
    see also the wikipedia page on m4 which has some general description of macro processing languages https://en.wikipedia.org/wiki/M4_(computer_language) – David Carlisle Jun 26 '21 at 15:51
  • 3
    @DavidCarlisle Please tell me if the following is a correct understanding: The difference between a function and a macro is that if \foo and \baz were functions, then \foo{\baz{Hello World}} would evaluate \baz{Hello World} first, then plug the result into \foo. Then \foo would only ever see the output of \baz{Hello world}, not the actual code itself. But since they are macros, the evaluation is kinda happening in the opposite order, i.e. from the outside and in. Is that right? – Gaussler Jun 26 '21 at 15:59
  • 3
    Not an answer, but a wealth of possibly relevant historical information: What research papers exist about TeX and friends? – barbara beeton Jun 26 '21 at 16:14
  • 8
    @Gaussler possibly, although that's not really a useful way to understand the paradigm; the point is that there is no compiler and there are no "return values": a macro processor works essentially by textual replacement, just replacing tokens inline with their replacement text. There are other common macro processors besides cpp and m4, e.g. the entity expansion of SGML/HTML/XML, which replaces &nbsp; by the character U+00A0, for example. – David Carlisle Jun 26 '21 at 16:21
  • Is TeX even a language, in the sense of languages like C or Fortran (to use contemporary examples)? At least at the LaTeX level, it's always seemed more like a set of markup commands. – jamesqf Jun 26 '21 at 23:17
  • 1
    @Daniel can I ask why exactly you are interested in this? Is there a practical aspect, like conversion between languages or mixing languages together? Or are you just curious? And if so, what is the reason for your curiosity? – Marijn Jun 27 '21 at 08:49
  • 2
    @Gaussler I don't think the distinction between function and macro can be pinned down to a clear criterion on the evaluation semantics. In Lisp, macros are just functions operating on source-code AST; in assembly, macros are a built-in compile-time substitution whereas functions are basically just a design pattern; in Haskell, all functions are by default evaluated outside-in (with lazy evaluation)... – leftaroundabout Jun 27 '21 at 12:24
  • 3
    @jamesqf there is a markup language but the system that interprets that markup and produces typeset output is written in several tens of thousands of lines of tex macro code. It certainly feels like a programming language when writing that stuff. – David Carlisle Jun 27 '21 at 16:26
  • 1
    @Gaussler I don't have any beef with LuaTeX as a piece of software, except that including the syllable "TeX" in the name is at best dishonest marketing. – alephzero Jun 28 '21 at 03:59
  • 2
    @alephzero Could you elaborate on that last point? To me, it feels very TeX-like, with some internal changes and a consistent programming language on top of it. – Gaussler Jun 28 '21 at 14:45
  • See also https://tex.stackexchange.com/questions/58501/a-critique-of-tex – lhf Jun 28 '21 at 14:47
  • for a while I thought TeX was some sort of weird dialect of lisp; now I see that the similarities go as far as to admit that they may be described as variant macro processors. – jarnosc Jun 29 '21 at 05:39

6 Answers

41

It's a macro expansion language (the macro part of it, not the typesetter), comparable to other macro languages like the C pre-processor macros.

C preprocessor

#ifndef ZZZ
#define ABC 3
#else
#define ABC 4
#endif

TeX

\ifx\zzz\undefined
\def\abc{3}
\else
\def\abc{4}
\fi
David Carlisle
  • 3
    There’s \advance though, which IMHO leaves the boundaries of “macro expansion language” quite clearly. It’s still a preprocessor-like language, but more than mere macro expansion. – mirabilos Jun 27 '21 at 17:30
  • 4
    @mirabilos yes the macros have to have something to expand to, there are typesetting primitives \hbox and a few arithmetic primitives and text but the control constructs and programming logic are all on the macro side so I think the paradigm for tex as a language is clearly macro processing. – David Carlisle Jun 27 '21 at 17:36
  • 1
    Agreed, “macro processing” it is then, but more than just expansion. – mirabilos Jun 27 '21 at 17:38
  • would the "typesetter" part be a page description language (DVI, PDF or thereabouts)? – jarnosc Jun 28 '21 at 04:07
  • as for a comparable (tiny) macro processor with the ability to process different types of syntax, for which TeX is [in]famous, you may have a look at gpp – jarnosc Jun 28 '21 at 04:11
  • 2
    The example you've chosen is not representative and implies a greater similarity than warranted. Yes, both are macro expansion languages but they operate quite differently. – Konrad Rudolph Jun 28 '21 at 10:00
  • 4
    @KonradRudolph well, true, but I was taking a rather high-level view, aiming basically at "TeX macro processing is more like the C pre-processor than it is like C", not looking to distinguish different macro processors or different compiled languages. – David Carlisle Jun 28 '21 at 13:25
25

TeX is a macro expansion language. Macros are replaced at point of use by their definition. Ultimately, this process will result in literal text (typeset), an expandable primitive (expands to its outcome), or a non-expandable primitive (is executed).
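
As a rough illustration of those three outcomes, here is a minimal plain TeX sketch. The macro names \greeting and \name are made up for this example and are not part of the answer above.

```tex
% Plain TeX sketch; \greeting and \name are made-up names.
\def\name{world}
\def\greeting{Hello, \name!}

% Macro: \greeting is replaced by "Hello, \name!", then \name by "world".
% What remains is literal text, which is typeset.
\greeting\par

% Expandable primitive: \romannumeral expands to the characters "iv".
\romannumeral 4\par

% Non-expandable primitive: \count is executed as an assignment.
\count255=7

\bye
```

Nothing here has a "return value": each step just leaves tokens in the input stream for TeX to read next.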

Joseph Wright
22

TeX has two programming systems, the "mouth" (which does macro expansion essentially) and the "stomach" (which typesets and does assignments). They run only loosely synchronised and on-demand.

For programming purposes, they are a pairing of a blind and a lame system since the "stomach" is not able to make decisions based on the value of variables (conditionals only exist in the "mouth") and the "mouth" is not able to affect the value of variables and other state.
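
One way to see this split, sketched in plain TeX, is to put an assignment inside \edef, which only performs expansion. The scratch macro \x is an assumed name for this illustration.

```tex
% Plain TeX sketch; \x is a made-up scratch macro.
\count255=1

% \edef expands its body in the mouth only.
% \advance is not expandable, so it is stored unexecuted;
% \the\count255 is expandable and yields the current value, 1.
\edef\x{\advance\count255 by 1 (\the\count255)}

% Show on the terminal what was actually stored in \x.
\message{\meaning\x}

\bye
```

The stored text contains "(1)", not "(2)": the increment has not happened yet when \the\count255 is expanded, and it will only happen if \x is later executed by the stomach.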

While eTeX has added some arithmetic facilities that can be operated in the mouth, as originally designed the mouth does not do arithmetic. There is a fishy hack for doing a run-time specified number of iterations in the mouth; it relies on the semantics of \romannumeral, which converts, say, 11000 into mmmmmmmmmmm.
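
A minimal sketch of how such a hack might look in plain TeX follows. The names \replicate, \rloop, \rnext and \rdone are hypothetical, and this is only one possible way to write it, assuming a non-negative literal count.

```tex
% Plain TeX sketch; \replicate, \rloop, \rnext and \rdone are made-up names.
% \replicate{<n>}{<tokens>} expands to <n> copies of <tokens>.
\def\replicate#1#2{\expandafter\rloop\romannumeral #1000 \relax{#2}}
% \rloop looks at one token: an "m" means one more copy, \relax means stop.
\def\rloop#1{\ifx\relax#1\expandafter\rdone\else\expandafter\rnext\fi}
% \rnext emits one copy and continues with the remaining m's.
\def\rnext#1\relax#2{#2\rloop#1\relax{#2}}
% \rdone discards the leftover copy of <tokens>.
\def\rdone#1{}

% Everything above is expandable, so it works inside \edef.
\edef\stars{\replicate{5}{*}}
\message{\meaning\stars}

\bye
```

Appending 000 to the count turns n into n thousand, so \romannumeral yields exactly n letters m, and the loop consumes one m per iteration. After the \edef, \stars holds five asterisks.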

Because of the synchronisation issues of mouth and stomach, there is considerable incentive to get some tasks done mouth-only. Due to the mouth being lame and suffering from dyscalculia, this is somewhat akin to programming a Turing machine in lambda calculus.

TLDR: the programming paradigm of the TeX language is awful.

  • 11
    Finally someone said it. – Gaussler Jun 28 '21 at 09:58
  • 1
    @Gaussler I've often said that programming TeX is akin to playing chess in that the most direct route rarely works the way you want it to. I've written some impressive TeX code in my day, but I think even DEK, if he were to do it over again, wouldn't create the TeX macro language the way he did. MF is in many ways TeX macro language 2.0 in that it eliminates many of the pain points of TeX programming. – Don Hosek Jun 28 '21 at 14:35
  • @DonHosek Sorry for asking stupidly, but, what do you mean by MF? – Gaussler Jun 28 '21 at 14:37
  • 2
    @Gaussler Sorry, Metafont. The macro language here has a lot of similarities to TeX, but has the advantage built in of not having to deal with intermingled text output. – Don Hosek Jun 28 '21 at 14:39
  • @DonHosek Makes sense. I guess that, in a way, a reformulation of the problem is that TeX tries to be HTML+CSS+JavaScript (the “stomach”) and PHP (the “mouth”) at the same time, and that is doomed to fail. It would be better if TeX clearly separated these layers. – Gaussler Jun 28 '21 at 14:43
  • 1
    Everything worth using is some kind of awful. – hobbs Jun 29 '21 at 17:39
9

In typical programming languages (like functional C) the source of the program is a set of commands but there is nothing between such commands. If you need to print something, you have to use a command with parameter text, print() for example.

The TeX source is primarily text to be printed, with control sequences mixed into it. This is a very different concept from typical programming languages. On the input side, you have control sequences (typically macros) and the text between them; these sequences are processed, together with that text, into another, internal mix of text and primitive control sequences. This internal text is then printed under the control of those primitive control sequences.
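
A small plain TeX sketch of what this mixing means in practice: layout whitespace inside a definition is itself text. The macro names \bad and \good are made up for this illustration.

```tex
% Plain TeX sketch; \bad and \good are made-up names.
% Each line end inside the braces of \bad becomes a space token
% in the replacement text, so \bad{a} expands to " [a] ".
\def\bad#1{
  [#1]
}

% Ending the lines with % hides the line ends, so \good{a} expands to "[a]".
\def\good#1{%
  [#1]%
}

x\bad{a}y   % typeset with spaces around "[a]"

x\good{a}y  % typeset without them

\bye
```

In a language like C the two definitions would differ only in formatting; here they produce different output.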

wipet
  • 4
    This seems like a semantic quibble. There is no logical difference between "printf("%s", <10,000 characters of text>);" and "\begin{document} <10,000 characters of text> \end{document}". The only difference is (trivially different) syntax. – alephzero Jun 28 '21 at 04:04
  • 2
    @alephzero Agree. If the entire C program was inside a printf, that should make it more TeX-like, wouldn't it? TeX has an input stream, which is a mixture of printable material and instructions that resolve into, or affect, printable material. I don't know enough C: can %s take (escaped?) commands that affect the running of the program? – Cicada Jun 28 '21 at 06:17
  • 6
    "The TeX source is primarily text, which must be printed, and there are control sequences mixed with this text. This is very different concept from typical programming languages." cough PHP cough... – Heinzi Jun 28 '21 at 13:23
  • 2
    @Heinzi PHP has explicit demarcation of the boundaries between code and text though. In fact, one might consider ?><?php to be shortcut for print('} – Don Hosek Jun 28 '21 at 14:36
  • 3
    @alephzero You can't implement your whole program in the argument of "printf()" in C; you do it outside this "printf()". The TeX input is a mixture of text and control sequences (e.g. macros) that implement the program. For example, macros can manipulate the surrounding text and generate other text. In C, you can insert spaces and newlines arbitrarily between commands, but this is not possible in TeX, because these spaces "between commands" can be printed. You must always take into account that the input-output stream is part of your programming code. – wipet Jun 28 '21 at 17:54
  • 1
    @Heinzi PHP chunks have boundaries, and they do not interact with the other text, as Don Hosek mentioned. Moreover, each PHP chunk contains a sequence of commands similar to C, and if you need to send something to the output, you must use echo, which is an alternative to printf in C. I don't see any similarity between PHP and TeX. – wipet Jun 29 '21 at 06:00
5

Paradigm is "substitute" - either substitute nothing (let the raw text go through), or substitute something (affect the output in some way).

Combining all the answers and comments:

(a) TeX is a markup, and so therefore a subset of SGML.

(b) It is an implicit loop ("read until the end of the file") and an implicit command ("print"); everything else happens inside that.

(c) Its print-control ability is very similar to DCF, and its job-control ability is similar to JCL.

(d) It can be mapped to a combination of html, css, javascript, a file i/o method, and command line/batch.

(e) Its macro-expansion ability is akin to the SAS macro processor: SAS macros produce and control SAS code; TeX macros produce and control print output; both mix macros and non-macro material; and both expand their macros until either primitives or code/text is reached.

(f) In a spreadsheet, a formula such as '=if(a1="",substitute(a2,"#1",b2),"")' ("if cell a1 is empty, replace the characters '#1' in the content of cell a2 with the content of cell b2; otherwise do nothing") is the TeX technique.

(g) It is self-defining and extensible and can interface with other input/output, which implies there is no linguistic 'border'.

Packages and commands not only can add extra functionality, they can do so by re-defining the meaning of existing packages and commands, including themselves.

(h) In turn, this implies that TeX is more a 'structure' or a system than a syntactic 'language', and indeed an instruction "\X" (or "qΨ") in one document may or may not be compatible with instructions (or their grammar) in another document. Moreover, grammaticality and syntactic correctness can be re-defined (or even un-defined).

(i) This define-ability implies that each document is in effect its own 'language'.

Perhaps what is happening is that the presence of raw text is easily classified as 'not part of the language' and focus is given to explicit commands because they are visibly marked with an escape character. But the implicit commands (loop, print), which are analogous to -0 case endings in declensions in linguistics, are also part of the picture. Nothing will 'happen' if there is nothing to print.

The explicit commands have only one purpose: to modify how and when the implicit commands do their job, or what they do it on. That is another way of saying "markup".

(j) Therefore the original description, "TeX is a document preparation system", is still most apt.

(k) TeX is a superset of language.

Cicada
  • 2
    Aside from the fact that SGML came years after TeX, I'm not entirely sure that one can express TeX as SGML (yes I know angle-bracket notation is configurable). AFAIK no one has ever created a DTD that would allow an SGML parser to parse a TeX document nor has anyone claimed that such a thing would be possible. – Don Hosek Jun 28 '21 at 14:32
  • 1
    @DonHosek Yes, actual ML/DTDs took decades to crystalize formally; I was thinking of the concept of markup structure in DCF on mainframe and the & macro names for DDs in the JCL jobdeck. The idea of some sort of meta command to switch to bold, like \b, took years to turn into the realization that there's a common structure to language-description languages. It's been interesting watching that journey. I've always ended up needing to refer to the names of names of names of things (e.g., flexible programs that can "unfold" gracefully and deal with new cases on the fly). TeX has that. – Cicada Jun 28 '21 at 15:06
4

TeX is not a programming language.

First and foremost, TeX is a typesetting system. It contains various components, i.e. the bits which know how to put letters on a page of paper; the parts which know about kerning; higher level stuff like being able to express paragraphs, pages, structures like tables of contents and all of that.

TeX has a part which you may, if you were so inclined, call the "TeX macro language" for a very loose interpretation of the term "language". It certainly is not a general purpose programming language. While there may be hacks to make it such - I assume TeX is Turing complete or can be made so with very small extensions - I would not classify it in any of the broad categories of programming languages or paradigms.

In modern terms, you might call it

  • macro language (because you can define substitution macros inside it)
  • markup language (because, like, say Markdown, you can express structure on top of the text)
  • domain specific language (DSL), although this term is usually used if you are in the context of a general purpose language, and then there is some domain specific syntax created on top of that to make it look like something completely different

In contrast, a programming paradigm is something like "object oriented", "imperative", "functional", "logical", "declarative" and so on which specifically describes the inner workings and thought processes behind a general programming language, which TeX certainly is not.

AnoE
  • 2
    What you express doesn't make sense. First, the question clearly refers to the TeX language (not to the TeX typesetting system). Second, and briefly, from The TeXbook: "I wish to thank the hundreds of people who have helped me to formulate this “definitive edition” of the TEX language, based on their experiences with preliminary versions of the system". Third, you really can program in TeX. Even if it is not general purpose (or is a DSL), it is Turing complete: this means you can process input data and produce any output that a compiled C program can (e.g., you can create a virus). – Daniel Bandeira Jun 30 '21 at 03:45