8

I would like to raise an issue that matters a lot to me: why do the arXiv admins not consider introducing an option to submit XeLaTeX-generated PDFs? Or, more broadly, documents not generated by pdflatex?

A reasonable argument I can think of is that they might be afraid of problems arising from licensing issues (concerning the fonts used). But that is a problem that could be solved.

I noticed that the support team puts enormous effort into detecting whether a PDF document was typeset with any TeX engine and then blocking it from being uploaded, and likewise the sources if they are not compatible with the (pdf)latex compiler, although people (especially linguists, see below) have been asking for this for years now. So why not allow researchers to upload their PDFs?

There are a vast number of significant advantages to not using the old-fashioned (pdf)LaTeX typesetting engines:

  1. most notable to me: proper Unicode support, i.e. proper UTF-8 support out of the box
  2. OpenType and TrueType font support (Multiple Master Fonts and other modern font technologies like Graphite and AAT) via the fontspec package
  3. ligatures and contextual alternates
  4. in addition to (1): proper support for Japanese, Chinese, Korean etc.
  5. polyglossia lets us effortlessly switch between different languages, even variants of them, within the document.
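
As a minimal sketch of items 1, 2 and 5 (my own illustration, not taken from any arXiv documentation), the following file is meant to be compiled with xelatex or lualatex. The choice of languages and the German sample text are assumptions made for the example; when no font is selected, fontspec falls back to Latin Modern.

```latex
% Compile with xelatex or lualatex (not pdflatex).
\documentclass{article}
\usepackage{fontspec}      % OpenType/TrueType font loading; default is Latin Modern
\usepackage{polyglossia}   % language switching
\setmainlanguage[variant=british]{english}  % a language variant as main language
\setotherlanguage{german}                   % illustrative second language
% \setmainfont{...}        % optionally select any installed OpenType font here

\begin{document}
UTF-8 input is typed directly: naïve, café, Straße.

% Environment form: German hyphenation applies inside.
\begin{german}
Ein kurzer Absatz mit deutscher Silbentrennung.
\end{german}

% Inline form for short passages.
Inline switches work too: \textgerman{Übergröße}.
\end{document}
```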

The first five are mainly typesetting arguments, so they also matter to researchers, namely linguists, who often have to adapt their documents to XeLaTeX for publication but are, on the other hand, compelled to submit their work to this heavily used platform in order to reach a broader audience. I dislike saying it, but I have the slight impression that TeX was historically written from a US-centred perspective: writing some Eastern European or Turkish names correctly can be quite challenging, e.g. try Nikodyḿ, Žižek or Aydoğmuş, Nallıhan.
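
For comparison, here is a sketch of how such names are classically keyed in under pdflatex with the standard LaTeX accent commands; with XeLaTeX or LuaLaTeX the same names are simply typed as UTF-8 text. Loading the T1 font encoding is an assumption made here, chosen because it provides precomposed glyphs for most of these letters.

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage[T1]{fontenc}   % assumption: T1 encoding for precomposed accented glyphs
\begin{document}
% \'  acute, \v  caron, \u  breve, \c  cedilla, \i  dotless i
Nikody\'{m}, \v{Z}i\v{z}ek, Aydo\u{g}mu\c{s}, Nall{\i}han
\end{document}
```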

But also for me, coming from a mathematical background, there are huge advantages, namely:

  6. The unicode-math package:

    • Improved readability options, especially in math mode: using \symnormal, \symliteral, \symup, \symbfup, \symbfit etc., I can easily make a Greek letter bold, upright, sans-serif bold italic or whatever I like. Very helpful to distinguish, e.g., an arbitrary δ from the codifferential. It’s a pain in the ass (please excuse my language here) to achieve the same output with pdflatex. E.g. for the famous indicator function, you can just type “$\symbb 1$”.
  7. Out-of-the-box support for the vast majority of symbols used in present-day mathematics; here is a complete list.

    • For example, \Vbar for independence in probability theory, \smalltriangleright to denote a left action, the musical isomorphisms \flat and \sharp, \subsetcirc for open subset, all dice faces, etc.
  8. More math fonts, and more options for using them, are available: different fonts can be chosen for different symbols, the size of single objects can be set independently etc.
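
A minimal sketch of the unicode-math commands mentioned above, to be compiled with xelatex or lualatex. It relies on the package's default math font (Latin Modern Math); whether rarer symbols such as \Vbar render depends on the glyph coverage of the chosen math font, so they are left out here.

```latex
% Compile with xelatex or lualatex.
\documentclass{article}
\usepackage{unicode-math}  % loads fontspec; default math font is Latin Modern Math
\begin{document}
% Same letter, different alphabets, to separate two meanings of delta.
An arbitrary $\delta$ versus the codifferential $\symbfup{\delta}$,
or an upright $\symup{\delta}$.

% Blackboard-bold digit for the indicator function of a set A.
Indicator function: $\symbb{1}_A$.

% Musical isomorphisms.
$v^\flat$ and $\omega^\sharp$.
\end{document}
```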

With pdflatex, all the features mentioned in 1.-8. can often be reproduced somehow, but only with completely unnecessary kludges. The question for me is: why? It is like using an obsolete version of a programming language, say C++, because you do not "like" lambda expressions.

  9. I will not go too deep into this, because I am not an expert, but the possibilities of LuaLaTeX, which matter to many people, should also be mentioned.

A claim often made is that publishers use their own templates anyway, so people will have to adapt eventually. But (1) why should I not be allowed to format my work as fits my needs and aesthetic sense? Especially considering that not using arXiv is no longer an option. (2) You are still left with some disciplines, such as the linguists or philologists I mentioned, that rely on proper Unicode support.

Edit: Tried to clarify that I am mainly concerned with uploading the PDF files generated by XeLaTeX, not with uploading the sources.

wueb
  • 277
  • Isn't this completely off-topic? – Johannes_B May 04 '20 at 14:12
  • 5
    I'm not sure we can answer this, as it's not really a question. I would, though, point to the 'OpenType fonts' as a reason they might not want XeTeX files. TeX fonts are restrictive but reliable: they are the same on every system if the font versions match. Try getting the same stability from operating system fonts ... – Joseph Wright May 04 '20 at 14:14
  • @JosephWright Thank you very much for your answer. The problem is that people may plead for an exception, but, considering the power the platform now has, it is not transparent how the decision is made. They do not offer a forum or mailing list, so I tried to reach a bigger audience here, I must admit. – wueb May 04 '20 at 14:56
  • 1
    @DavidCarlisle Thank you for your comment. I realise it sounds harsher than intended. But, as already mentioned, people have been wondering about and asking for this feature for years now. You can find many threads concerning workarounds on this platform, too. So it is not as if this problem were "hot and new" or negligible... Further, it is not just my opinion; it is a feature researchers (namely linguists) rely on, as you would have noticed had you read my explanation in full. – wueb May 04 '20 at 14:58
  • 2
    While this is a valid question, it is not on-topic for this site. As @JosephWright says, the most likely reason is potential problems with fonts -- not only system fonts, but also proprietary/commercial fonts, which are sometimes required for lack of availability of desired symbols in free fonts. – barbara beeton May 04 '20 at 15:08
  • @wueb None of these things matter for scientific papers. -1 – Henri Menke May 05 '20 at 08:32
  • 2
    @HenriMenke (1) Items 6.-8. do matter. (2) Linguists don't do science? Good to know. – wueb May 05 '20 at 09:31
  • 2
    Shouldn't that be "significant advantages"? And unrelated: babel works with lualatex and xelatex too. polyglossia is not required with these engines. – Ulrike Fischer May 05 '20 at 09:51
  • Thank you very much, it really was not my intention! – wueb May 05 '20 at 09:56

1 Answer

22

I had the same questions a few years ago. However, instead of writing a ranty post and calling the arXiv maintainers names on some site that they will never read, I decided to reach out and offer my help. Here I quote part of their reply:

While there has been a significant number of requests for LuaTeX (and XeTeX), arXiv's infrastructure requires significant testing and development work as part of any new feature/tex processing system.

While it's likely that at some point we'll offer this as part of our compilation service, the timeline is unclear. I have forwarded your request to the development managers for eventual scheduling and deployment. If they are interested in deeper collaboration, they may reach out individually. Due to limited developer time, we cannot provide any timeline nor status updates for enhancements.

So calling them “relentlessly ignorant” in this regard is not just rude, it is also utterly wrong.

Furthermore, the next time you decide to write a rant you should properly check your facts, because all the arguments you brought forward are not valid arguments for the introduction of Lua/XeTeX.

Nobody cares about your lovely ligatures, special symbols, fancy fonts, or magical markup. And while it's true that arXiv accepts submissions in languages other than English, nobody realistically does that.

The only argument for the introduction of LuaTeX (to a lesser extent XeTeX) is accessibility. Proper tagging of document elements is only really possible by inspecting the node list in Lua. There is also one advantage to OpenType fonts, which is that the text in the PDF will be properly mapped to Unicode, which is also of paramount importance for accessibility.

Henri Menke
  • 109,596
  • 1
    Thank you for the feedback on my question. (1) If you had followed the discussion above, you would have read that I already apologised that it sounded way harsher than I intended. It mainly refers to the third paragraph, i.e. they put enormous effort into forbidding people from uploading PDF documents generated by any TeX compiler (you can follow the numerous threads here on stackexchange) instead of letting people just upload the compiled PDF if they need to use XeLaTeX. – wueb May 05 '20 at 09:37
  • 1
    (2) Please check items 6.-8.; these are more than "valid arguments". The only argument, as you call it, is a repetition of what I already wrote in items 1. and 2. (3) Please don’t confuse yourself with "nobody", i.e. everybody else. (4) Read the discussion in the link after item 5. in the list. Linguists don’t do research? – wueb May 05 '20 at 09:38
  • @wueb UTF-8 input is completely irrelevant for Unicode mapping in the output (see glyphtounicode.tex), although it makes it easier for the implementer of a TeX engine. As far as I can see linguistics has no dedicated category on arXiv. – Henri Menke May 05 '20 at 10:09
  • I made a minor edit above to emphasise that I meant being able to upload the generated PDF, not being able to compile the sources via arXiv. – wueb May 05 '20 at 10:18
  • @wueb This is a solved problem. See https://www.monperrus.net/martin/how-to-use-lualatex-arxiv and https://wiki.contextgarden.net/Posting_on_arxiv.org – Henri Menke May 05 '20 at 10:35
  • 5
    While your points regarding the ranty style of the non-question are well-placed, the fact that submissions are mostly done in English does not preclude the possibility that the authors of said submissions would appreciate having their names rendered correctly and that that may include all kinds of diacritics. – 0xC0000022L May 05 '20 at 10:38
  • 4
    @HenriMenke Tricking arXiv into accepting the PDF without sources may be a solution, but it's not a good solution. – Faheem Mitha May 05 '20 at 14:04
  • @0xC0000022L You can always compile the author names to small PDF files and include them in the submission. While this is not a very convenient solution, it is a workaround. For instance, with Feynman diagrams one often has to do the same. Personally I think this is one of the cases where some entity, the arXiv in this case, provides us with something great, and people, instead of thanking them for what they got, focus only on a little additional feature that they didn't get. –  May 05 '20 at 15:24
  • 2
    @Schrödinger'scat You're right. And yes, this would be a viable workaround, I suppose. I did not and will not join in the rant. I was just trying to point out a weakness in the points Henri made that seemed obvious to me as a passerby. – 0xC0000022L May 05 '20 at 21:14
  • It's been another four years. – Michael Chao Jan 01 '24 at 05:30