
This question now contains my own answers, which I had redacted in accordance with work instructions. I would have redacted them again, but my workplace has decided that they do fall under the public domain, since they are composed of open-source or public material. (I just can't give any more answers from now on.)

I'll leave this question here, but I must let the site administrators know that this question isn't useful. It was just a test (like a few other questions) to assess the barriers-to-entry to LaTeX et al. It would be best to just delete this question.

I posted a fuller answer on typesetting CJK at a very tangentially related question.

To Davislor, who diligently answered nearly every possible facet of my question: Thank you! I am going to keep this question here to showcase your diligence and resourcefulness. Do note in my answer (in another question) how I avoided potentially problematic automatic font lookup by font name; in certain scenarios, that lookup can loop badly enough to break TeX processing. The LaTeX newbies who get turned away are overwhelmingly software engineers, and they are very wary of "automagic" features that "provide no consistently reliable function, and yet no way to fully control".


MWE:

\documentclass{scrbook}

\usepackage[utf8]{inputenc} % For non-English languages

% Install cjk for Chinese, zhmetrics for the font metrics (TFM files).
\usepackage{CJKutf8}

% OT1 for Chinese. T1 for English.
\usepackage[OT1, T1]{fontenc} % T1 will be active encoding.

\begin{document}

This is mainly an English document.

\begin{CJK}{UTF8}{song}
With a smattering of Chinese: {\fontencoding{OT1}\selectfont 中文}
\end{CJK}

\end{document}

The error:

!pdfTeX error: pdflatex (file cyberb65): Font cyberb65 at 657 not found

I would at least like to know which fonts come by default with which packages. For the MWE, the packages I installed are:

  • koma-script
  • inputenc
  • cjk
  • fontenc

For some unknown reason, I already have access to the gbsn font. I am on macOS.

Side question: Is it difficult to transition to an engine that better supports UTF-8 and non-English languages?


Working answer: Don't use song font family. Use gbsn. If you wanna know how/where that font got installed by default, private message Jon who wrote this question.

And thanks to @cfr, I'm not using OT1 font encoding anymore. Not needed with gbsn font family.

Updated MWE:

\documentclass{scrbook}

\usepackage[utf8]{inputenc} % For non-English languages.

% Install babel-vietnamese, vntex for Vietnamese.
\usepackage[vietnamese, english]{babel} % English will be active language.

% Install cjk for Chinese, zhmetrics for the font metrics (TFM files).
\usepackage{CJKutf8}

% T5 for Vietnamese.
\usepackage[T5, T1]{fontenc} % T1 will be active encoding.

\begin{document}

This is mainly an English document.

\begin{CJK}{UTF8}{gbsn}
With a smattering of Chinese: 中文
\end{CJK}

And some Vietnamese: {\fontencoding{T5}\selectfont Tiếng Việt}

\end{document}
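
A minimal sketch of how the \fontencoding switch in the updated MWE could be wrapped in a macro, so that the document body never names an encoding directly. The macro name \inVietnamese is hypothetical (it is not provided by babel or vntex), and T5 must already be loaded through fontenc, as in the MWE:

% Hypothetical helper: typeset the argument in the T5 (Vietnamese) encoding.
% The inner braces keep the encoding change local to the argument.
\newcommand{\inVietnamese}[1]{{\fontencoding{T5}\selectfont #1}}

% Usage in the document body:
And some Vietnamese: \inVietnamese{Tiếng Việt}
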
  • OT1 for Chinese? I don't think so. OT1 is an older Latin encoding, which does not support accented characters in the way T1 does. Not too bad for English, but pretty crap for other European languages and certainly hopeless for Chinese. – cfr Jun 11 '18 at 00:22
  • The error means that you do not have the font installed. However, this may not be true given that you've switched to OT1 which the font probably doesn't support. But the CJK fonts are not installed as part of a standard installation of TeX Live because they take a great deal of space and are not much used, even by people who need to typeset CJK. Is there some reason you want to use pdfTeX? These days, XeTeX would be the obvious solution. – cfr Jun 11 '18 at 00:24
  • Definitely \fontencoding{OT1}\selectfont is wrong. Whatever else is or isn't right or wrong about your code, that part surely needs to go. – cfr Jun 11 '18 at 00:27
  • @cfr Which encoding should I use for Chinese? I'm using pdfTeX because that's what online tutorials seem to start with, but will surely transition to XeTeX now. Is XeTeX better and more modern than pdfTeX these days? How should I select an encoding if not by \fontencoding? –  Jun 11 '18 at 01:02
  • I don't know how the Chinese encodings work. Look at the documentation for the CJK packages you're using. I don't use XeTeX myself, but I certainly would if I needed to typeset Chinese, Japanese, Hebrew, Arabic or any other non-Latin script. It isn't wrong to use \fontencoding. The problem is the encoding you're using. That said, you should rarely, if ever, need to use \fontencoding directly in a document. – cfr Jun 11 '18 at 01:04
  • Take a look at the doc for xecjk. It doesn't mean much to me, but it presumably will to you. – cfr Jun 11 '18 at 01:21
  • @cfr If I want a mainly English document, but want some Chinese/Japanese/etc in there too, should I not use \fontencoding? Or should I use something else? –  Jun 11 '18 at 03:32
  • It isn't \fontencoding that's the problem per se. Rather, OT1 is the problem. That said, if you use XeTeX, you won't have to worry about font encodings at all. – cfr Jun 11 '18 at 04:04
  • @cfr I decided not to give up microtypography in pdfTeX. The \fontencoding can be wrapped in macros. The song font family required OT1 encoding, so I don't need it anymore now that I use the default gbsn. By the way, have a look at the updated MWE that includes Vietnamese. –  Jun 11 '18 at 14:36
  • I didn't say you couldn't use \fontencoding. I said that OT1 was not the right encoding for what you wanted. OT1 is a 7-bit Latin encoding, good for English and used for text in mathematics. Also, you're not using microtypography, as far as I can tell. But perhaps you didn't mean that literally. – cfr Jun 11 '18 at 23:52
  • @JonWong You probably never want to use \fontencoding in the body of your document. For one thing, all your new documents should use Unicode, not obsolete 7- or 8-bit encodings. You’d use a package such as Polyglossia or Babel that provide higher-level commands to change the language, or else write your own that change the font, the script and the text direction as needed. – Davislor Jan 16 '19 at 15:15
  • @Davislor Yeah, I agree. TeX Live recently went all Unicode by default. I'll try to update this question with another question (regarding \fontencoding). –  Jan 17 '19 at 11:30
  • If you want a minimal LaTeX installation, you should use MikTeX instead of TeX Live. It comes with a very basic install of only a few hundred megabytes and installs packages on-the-fly during a LaTeX run if missing dependencies are detected. – Henri Menke Jan 22 '19 at 01:13
  • @HenriMenke Yeah, I guess you're right. But I have this OCD where I try to consolidate all development efforts. I won't use MikTeX if TeX Live (basic scheme) already serves the same purpose (yes, without auto-install of packages on-the-fly). Also, it's good to keep a clean list of dependencies, rather than let MikTeX install dependencies with possibly unintended consequences. Already, at work, we have over 10 different types of TeX installations, and I'm being pressed to shut down the entire TeX initiative altogether. –  Jan 22 '19 at 01:17
  • @JonWong There's a dedicated site for discussions about this TeX site, that is: https://tex.meta.stackexchange.com/ Your opinion is welcome, so please move your section "Difficult Climate for Newbies?" there and remove it from the main TeX site that's for TeX related stuff only. On that meta site you can get feedback too. Thanks! – Stefan Kottwitz Jan 22 '19 at 18:37
  • @StefanKottwitz Responded to you here. –  Jan 22 '19 at 22:18
  • If you’re missing a file you need, such as a font, I’d try A: searching for it on CTAN, and installing the package that contains it. B: On Debian/Ubuntu, which has its own texlive package, run apt-file search to find out which OS packages contain the font. For example, the uming font needed by ctex is in the Ubuntu package fonts-arphic-uming. – Davislor Jan 31 '19 at 02:28
  • @Davislor It's much easier to download the whole CTAN, write simple scripts to scan CTAN files, and surface the required fonts. (We wrote a mirror directly into tlmgr; and also created MikTeX-like functionality in it) We did this CTAN lookup in-house, but it's really very simple scripts. If you're a lecturer in a varsity, I would advise you to just pose a question to your 1st year computer science students. A massive majority of my know-how I need to withhold from this community (work instructions) isn't even at all specialized. –  Jan 31 '19 at 02:34
  • @JonWong That works. Although, if you do that, you already have everything that’s in CTAN, so what package would install it no longer matters to you. – Davislor Jan 31 '19 at 04:04

1 Answer


You ask about transitioning to an engine with better support for more languages. I highly recommend it. At the moment, there are a few workarounds you need to apply, but this works in XeLaTeX or LuaLaTeX:

\documentclass[varwidth=10cm, preview]{standalone}
\usepackage{fontspec}
\usepackage[english]{babel}

% Babel 3.22 erroneously passes the wrong OpenType script and language options
% to fontspec.  This workaround overrides the bug:
\babelprovide[script=CJK, language={Chinese Simplified}]{chinese-simplified}

\defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }

\babelfont{rm}[Scale = 1.0, Ligatures = Common]{Noto Serif}
\babelfont{sf}{Noto Sans}
\babelfont[chinese-simplified]{rm}[Ligatures = Common]{Noto Serif CJK SC}
\babelfont[chinese-simplified]{sf}[Ligatures = Common]{Noto Sans CJK SC}
% Also set the monospace font and load unicode-math if you need math.

\begin{document}

This is mainly an English document.

With a smattering of Chinese: \foreignlanguage{chinese-simplified}{中文}

\end{document}

Noto font sample

I used the Noto fonts and Noto CJK, but any Unicode font with support for Simplified Chinese should work.

You would use \foreignlanguage{chinese-simplified}{...} for short snippets of Chinese within a paragraph, and \begin{otherlanguage}{chinese-simplified} ... \end{otherlanguage} for long passages in Chinese. See the Babel manual for more details.
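
As an illustration, a longer passage might look like the following sketch. It assumes the preamble from the example above (in particular the \babelprovide line that sets up chinese-simplified); the Chinese sentences are placeholder text and are not part of the original example:

\begin{otherlanguage}{chinese-simplified}
这是一段较长的中文。
它使用为简体中文设置的字体。
\end{otherlanguage}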

You can change the fonts. With fontspec, any font installed on your system or available to your word processor will work, and in this template they are automatically scaled so that their capital letters match the height of the main font (that is what Scale = MatchUppercase does).
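
As a sketch of swapping fonts, the two Chinese \babelfont lines in the example could point at the Fandol family instead, loaded by file name rather than by font name. This assumes the fandol package from CTAN is installed and that its files are named FandolSong-Regular.otf, FandolSong-Bold.otf, FandolHei-Regular.otf and FandolHei-Bold.otf; check the file names in your own installation before relying on it:

% Assumption: the fandol package is installed (for example with tlmgr).
% Loading by file name avoids the automatic lookup by font name.
\babelfont[chinese-simplified]{rm}[Extension = .otf,
  UprightFont = *-Regular, BoldFont = *-Bold]{FandolSong}
\babelfont[chinese-simplified]{sf}[Extension = .otf,
  UprightFont = *-Regular, BoldFont = *-Bold]{FandolHei}

The * in UprightFont and BoldFont stands for the base name given in the last argument, so fontspec looks for FandolSong-Regular.otf and so on.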

As of January 2019, this is trickier than it should be: polyglossia does not support Chinese, and babel is broken out of the box for several languages I’ve tried, including Chinese, Japanese and Hebrew. For Chinese, however, there is a simple one-line workaround: the \babelprovide line in the example above.

Installing Missing Components

You requested a follow-up about, I think, locating missing fonts.

On most TeX Live or MiKTeX installations, if you’re missing a file, such as a package or font, I would search CTAN.org for it, find out which package contains it, and install that package. Make sure you select the option to search file names!

Such a search for cyberb65.tfm shows that it is in the zhmetrics package, which is installable in either TeX Live or MiKTeX.

If you are using the Debian/Ubuntu installation of TeX Live, it is possible to create a texmf directory and do a local installation from CTAN, but you should first search to see if there is a deb package for the file you want, either online or with apt-file search. For example, searching for uming.ttc tells you that the uming font that ctex requires is in the package fonts-arphic-uming.

If you’re missing a class, search for foo.cls; a package, foo.sty; or the metrics file of a legacy 8-bit font, foo.tfm. For example, if you were trying to compile a document that used \documentclass[UTF8]{ctexart}, and it were not installed, you would get an error about a missing class ctexart, search for the file ctexart.cls, and find it in the Debian/Ubuntu package texlive-lang-chinese. If you were missing the cjk package, you would search for CJK.sty.
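
A small sketch for checking, from inside LaTeX, whether a class or package file is visible to your installation before you go searching. \IfFileExists is a standard LaTeX kernel command; the file names are ones that appear on this page, and the messages are only illustrative. It searches the TeX input path, so it will not find .tfm font metrics:

\documentclass{article}
% Report in the log and on the terminal whether each file can be found.
\IfFileExists{ctexart.cls}
  {\typeout{ctexart.cls: found}}
  {\typeout{ctexart.cls: missing, search CTAN for this file name}}
\IfFileExists{CJKutf8.sty}
  {\typeout{CJKutf8.sty: found}}
  {\typeout{CJKutf8.sty: missing, search CTAN for this file name}}
\begin{document}
See the log for the results.
\end{document}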

Davislor
  • Is babel the best package to use for multilingual LaTeX? If I were to try to contribute to a package dedicated to multilingual LaTeX, would it be babel? Should I use XeLaTeX or LuaLaTeX? I primarily use pdfTeX, so am worried about https://tex.stackexchange.com/a/128543/152148 –  Jan 21 '19 at 04:46
  • @JonWong Your options are babel and polyglossia. Only babel runs on PDFTeX and there are many languages it supports and polyglossia doesn’t. It’s also part of the basic LaTeX distribution. On the other hand, a few of its language definitions are broken out of the box, including Japanese and Chinese. Those can be patched fairly simply, but Hebrew cannot. – Davislor Jan 21 '19 at 05:12
  • Unfortunately, I still can't run your MWE, possibly because I installed a trim version of TeX Live. Pls see my update in my question regarding "The Proper Standard in MWE" (my proper standard for my infantile grasp of LaTeX, not yours!). –  Jan 22 '19 at 00:57
  • @JonWong, this is an excellent MWE in this answer. It does only use an absolute minimum of packages that would come with even the smallest LaTeX installation. You could slightly simplify it by using \documentclass{article} instead of \documentclass[varwidth=10cm, preview]{standalone}. It might be failing for you because you do not have the Noto family of fonts installed. But you could use any font that you have that supports Chinese. The other possible problem is that you have the Noto fonts installed but in ttc format and you are compiling with lualatex, which has trouble with ttc. – David Purton Jan 22 '19 at 01:22
  • @DavidPurton Thanks! Yes, I only do my MWEs with standalone and varwidth=10cm so that they’ll have the right width when I rasterize them and post them here. People should replace that with the document class of their real paper anyway. – Davislor Jan 22 '19 at 02:44
  • @DavidPurton The Noto CJK fonts are here: https://github.com/googlei18n/noto-cjk The .otf versions ought to work well for you, but .ttc works for me too. – Davislor Jan 22 '19 at 02:46
  • @Davislor, I have trouble with my machine and lualatex. Bizarrely, luaotfload-tool --find="Noto Serif CJK SC" finds the font, but fontspec complains: ! Package fontspec Error: The font "NotoSerifCJKSC" cannot be found. I don't know why this is. :(. Maybe I'll ask a question. Don't like things not working. – David Purton Jan 22 '19 at 02:59
  • @DavidPurton Not sure why that is. You could try giving the file name of the OTF version. And thanks for the reminder: I added links to the fonts I used. – Davislor Jan 22 '19 at 03:01
  • @DavidPurton "But you could use any font that you have that supports Chinese". I installed TeX Live basic scheme, and start from there. It's difficult for me to assume that I could serendipitously somehow have Chinese fonts residing on my system. I'm trying to learn TeX, which is hard enough, and would hope to put off learning which Chinese fonts I can/should grab from the world at large. –  Jan 22 '19 at 05:28
  • @JonWong Okay. If you’re writing documents in Chinese, you probably have at least one Chinese font installed that you can use in your word processor? That should work for you. – Davislor Jan 22 '19 at 05:33
  • @Davislor I have 4 actually, and they all came from cjk package. Good guess, though. Actually, I'm trying to put together a single document that proves that a LaTeX document can house all the scripts (unicode at least) the world uses commonly. That was the thing that is needed to convince my workplace to use LaTeX. –  Jan 22 '19 at 05:35
  • @JonWong Well, I added links to the fonts I used in that document. You should be able to click on the .otf files to install them. Noto does cover almost all the world’s scripts. – Davislor Jan 22 '19 at 05:37
  • @Davislor Oh! Ok. I guess I should get used to OpenType fonts being the norm, rather than hope so fervently (unrealistically) that TeX Live should contain some. It's not like cjk package is created/maintained by the TeX Live team, I think. –  Jan 22 '19 at 05:40
  • My TeX Live installation does come with some of the Noto fonts in its texmf-dist/fonts/opentype/google/noto subdirectory. I’m not sure which CJK fonts are installed on your machine, but you can check. – Davislor Jan 22 '19 at 05:45
  • There’s a CTAN package for it, and also for Arphic, the Fandol family and perhaps some others. You should be able to install them with tlmgr. – Davislor Jan 22 '19 at 05:49
  • @DavidPurton "I have trouble with my machine and lualatex. Bizarrely, luaotfload-tool --find="Noto Serif CJK SC" finds the font, but fontspec complains: ! Package fontspec Error: The font "NotoSerifCJKSC" cannot be found. I don't know why this is. :(. Maybe I'll ask a question. Don't like things not working." Well, we have this answer now. But I'm not allowed to post it here (or anywhere online). :/ I still maintain that open-source is the way to go, despite the substantial noise-to-signal ratio. –  Jan 22 '19 at 06:05
  • @JonWong, xetex and luatex are designed to use system fonts (though they can make use of fonts installed in the texmf tree too). So usually you need to install fonts in the normal way for your operating system. This is often especially true for non-Latin Scripts. The full Noto font download comes in at a hefty 1.1GB! This would substantially increase the size of your TeXLive distribution :). – David Purton Jan 22 '19 at 07:05
  • @DavidPurton Add a gigabyte to the TeX Live distribution? At this point, who’d even notice? – Davislor Jan 22 '19 at 07:12
  • @DavidPurton Can't believe I never heard of Noto (yes, u can, u should :-P). That's 1.1GB from Google's servers, not CTAN (full 5.7GB install takes days). Hard disk space isn't that premium now, but time is (download time). @DavidPurton 1.0GB added to TeX Live wouldn't be noticeable, esp with basic scheme TeX Live install. Love it. –  Jan 22 '19 at 07:23
  • @Davislor I'm about to delete this question (or at least completely redact it). I'm afraid that will mean deleting your answer. Would you rather I keep your answer by redacting my question instead? In any case, your answer isn't complete, and needs to include a baseline sure-fire option with font filenames. –  Jan 31 '19 at 01:30
  • @JonWong My answer was meant to be supplemental, since it’s more, “You mentioned the possibility of switching to XeTeX instead, so that’s what I recommend.” I can expand on it. – Davislor Jan 31 '19 at 02:16
  • @JonWong I don’t think an edit to the question is the right way to carry on a conversation with me or harangue the community into changing its ways. If that’s how you feel, it might be better to delete the question. – Davislor Jan 31 '19 at 02:18
  • @Davislor It wasn't my intention to carry a conversation in the question. It was meant to get as prompt a response from you as possible, so that I can do what I'm instructed at work --- to delete my contributions to this community. –  Jan 31 '19 at 02:22
  • @JonWong I added a section on locating missing fonts and other files. – Davislor Jan 31 '19 at 02:42
  • @JonWong In addition to the other reasons not to edit your question that way, I don’t get notified when you edit one of your questions. I did see when you replied to my answer and pinged me. – Davislor Jan 31 '19 at 02:43
  • @Davislor Yes, that's why I wrote the comment first, then edited the question in case you found it hard to search the comments. –  Jan 31 '19 at 02:54
  • @Davislor There's a very bad (for lack of a better adjective, my fault) mixture of advanced and beginner info in the "missing fonts" section. It would be easier to write at a single level first (eg. intermediate or beginner, depending on audience), and elaborate thereafter. Try to stick to 1 platform initially, easier on you the writer; a lecturer can't teach with more than 1 mouth at a time. :-) (I would discourage apt-get packages, as per my upvote of samcarter's answer.) –  Jan 31 '19 at 02:59
  • @Davislor I've attached a fuller answer. See my additions to the question. Are you from any field anywhere close to computer science? By the way, it's best to load fontspec after babel. –  Feb 21 '19 at 03:24
  • @JonWong Yes, I have a CS degree. Why do you ask? – Davislor Feb 21 '19 at 04:28