U+2234 not set up for use with LaTeX

Question

I am using Doxygen to document my C code project. My C source files are saved with UTF-8 encoding. Within some of the files I have mathematical symbols, such as this line:

∴ ∀ FOO ∈ ℕ ≤ BAR

The symbols are copy-pasted from the fileformat.info website, so they are definitely the correct UTF-8 characters. My Doxygen build uses a config file (encoded in UTF-8) that tells it to produce UTF-8 encoded LaTeX output. It also instructs it to add amsmath and amssymb.

The Doxygen build runs without errors or warnings.

Yet when I attempt to build the LaTeX output, it fails with:

("C:\Program Files\MiKTeX 2.9\tex\latex\amsfonts\umsb.fd") [1{C:/Users/Toby/App Data/Local/MiKTeX/2.9/pdftex/config/pdftex.map}] [2] [1] [2] Chapter 1. (group__pmb.tex

! Package inputenc Error: Unicode char Ôê┤ (U+2234) (inputenc) not set up for use with LaTeX.

See the inputenc package documentation for explanation. Type H for immediate help. ...

l.12 ...+E+L+O+W+E+R+B+I+TS))\mbox{]} Ôê┤ ÔêÇ A+D+D+R+_++N+...

?

It seems to error on the first symbol (∴) that it encounters.

I'm not a TeX person, I just want to document my C program well (which worked on my last PC, of course running older versions of all software involved). What more can I do to get it to understand the symbol characters?

I am using the latest version of MiKTeX (64-bit) and Ghostscript (32-bit).

I hope those unicode characters are not outside of comments or removed preprocessing tokens because the behaviour will be undefined and your compiler may order you pizza, or possibly format your hard drive. — cat, Dec 05 '16 at 18:05
Doxygen produces latex output and build files, I've tried adjusting to XeLaTeX but it doesn't really work. — Toby, Dec 06 '16 at 09:28

egreg · Accepted Answer · 2016-12-05T16:16:11.833

You should add

\DeclareUnicodeCharacter{2234}{\therefore}

to your document preamble. How to do it for Doxygen I don't know. You also need \usepackage{amssymb}.
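
As a minimal sketch of the manual approach, the document below declares every symbol from the line in the question. The code points and command names are the standard ones, but the choice of \mathbb{N} for ℕ and \leq for ≤ is an assumption about the intended meaning, and the symbols are placed in math mode by hand, which Doxygen's output may or may not do.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb}

% Map each Unicode code point (hexadecimal) to a math command.
\DeclareUnicodeCharacter{2234}{\therefore}   % U+2234, needs amssymb
\DeclareUnicodeCharacter{2200}{\forall}      % U+2200
\DeclareUnicodeCharacter{2208}{\in}          % U+2208
\DeclareUnicodeCharacter{2115}{\mathbb{N}}   % U+2115, needs amssymb
\DeclareUnicodeCharacter{2264}{\leq}         % U+2264

\begin{document}

$∴ ∀ FOO ∈ ℕ ≤ BAR$

\end{document}
```

For Doxygen, one untested possibility is to put the \DeclareUnicodeCharacter lines into a small style file (say mysymbols.sty, a hypothetical name) and list it under EXTRA_PACKAGES in the Doxyfile, so that the generated preamble loads it. The declarations have to come after inputenc is loaded.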

You can somewhat automate the correspondence between Unicode point and command name with something like

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb}

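\makeatletter
% unicode-math-table calls \UnicodeMathSymbol with four arguments:
% code point (with a leading "), command, math class, description
\newcommand\UnicodeMathSymbol[4]{%
  % only act on code points above U+00FF
  \ifnum#1>"FF
    % \@gobble strips the leading " so only the hex digits are passed on
    \expandafter\DeclareUnicodeCharacter\expandafter{\@gobble#1}{#2}%
  \fi
}
\makeatother
\input{unicode-math-table}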

\begin{document}

$∴$

\end{document}

based on the assumption that the command name offered by unicode-math-table is the same as the amssymb name.
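
To make the mechanism and its limits concrete, here is a rough sketch of what one table entry looks like and what the redefined \UnicodeMathSymbol turns it into. The entry is quoted from memory, so the class and description arguments may differ in your copy of the file. The last line shows one way to patch a case where the assumption fails: as far as I know the table names ℕ \BbbN, which amssymb does not define.

```latex
% Shape of one table entry (code point, command, math class, description):
%   \UnicodeMathSymbol{"02234}{\therefore}{\mathord}{therefore}
% After \@gobble removes the leading ", the entry effectively becomes:
%   \DeclareUnicodeCharacter{02234}{\therefore}

% Preamble fragment: supply a name that the table uses but amssymb lacks.
% Assumption: the table uses \BbbN for U+2115; check unicode-math-table.tex.
\providecommand\BbbN{\mathbb{N}}
```

Any other symbol whose table name has no amssymb counterpart would raise an undefined control sequence error when first used, and would need a similar alias.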

edited Dec 05 '16 at 16:16

answered Dec 05 '16 at 15:37

Maybe also \ensuremath? Not sure if Doxygen would be smart enough to put the symbols in math mode. – Willie Wong Dec 05 '16 at 15:40
Ah, I meant I have amssymb, not asmsymb, d'oh. Will I need to do this for every UTF-8 symbol that I use? Surely if LaTeX supports UTF-8 symbols then it should recognise them without this for each symbol? – Toby Dec 05 '16 at 15:41
@WillieWong The error message seems to be in a math formula; I wouldn't use \ensuremath anyway, because the symbol is math and should stay in math. – egreg Dec 05 '16 at 15:42
@Toby No, the utf8 support means that it will decode the UTF-8 encoding and know that you want U+2234; it does not define all the thousands of characters that could be accessed by number and allocate a font for each one. An alternative would be to use xelatex and unicode-math, which uses OpenType math fonts that do have a large range of math characters in a single font. – David Carlisle Dec 05 '16 at 15:47
@DavidCarlisle OK, where can I find the names to use for each character (e.g. the \therefore)? Or do they match the UTF-8 names? Bonus question: How does one know which package to use? I selected amsmath based on its name but haven't a notion what it really provides, or any of the other packages TBH. Is there a canonical listing with explanation somewhere? – Toby Dec 05 '16 at 15:53
You just have to know. For math characters the ams* packages almost certainly cover what you need, but if it was, say, U+A880 I'd have no idea what font would support that, other than googling for information. Perhaps you would do better with xelatex (or lualatex) and just use the unicode-math package, but that's really not the subject of this question/answer. – David Carlisle Dec 05 '16 at 15:57
@Toby: If you don't know the LaTeX commands, detexify can be handy. Alternatively, egreg's answer here can also be useful for looking things up if you know the unicode. – Willie Wong Dec 05 '16 at 16:07
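
The xelatex/lualatex route mentioned in the comments can be sketched as follows. This is a minimal standalone document, assuming the default math font that unicode-math selects covers all five symbols; whether Doxygen's generated files build cleanly with these engines is a separate matter and is not shown here.

```latex
% Compile with xelatex or lualatex, not pdflatex.
\documentclass{article}
\usepackage{amsmath}
% unicode-math sets up an OpenType math font, so Unicode math
% characters can be typed directly without per-symbol declarations.
\usepackage{unicode-math}

\begin{document}

$∴ ∀ FOO ∈ ℕ ≤ BAR$

\end{document}
```

No \DeclareUnicodeCharacter lines and no inputenc are needed in this setup, since these engines read UTF-8 natively.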
