
I keep thinking about using Cyrillic characters in command names with the pdfTeX engine. As far as I can tell, if I stay with an 8-bit input encoding and assign category code 11 to all Cyrillic letters except ё/Ё, this works, except that bookmarks created with hyperref are misencoded.

A workaround for the hyperref bookmarks would be to temporarily set the category codes back to 13 around sectioning commands, and either to avoid commands with Cyrillic names inside sectioning commands or to use \texorpdfstring, like so:

\documentclass{article}

\usepackage[T2A]{fontenc}
\usepackage[cp1251]{inputenc}
\usepackage[russian]{babel}

\usepackage[unicode]{hyperref}

\catcode`л=11\relax
\catcode`п=11\relax
\catcode`к=11\relax

\newcommand{\лк}{«}
\newcommand{\пк}{»}

\begin{document}

{\catcode`л=13\relax
\section{Раздел «Первый»}}

{\catcode`л=13\relax
\gdef\pdfbookmarkname{Раздел «Второй»}}
\section{\texorpdfstring{Раздел \лк Второй\пк}{\pdfbookmarkname}}

\end{document}

This document, when compiled, has properly encoded bookmarks.

What would be an efficient way to set the category code for the whole range of Cyrillic letters, except ё/Ё?

Alexey
  • Do not set category codes that way; you will break LaTeX's encoding support completely. Use UTF-8 input encoding, then you can input Latin and Cyrillic without changing encoding. – David Carlisle Jul 07 '18 at 11:10
  • @DavidCarlisle, I cannot use Cyrillic letters in command names with UTF-8 encoding and the pdfTeX engine. – Alexey Jul 07 '18 at 11:13
  • @DavidCarlisle, what will this break in addition to hyperref? – Alexey Jul 07 '18 at 11:14
  • That is the price you pay for an 8-bit system. If you need more, why not use LuaTeX or XeTeX, which are set up for Unicode characters? Making things catcode 11 only works for some undocumented subset of T2A that happens to be the same as cp1251. It will work in some simple cases, but it really breaks core LaTeX mechanisms. – David Carlisle Jul 07 '18 at 11:16
  • @DavidCarlisle, my question is specifically about the pdfTeX engine; XeTeX is unavailable. I am trying to find the least intrusive solution to fix PDF bookmarks in a third-party document encoded in Windows-1251 and using the russlh package. – Alexey Jul 07 '18 at 11:20
  • If you really need to do this, I guess you know what to do. There isn't really any special efficient thing to do: just make the 128 high-bit characters catcode 11 or 13. But in 99% of cases it would be better to use UTF-8 and ASCII command names. – David Carlisle Jul 07 '18 at 11:22
  • I am not particularly good at (La)TeX programming. – Alexey Jul 07 '18 at 11:24
  • Drop the idea. While it can be neat to define commands in your own language, the price you pay is too high. And what will you do if, after an upload, you are asked to deliver a UTF-8-encoded document, as UTF-8 is the standard now? – Ulrike Fischer Jul 07 '18 at 12:37
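
The loop described in the comments (making a range of high-bit characters category code 11) could be sketched roughly as follows. This is only a sketch under stated assumptions, not a recommended setup: it assumes the file is encoded in cp1251, where the letters А to я occupy bytes 192 to 255 while Ё and ё sit at bytes 168 and 184, and it assumes the code is placed in the preamble after inputenc has been loaded. The counter name \cyrbyte is made up for illustration. The drawbacks raised in the comments (broken inputenc handling, misencoded bookmarks) still apply.

```latex
% Sketch only: assumes a cp1251 (Windows-1251) source read by pdfTeX
% as 8-bit bytes. In cp1251, bytes 192..255 are the letters А..я;
% Ё (byte 168) and ё (byte 184) lie outside this range and are
% therefore left with whatever category code they already have.
\newcount\cyrbyte % hypothetical scratch counter, not from any package
\cyrbyte=192
\loop
  \catcode\cyrbyte=11 % 11 = letter, so the byte may appear in command names
  \ifnum\cyrbyte<255
    \advance\cyrbyte by 1
\repeat
```

Running the same loop with 13 in place of 11 inside a group would set the range back to active characters locally, which is the state hyperref needs when it builds the bookmark strings.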
