I am trying to convert my document into ePUB using tex4ebook. The document is bilingual and uses the free font Kalpurush.

Here's the MWE, file.tex:

% !TEX program = xelatex
% !BIB program = biblatex
\documentclass[12pt, twoside]{book}
% For a bilingual document    
\usepackage[banglamainfont=Kalpurush, banglattfont=Kalpurush]{latexbangla}                               
%activate polyglossia
\setdefaultlanguage[numerals=Bengali, changecounternumbering=true]{bengali}
%number all levels
\setcounter{secnumdepth}{5}
  \setotherlanguage{english}
\usepackage[autostyle]{csquotes}
% \usepackage[backend=biber, sorting=none, language=english, autolang=other, block=ragged]{biblatex}
% \addbibresource{bookbib.bib}

\usepackage{lipsum}
\usepackage{enumitem}
\setlist[itemize]{label*={\fontfamily{lmr}\selectfont\textbullet}}

\begin{document}
\tableofcontents

\chapter{First Chapter}
\section*{First Section}
পিথাগোরাস (Pythagoras)-এর উপপাদ্যটি হল,\ ``সমকোণী ত্রিভুজের অতিভুজের উপর অঙ্কিত বর্গক্ষেত্রের ক্ষেত্রফল অপর দুই বাহুর উপর অঙ্কিত বর্গক্ষেত্রের ক্ষেত্রফলের সমষ্টির সমান।" \ অর্থাৎ কোন সমকোণী ত্রিভুজের অতিভুজ $c$ এবং অপর দুই বাহু $a$ এবং $b$ হলে,
\[c^2=a^2+b^2\]

\begin{itemize}
  \item The individual entries are indicated with a black dot, a so-called bullet.
  \item The text in the entries may be of any length.
\end{itemize}

% \nocite{*} % adds all entries in the bib file to the bibliography
% https://tex.stackexchange.com/a/13513/114006
% \printbibliography

\end{document}

By using the command:

tex4ebook -x -f epub3 file.tex mathml

I am getting the following errors:

[STATUS]  tex4ebook: Conversion started                                                                 
[STATUS]  tex4ebook: Input file: testing_bangla.tex                                                     
[WARNING] tocid: char-def module not found                                                              
[WARNING] tocid: cannot fix section id's                                                                
This is XeTeX, Version 3.141592653-2.6-0.999994 (MiKTeX 22.3) (preloaded format=xelatex.fmt)            
 restricted \write18 enabled.                                                                           
entering extended mode     

But no ePUB file is generated. How can I solve it? If a config file is needed, what should be included there?

raf

1 Answer

The problem is that Bengali numbers are written to the auxiliary file that TeX4ht uses for storing cross-references. When the cross-references are loaded, these numbers cause a compilation error, because they are active characters.

As a workaround, turn off Bengali counter numbering when the document is compiled with TeX4ebook:

% !TEX program = xelatex
% !BIB program = biblatex
\documentclass[12pt, twoside]{book}
% For a bilingual document    
\usepackage[banglamainfont=Kalpurush, banglattfont=Kalpurush]{latexbangla}                               
%activate polyglossia
\ifdefined\HCode
\setdefaultlanguage[numerals=Bengali, changecounternumbering=false]{bengali}
\else
\setdefaultlanguage[numerals=Bengali, changecounternumbering=true]{bengali}
\fi
%number all levels
\setcounter{secnumdepth}{5}
  \setotherlanguage{english}
\usepackage[autostyle]{csquotes}
% \usepackage[backend=biber, sorting=none, language=english, autolang=other, block=ragged]{biblatex}
% \addbibresource{bookbib.bib}

\usepackage{lipsum}
\usepackage{enumitem}
\setlist[itemize]{label*={\fontfamily{lmr}\selectfont\textbullet}}

\begin{document}
\tableofcontents

\chapter{First Chapter}
\section*{First Section}
পিথাগোরাস (Pythagoras)-এর উপপাদ্যটি হল,\ ``সমকোণী ত্রিভুজের অতিভুজের উপর অঙ্কিত বর্গক্ষেত্রের ক্ষেত্রফল অপর দুই বাহুর উপর অঙ্কিত বর্গক্ষেত্রের ক্ষেত্রফলের সমষ্টির সমান।" \ অর্থাৎ কোন সমকোণী ত্রিভুজের অতিভুজ $c$ এবং অপর দুই বাহু $a$ এবং $b$ হলে,
\[c^2=a^2+b^2\]

\begin{itemize}
  \item The individual entries are indicated with a black dot, a so-called bullet.
  \item The text in the entries may be of any length.
\end{itemize}

% \nocite{*} % adds all entries in the bib file to the bibliography
% https://tex.stackexchange.com/a/13513/114006
% \printbibliography

\end{document}

Note the code that checks for TeX4ht (\HCode is defined only when the document is compiled through TeX4ht):

\ifdefined\HCode
\setdefaultlanguage[numerals=Bengali, changecounternumbering=false]{bengali}
\else
\setdefaultlanguage[numerals=Bengali, changecounternumbering=true]{bengali}
\fi

Sometimes it is easiest to include packages or their options conditionally.

You can still get the Bengali numbers in chapters and in the TOC, thanks to a build file (build.lua):

local domfilter = require "make4ht-domfilter"
local filter    = require "make4ht-filter"
local domobject = require "luaxml-domobject"

-- we will calculate unicode character from this
local bengali_zero = 0x09E6 - 48
local uchar = utf8.char
local ubyte = utf8.codepoint

-- convert arabic number to bengali
local function arabic_to_bengali(text)
  return text:gsub("([0-9])", function(a)
    return uchar(ubyte(a) + bengali_zero)
  end)
end

local function process_children(head)
  for _, child in ipairs(head._children) do
    if child:is_text() then
      child._text = arabic_to_bengali(child._text)
    end
  end
end

local process = domfilter {
  function(dom)
    -- process section numbers
    for _, head in ipairs(dom:query_selector(".titlemark")) do
      process_children(head)
    end
    -- process TOC
    for _, toc in ipairs(dom:query_selector(".tableofcontents span,nav#toc li")) do
      process_children(toc)
    end
    return dom
  end
}

-- we must fix also the ncx file, which is used for Epub TOC
-- we must clean it first, in order to be able to process it using LuaXML
local ncx_process = filter {
  function(text)
    local text = text:gsub("^%s*", "") -- remove whitespace at the beginning
    local dom = domobject.parse(text) -- convert text to DOM
    for _, mark in ipairs(dom:query_selector("navmark")) do -- process elements that can contain numbers
      process_children(mark)
    end
    return dom:serialize()
  end
}

Make:match("html$", process)
Make:match("ncx", ncx_process)

It uses LuaXML to process the generated HTML files and to replace Arabic numerals with Bengali ones in the section numbers and in the TOC.
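To see the digit mapping on its own, here is a minimal standalone sketch that runs without make4ht or LuaXML. It assumes plain Lua 5.3 or newer (or texlua) for the utf8 library. The names mirror the build file, and the sample string is made up for illustration.

```lua
-- Standalone sketch of the digit mapping; assumes Lua 5.3+ (utf8 library).
-- Offset between ASCII "0" (code point 48) and BENGALI DIGIT ZERO (U+09E6).
local bengali_zero = 0x09E6 - 48

-- Replace every ASCII digit in `text` with the matching Bengali digit.
local function arabic_to_bengali(text)
  -- the outer parentheses drop gsub's second return value (the match count)
  return (text:gsub("[0-9]", function(d)
    return utf8.char(utf8.codepoint(d) + bengali_zero)
  end))
end

-- hypothetical sample input, similar to a numbered TOC entry
print(arabic_to_bengali("1.2 First Section")) --> ১.২ First Section
```

Only the ASCII digits are shifted, so any other text in a heading or TOC entry passes through unchanged.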

You can execute it in this way:

tex4ebook -f epub3 -x -e build.lua file.tex "mathml"

This is the result:

[Screenshot of the resulting ePUB]

michal.h21
  • How much time does it take to convert? I am getting the same execution message that I mentioned in the question. Nothing is appearing after entering extended mode in my terminal. – raf Jun 21 '22 at 01:22
  • @raf it should take just a few seconds. You can try the '-a debug' option to see the full log; it probably goes into an infinite loop somewhere. Which TeX distribution do you use? – michal.h21 Jun 21 '22 at 06:00
  • I am using MiKTeX 22.3 – raf Jun 21 '22 at 06:08
  • Thank you. Using -a debug has helped. Which epub viewer are you using? I just opened the epub with Calibre E-book viewer. The Bangla texts are appearing broken! – raf Jun 21 '22 at 06:20
  • @raf MiKTeX can have an older version of the TeX4ht files, so it is possible that they contain some issues that have already been fixed. How did you fix your issues? – michal.h21 Jun 21 '22 at 09:41
  • @raf I use Calibre too. How are the texts broken? Are they broken even in the screenshot in my answer, or just on your system? – michal.h21 Jun 21 '22 at 09:42