
I regularly use LaTeX with the babel and biblatex packages in my documents, most of which are written in German.

Recently, I added a similar note to each of my BibTeX entries that refers to PDF documentation shipped with the TeX Live distribution. Translated into English, the note should read something like this:

note={Part of the online documentation of TeXLive distribution, file
      \url{<filename>.pdf}},

This is the German text I use:

note={Bestandteil der Online"=Dokumentation von \TeXLive, 
      Datei \url{<filename>.pdf}},

I use the babel shorthand "=, which is enabled for the german and ngerman languages. But even when I wrap the note text in \foreignlanguage{ngerman}{...}, the shorthand is not replaced by a normal hyphen as I expected.

If I replace it with a plain hyphen character, LaTeX can no longer hyphenate the second word, "Dokumentation", which often results in an overfull \hbox warning.

Here is an MWE (in German, of course).

\documentclass[english,ngerman]{scrartcl}

\usepackage[style=numeric]{biblatex}
\usepackage{babel}
\usepackage{csquotes}
\usepackage{dtk-logos}

\addbibresource{\jobname.bib}

\begin{filecontents}{\jobname.bib}
@Manual{class:scrguide,
  title    = {KOMA-Script},
  author   = {Kohm, Markus},
  month    = May,
  year     = 2016,
  url      = {http://www.komascript.de/~mkohm/scrguide.pdf},
  langid   = {ngerman},
  note     = {Bestandteil der Online"=Dokumentation von \TeXLive,
              Datei \url{scrguide.pdf}},
  keywords = {manual},
}
\end{filecontents}

\begin{document}
Der Eintrag~\cite{class:scrguide} aus meiner Literatur"=Datenbank
erscheint im Quellen"=Verzeichnis leider mit einem \verb|"=| in der Ausgabe.

\printbibliography%

\end{document}

(Dear German readers: please ignore the silly uses of the shorthand "= in the example above. They were inserted to show that, in the body text, they are replaced by a normal hyphen.)

This is the output:

[Screenshot of the compiled output: the bibliography entry shows a literal "= in the note.]

How can I solve this dilemma?

Jan

1 Answer


Active characters have been widely used for years in French, German, Dutch, etc. and produce some nasty side-effects from time to time. With the TeX or pdfTeX engines, you have to live with them.
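With pdfTeX, one way to live with them in a .bib file is to avoid the shorthand there altogether. The sketch below assumes that babel's \babelhyphen{hard} (an explicit hyphen that keeps the rest of the word hyphenatable and, being an ordinary macro, does not depend on " being active) is an acceptable substitute for "= in the note field. It reuses the entry from the question and has not been checked against the original setup.

```latex
% Sketch only: same entry as in the question, with "= replaced by
% \babelhyphen{hard} (assumption: babel is loaded in the document).
\documentclass[english,ngerman]{scrartcl}

\usepackage[style=numeric]{biblatex}
\usepackage{babel}
\usepackage{csquotes}
\usepackage{dtk-logos}% provides \TeXLive, as in the question

\begin{filecontents}[overwrite]{\jobname.bib}
@Manual{class:scrguide,
  title  = {KOMA-Script},
  author = {Kohm, Markus},
  year   = 2016,
  langid = {ngerman},
  note   = {Bestandteil der Online\babelhyphen{hard}Dokumentation
            von \TeXLive, Datei \url{scrguide.pdf}},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Der Eintrag~\cite{class:scrguide}.

\printbibliography
\end{document}
```

The unstarred \babelhyphen variants leave hyphenation enabled in the rest of the word, which is the property the "= shorthand was chosen for.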

Fortunately, LuaTeX provides tools to get rid of them. I did so for babel-french and, looking at your report, which follows a similar one by Denis Bitouzé, I decided to check whether the German double quote (") could also be left inactive with LuaTeX.

I wrote the following dblquote.sty file as a proof of concept. As is, it seems to work, but as I am not a native German speaker, I can't tell whether this Lua code (after improvement) could replace the current code in the (n)german.ldf files.

To give it a try, add \usepackage{dblquote} to your preamble (after \usepackage{babel}) and a line \shorthandoff{"} just after \begin{document}.
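A minimal test document for these steps might look as follows. It assumes that the package code listed below has been saved as dblquote.sty next to the document and that the file is compiled with lualatex; the sample sentence is only an illustration.

```latex
% Compile with lualatex. Assumes dblquote.sty is in the same directory.
\documentclass[english,ngerman]{scrartcl}

\usepackage{babel}
\usepackage{dblquote}% must be loaded after babel

\begin{document}
\shorthandoff{"}% make " inactive; the Lua callbacks process "= instead

Ein Test der Online"=Dokumentation.
\end{document}
```

If the callbacks work as intended, the sentence should be typeset with a normal hyphen in place of "=.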

\ProvidesPackage{dblquote}
                [2021/07/04 v.0.01 Daniel Flipo]
\NeedsTeXFormat{LaTeX2e}[2021/06/01]
\ifdefined\directlua
  \RequirePackage{luatexbase,luacode}
\else
  \PackageError{dblquote}{This package is meant for LuaTeX only! Aborting}
               {No more information available, sorry!}
\fi
\newattribute\DQ     \DQ=1 \relax
\newattribute\toss   \toss=0 \relax
\ifluatex
  \def\mdqon{\DQ=1\relax}
  \def\mdqoff{\DQ=0\relax}
%\else
%  \def\mdqon{\shorthandon{"}}
%  \def\mdqoff{\shorthandoff{"}}
\fi

\begin{luacode}
dblquote = { }
local DQ = luatexbase.attributes['DQ']
local toss = luatexbase.attributes['toss']
local has_attribute = node.has_attribute
local traverse_id = node.traverse_id
local remove = node.remove
local insert_before = node.insert_before
local insert_after = node.insert_after
local current_attr = node.current_attr
local new_node = node.new
local copy_node = node.copy
local copy_list = node.copy_list
local node_id = node.id
local DISC = node_id("disc")
local HLIST = node_id("hlist")
local GLUE = node_id("glue")
local GLYPH = node_id("glyph")
local KERN = node_id("kern")
local PENALTY = node_id("penalty")
local nobreak = new_node(PENALTY,0)
nobreak.penalty = 10000
local hskip0 = new_node(GLUE,0)

-- Replace "a with ä etc.
dblquote.replace = function (head)
  local t = { }
  t[string.byte("'")] = 0x201C
  t[0x2019] = 0x201C -- quoteright (Ligatures=TeX)
  t[string.byte("`")] = 0x201E
  t[0x2018] = 0x201E -- quoteleft (Ligatures=TeX)
  t[string.byte("<")] = 0x00AB
  t[string.byte(">")] = 0x00BB
  t[string.byte("A")] = 0x00C4
  t[string.byte("a")] = 0x00E4
  t[string.byte("E")] = 0x00CB
  t[string.byte("e")] = 0x00EB
  t[string.byte("I")] = 0x00CF
  t[string.byte("i")] = 0x00EF
  t[string.byte("O")] = 0x00D6
  t[string.byte("o")] = 0x00F6
  t[string.byte("U")] = 0x00DC
  t[string.byte("u")] = 0x00FC
  t[string.byte("S")] = 0x0053
  t[string.byte("Z")] = 0x005A
  for item in traverse_id(GLYPH, head) do
    local lang = item.lang
    local char = item.char
    local DQon = has_attribute(item, DQ)
    DQon = DQon and DQon > 0
    local tossON = has_attribute(item, toss)
    tossON = tossON and tossON > 0
    if (lang == DE or lang == DEn) and DQon and
       (char == 0x201D or char == 0x22) then
      local next = item.next
      local nchar = next.char
      if tossON then
        t[string.byte("s")] = string.byte("s")
        t[string.byte("z")] = string.byte("z")
      else
        t[string.byte("s")] = 0x00DF
        t[string.byte("z")] = 0x00DF
      end
      if t[nchar] then
        next.char = t[nchar]
        if t[nchar] == string.byte("s") or t[nchar] == string.byte("z") then
          item.char = string.byte("s")
        elseif t[nchar] == string.byte("S") or t[nchar] == string.byte("Z") then
          item.char = string.byte("S")
        else
          head = remove(head,item)
        end
      end
    end
  end
  return head
end

-- Hyphenation and ligatures
dblquote.disc = function (head)
  local to = { }
  to[string.byte("f")] = true
  to[string.byte("F")] = true
  to[string.byte("l")] = true
  to[string.byte("L")] = true
  to[string.byte("m")] = true
  to[string.byte("M")] = true
  to[string.byte("n")] = true
  to[string.byte("N")] = true
  to[string.byte("p")] = true
  to[string.byte("P")] = true
  to[string.byte("r")] = true
  to[string.byte("R")] = true
  to[string.byte("t")] = true
  to[string.byte("T")] = true
  for item in node.traverse_id(GLYPH, head) do
    local lang = item.lang
    local char = item.char
    local DQon = has_attribute(item, DQ)
    DQon = DQon and DQon > 0
    if DQon and (char == 0x201D or char == 0x22) then
      -- traditional German only
      if lang == DE then
        local n = item.next
        local nn = n.next
        local nchar = n.char
        local prev = item.prev
        -- "ck and "CK
        if nchar == string.byte("c") or nchar == string.byte("C") then
          head = remove(head,n)
          head = remove(head,item)
          -- building d.pre looks clumsy, to be improved!
          local pre = new_node(HLIST,0)
          local hyph = copy_node(nn)
          hyph.char = string.byte("-")
          local first = copy_node(nn)
          local second = copy_node(hyph)
          pre.head = first
          first.next = second
          second.next = nil
          ----------------------------------------------
          local d = new_node(DISC,0)
          d.attr = current_attr()
          d.penalty = tex.hyphenpenalty
          d.pre = copy_list(first)
          d.replace = copy_node(n)
          insert_after(head,prev,copy_node(d))
          insert_after(head,prev,copy_node(nobreak))
          insert_before(head,nn,copy_node(nobreak))
          insert_before(head,nn,copy_node(hskip0))
        -- all others "ff, "FF, etc.
        elseif to[nchar] and nn.char and nn.char == n.char then
          head = remove(head,n)
          head = remove(head,item)
          local d = new_node(DISC,0)
          d.attr = current_attr()
          d.penalty = tex.hyphenpenalty
          -- building d.pre looks clumsy, to be improved!
          local pre = new_node(HLIST,0)
          local hyph = copy_node(nn)
          hyph.char = string.byte("-")
          local first, second, third
          -- "f is special
          if nchar == string.byte("f") then
            local ff = copy_node(n)
            ff.char = 0xFB00 -- ligature "ff"
            first = copy_node(ff)
            second = copy_node(hyph)
            second.next = nil
            d.post = copy_node(n)
            d.replace = copy_node(ff)
            head = remove(head,nn)
          else
            first = copy_node(n)
            second = copy_node(n)
            third = copy_node(hyph)
            third.next = nil
            d.replace = copy_node(n)
          end
          pre.head = first
          first.next = second
          second.next = third
          -- d.attr = current_attr()
          d.pre = copy_list(first)
          d.penalty = tex.hyphenpenalty
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(nobreak))
          insert_after(head,prev,copy_node(d))
          insert_after(head,prev,copy_node(nobreak))
        end
      end
      -- modern and traditional German
      if lang == DE or lang == DEn then
        local n = item.next
        local nchar
        if n.id == GLYPH then
          nchar = n.char
        elseif n.id == PENALTY then -- n = node ~
          nchar = 1
        end
        local prev = item.prev
        if nchar == string.byte("-") then
          local d = new_node(DISC,0)
          d.attr = current_attr()
          d.penalty = tex.hyphenpenalty
          local hyph = copy_node(n)
          hyph.char = string.byte("-")
          d.pre = copy_node(hyph)
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(nobreak))
          insert_after(head,prev,copy_node(d))
          insert_after(head,prev,copy_node(nobreak))
        elseif nchar == string.byte("|") then
          local d = new_node(DISC,0)
          d.attr = current_attr()
          d.penalty = tex.hyphenpenalty
          local hyph = copy_node(n)
          hyph.char = string.byte("-")
          d.pre = copy_node(hyph)
          local k = new_node(KERN,1)
          k.attr = current_attr()
          k.kern = 20000
          d.replace = copy_node(k)
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(nobreak))
          insert_after(head,prev,copy_node(d))
          insert_after(head,prev,copy_node(nobreak))
        elseif nchar == string.byte('"') then
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(hskip0))
        elseif nchar == 1 then -- ("~)
          local hyph = copy_node(item)
          hyph.char = string.byte("-")
          head = remove(head,n.next)
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(nobreak))
          insert_after(head,prev,copy_node(hyph))
        elseif nchar == string.byte("=") then
          local hyph = copy_node(n)
          hyph.char = string.byte("-")
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(hyph))
          insert_after(head,prev,copy_node(nobreak))
        elseif nchar == string.byte("/") then
          local d = new_node(DISC,0)
          d.attr = current_attr()
          d.penalty = tex.hyphenpenalty
          local slash = copy_node(n)
          slash.char = string.byte("/")
          d.pre = copy_node(slash)
          d.replace = copy_node(slash)
          head = remove(head,n)
          head = remove(head,item)
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(nobreak))
          insert_after(head,prev,copy_node(d))
          insert_after(head,prev,copy_node(hskip0))
          insert_after(head,prev,copy_node(nobreak))
        end
      end
    end
  end
  return head
end
return dblquote.replace, dblquote.disc
\end{luacode}

\directlua{%
  DE = \the\l@german ;
  DEn = \the\l@ngerman ;
  luatexbase.add_to_callback ("pre_linebreak_filter",dblquote.disc,"discretionary",1)
  luatexbase.add_to_callback ("pre_linebreak_filter",dblquote.replace,"replace",1)
  luatexbase.add_to_callback ("hpack_filter",dblquote.replace,"replace")
}

\endinput

Daniel Flipo
  • Note there is a built-in mechanism to perform these kinds of transformations. Search the manual for ‘Transforms’. – Javier Bezos Jul 04 '21 at 11:10
  • @JavierBezos I wonder whether the German-specific hyphenations for ck, ff, etc. are systematic or word-dependent. In the first case babel's 'Transforms' would be best; in the second case, specific coding by the user is required (traditionally with "). The active " is (was?) also used to enter ä, ü, ö, Ä, Ü, Ö, ß, but all of them can be entered directly on a French AZERTY keyboard, and I guess it is also possible on a German QWERTZ keyboard… IMHO it would be nice to get rid of the active " in German when compiling with LuaTeX. – Daniel Flipo Jul 04 '21 at 16:41
  • Shorthand-like transforms are also possible. For example: \babelposthyphenation{ngerman}{ "ff }{ remove, { no = f, pre = ff- }, {} }. (Very likely this is best done at the prehyphenation stage, but here is the idea.) @DanielFlipo – Javier Bezos Jul 04 '21 at 18:41
  • Sorry for the late reply. For me, your elaborate code didn't work: it didn't remove any of the "= codes :-( – Jan Feb 21 '22 at 17:59
  • @Jan Your file compiles here with lualatex. The call \usepackage{dblquote} has to come after \usepackage{babel}, and the line \shorthandoff{"} has to come just after \begin{document}. I get Online-Dokumentation, not Online"=Dokumentation, in the bibliography. – Daniel Flipo Feb 22 '22 at 19:07