15

I'm using UTF-8, so with a suitable font it is no problem to typeset the different forms of s found in older German texts (s, ſ, ß). Unfortunately, hyphenation breaks, because LaTeX does not know that ſ should be treated exactly like "s".

MWE:

\documentclass{article}
\usepackage[ngerman]{babel}

\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft

\hyphenation{Ge-ſell-ſchaft}

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft

\end{document}

As one can see at the end of the line, I have to add the correct hyphenation manually. Any idea how I can solve this?


David Carlisle
  • 757,742
  • 2
    you could make this work automatically in luatex or xetex but in pdftex the best you could do is define ſ to always allow a hyphenation after it, it can not take part in \patterns or \hyphenation – David Carlisle Aug 06 '18 at 14:45
  • 4
    your choices would be essentially to (a) use \hyphenation as you have done for all necessary words or (b) copy the hyphenation patterns adding patterns for long s to match those for s and rebuild the xelatex format (lualatex does not need to be rebuilt) or (c) use s in the original markup and then set up font features (perhaps...) so that some s get typeset using the long form. Which would you prefer? (the last probably depends on the font you use) – David Carlisle Aug 06 '18 at 15:29
  • I use XeLaTeX, so copying the patterns is definitely the way to go. I'll look it up; right now I'm not sure how it works. (c) is a creative solution, but it does not work for me: I have parts where I want the long s and parts where I don't, because I have to stay true to the source – Martin Mueller Aug 06 '18 at 15:37
  • oh or if you are using luatex then (d) implement a Lua hyphenation callback that hyphenates using s, then switches to long s and re-inserts the hyphenation points, – David Carlisle Aug 06 '18 at 15:37
  • If the list of problematic patterns is short, in luatex you can also add them with \babelpatterns in the document itself. – Javier Bezos Aug 06 '18 at 17:38
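To illustrate the \babelpatterns suggestion from the last comment, here is a minimal sketch for LuaLaTeX only. The pattern itself is an assumption made up for this example, not taken from the German pattern files: it uses level 9, the highest odd level, so that the two break points should win over any competing pattern. As far as I know, \babelpatterns belongs in the preamble.

```latex
% Compile with LuaLaTeX; \babelpatterns is only available in LuaTeX.
\documentclass{article}
\usepackage[ngerman]{babel}

% Illustrative pattern (an assumption for this sketch): odd digits allow
% a break at that position, and 9 is the highest possible level.
\babelpatterns[ngerman]{ge9ſell9ſchaft}

\begin{document}
XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}
```

Like \hyphenation, this only scales to a short list of words, but because it is a pattern rather than a whole-word exception it should also match inside compounds and inflected forms.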

2 Answers

12

For LuaTeX, here is an implementation of David Carlisle's idea to create a hyphenate callback. It works by replacing every ſ with a marked s before hyphenation and then restoring the original characters after hyphenation:

\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{luacode}
\begin{luacode*}
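local sattr = luatexbase.new_attribute("longsattr")
local glyph = node.id'glyph'
local disc = node.id'disc'

-- Replace every ſ (U+017F = 383) by s (115) and remember the original
-- code point in an attribute so that it can be restored afterwards.
local function long_to_s(head, tail)
  for n in node.traverse(head) do
    if n == tail then break end
    if n.id == disc then
      -- also handle characters inside existing discretionaries
      long_to_s(n.pre)
      long_to_s(n.post)
      long_to_s(n.replace)
    elseif n.id == glyph and n.char == 383 then
      n.char = 115
      node.set_attribute(n, sattr, 383)
    end
  end
end
-- Undo long_to_s: put the saved code point back and clear the mark.
local function s_to_long(head, tail)
  for n in node.traverse(head) do
    if n == tail then break end
    if n.id == disc then
      s_to_long(n.pre)
      s_to_long(n.post)
      s_to_long(n.replace)
    end
    local a = node.get_attribute(n, sattr)
    if a then
      -- only touch glyphs that are still an s, in case the attribute
      -- was copied onto another node during hyphenation
      if n.id == glyph and n.char == 115 then n.char = a end
      node.unset_attribute(n, sattr)
    end
  end
end
-- Hyphenate as if every ſ were an s, then switch back to ſ.
local function myhyph(head, tail)
  long_to_s(head, tail)
  lang.hyphenate(head, tail)
  s_to_long(head, tail)
end
luatexbase.add_to_callback("hyphenate",myhyph,"hyphenate with modified s")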
\end{luacode*}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}


LuaTeX also allows you to manipulate the hyphenation patterns during a run, so you can also use the following (an automated version of David Carlisle's option (b)):

\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{luacode}
\begin{luacode*}
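  -- language object for the current language (ngerman here)
  local l = lang.new(tex.language)
  -- add a copy of all patterns with s replaced by ſ; the extra parentheses
  -- drop gsub's second return value (the number of replacements)
  l:patterns((l:patterns():gsub('s', 'ſ')))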
\end{luacode*}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}


3

A simple way to do this is to choose a font that supports ſ as an OpenType character variant, e.g., EB Garamond. Then you can just select that variant when you need it.

(Re-reading the comments above, I see this corresponds to option (c) from David Carlisle, which you said wasn't suitable, but this MWE shows you can have both kinds of s with this method.)

Update showing iſt and ſelbes.

\documentclass{article}
\usepackage[ngerman]{babel}
\babelfont{rm}{EB Garamond}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

\addfontfeature{CharacterVariant=1}

XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

ist selbes
\end{document}
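
To switch the variant on only for selected passages, the feature can be confined to a group. The following is a minimal sketch under the same assumptions as above (XeLaTeX or LuaLaTeX, EB Garamond installed, long s reachable through CharacterVariant=1); \textlongs is a hypothetical helper name introduced only for this example. Which s actually become ſ is still decided by the font's substitution rules, not by this macro.

```latex
% Compile with XeLaTeX or LuaLaTeX; assumes EB Garamond is installed.
\documentclass{article}
\usepackage[ngerman]{babel}
\babelfont{rm}{EB Garamond}

% \textlongs is a hypothetical helper: the inner braces form a group,
% so the font feature is switched off again after the argument.
\newcommand{\textlongs}[1]{{\addfontfeature{CharacterVariant=1}#1}}

\begin{document}
Gesellschaft \textlongs{Gesellschaft} Gesellschaft
\end{document}
```

Because the source still contains a plain s, the normal ngerman patterns apply, and the long form only appears when the font shapes the glyphs.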


David Purton
  • 25,884
  • You're right about the font. However, this is not exactly David Carlisle's option (c), and it is not a proper way to do it: although every ſ is an s, not every s is an ſ. The rule is, I think, that it's an ſ at the beginning of a word and in between letters, unless a long vowel comes right before it. So it's »iſt« and »ſelbes«. – Martin Mueller Aug 07 '18 at 08:36
  • @MartinMueller, EB Garamond does not change it at the end of a word, but I think it changes it everywhere else. I'm not a German speaker though, so I do not know what it should be. – David Purton Aug 07 '18 at 08:44
  • Thank you for pointing that out. I've installed EB Garamond, but unfortunately the rules for ſ are far more complex than I knew (see Wikipedia), and your method does not produce the desired result. E.g. lesbisch should be lesbiſch but is rendered leſbiſch – Martin Mueller Aug 07 '18 at 12:20
  • @MartinMueller, that's a pity! The EB Garamond manual does warn that this feature isn't all that clever. – David Purton Aug 07 '18 at 12:30