In the following MWE, I defined my own ligature and it works using LuaTeX. I now would like to make it searchable by "Th" instead of "Ђ". Is this possible using fontspec or LuaTeX methods?

  • I have already seen many similar questions (like this one), but none of the answers offers a generic approach to this problem.
  • I think it has to do with using an equivalent of pdfglyphtounicode in LuaTeX.

\documentclass[a4paper]{article}
\usepackage{fontspec}

\directlua {

fonts.handlers.otf.addfeature
{
    name = "ligaxits",
    type = "ligature",
    data =
    {
        [0x0402] = { "T", "h" },
    },
}

}

\begin{document}

\setmainfont[
    RawFeature={+ligaxits},
]{XITS-Regular.otf}

The% the "ligature" is now used as expected, but I'd like to make it searchable by "Th" 

\end{document}

  • The method described here should work: https://tex.stackexchange.com/a/454877/2388. You need the value ["uni0402"] = {84, 104}. – Ulrike Fischer Nov 15 '21 at 16:40
  • Thank you for pointing me in this direction; unfortunately I couldn't get it working so far. Even if it worked, I see a problem: how could I patch more than one font (not just the main font)? In the linked code the font is not named explicitly, so I assume it will target only the main font or all fonts. I would need separate patching for each font, since they have custom ligatures at different, arbitrary positions :( – Guest Nov 15 '21 at 18:11
  • you can add a test for a font name, see e.g. https://tex.stackexchange.com/a/420568/2388. But for such ligatures it is probably not needed, you will always want to map the name uni0402 to T+h. – Ulrike Fischer Nov 15 '21 at 18:37
  • Thank you, I will try this! Yes, in the given MWE example it will always work, but I'm using different fonts with complex ligatures and also alternative ligatures etc. in my real world example. – Guest Nov 15 '21 at 18:42
  • I can now confirm that it works with the linked example using "SourceSerifPro-Regular.otf" in Small Caps mode. I successfully mapped the U to the given T h values and it works. But for now I didn't have success on the STIX font. It seems it also does not work with the Source Serif font in normal letters (non small caps) – Guest Nov 15 '21 at 19:30
  • check in the xits-regular.lua in texmf-var/luatex-cache if the name is correct. – Ulrike Fischer Nov 15 '21 at 19:33
  • I try to find the file, but I think the name for U is always just "U" so it should work, but does only with U.sc. But I don't know why it even fails on Source Serif for non small caps... E: according to the lua file the names are ok... – Guest Nov 15 '21 at 19:39
  • Hm, I have no success with the XITS-Regular.otf font, but I'm using the correct names. – Guest Nov 15 '21 at 19:53
  • try with the numbers instead, 0x0402. – Ulrike Fischer Nov 15 '21 at 23:19
  • I now found the problem: the complete patch function is only executed, if the code contains some special lookup requirements like small caps \textsc{...}. But when the document has common unicode chars like A or Ђ it's just ignored. I validated this using prints to console using this: https://stackoverflow.com/a/27028488 – Guest Nov 17 '21 at 21:59

1 Answer

Thanks to Ulrike Fischer, who pointed me in the right direction, I finally got it working, probably for all glyphs and all fonts. I think the code is also faster than the linked one (O(1) compared to O(N), but I'm a Lua beginner). The linked one also doesn't work for all glyphs, because not all glyph names are in the table; in particular, those with the prefix uni are missing.

\documentclass[a4paper]{article}

\usepackage{fontspec}
\usepackage{luacode}

\begin{luacode}
-- the following code is for creating the ligature only; not for making it copyable/searchable
fonts.handlers.otf.addfeature{
    name = "ligacustom",
    type = "ligature",
    data = {
        [utf.byte("Ђ")] = {utf.byte("T"), utf.byte("h")},
    },
}

-- the following code is for debugging only
-- source: https://stackoverflow.com/a/27028488
-- ("local function" keeps the recursive call to dump in scope)
local function dump(o)
    if type(o) == 'table' then
        local s = '{ '
        for k,v in pairs(o) do
            if type(k) ~= 'number' then k = '"'..k..'"' end
            s = s .. '['..k..'] = ' .. dump(v) .. ','
        end
        return s .. '} '
    else
        return tostring(o)
    end
end

-- the following code is for making generic glyphs copyable/searchable as wanted; even from private use area
local patch_make_custom_glyphs_searchable_xits = function(fontdata)
    -- the patch should only apply to the XITS Regular font;
    -- for another font you can print the font name to console using print(fontdata.fontname)
    if fontdata.fontname == "XITS-Regular" then
        -- add as many as you want; utf.byte("ß") is same as python3 ord("ß"); the ß entry is for testing
        fontdata.characters[utf.byte("Ђ")]["tounicode"] = {utf.byte("T"), utf.byte("h")}
        fontdata.characters[utf.byte("ß")]["tounicode"] = {utf.byte("Ä"), utf.byte("Ä"), utf.byte("Ä")}
    end
    -- print(dump(fontdata.characters))
end

luatexbase.add_to_callback (
    "luaotfload.patch_font",
    patch_make_custom_glyphs_searchable_xits,
    "patch_make_custom_glyphs_searchable_xits"
)
\end{luacode}

\begin{document}

% setmainfont after the lua code!
\setmainfont[
    RawFeature={+ligacustom},
]{XITS-Regular.otf}

Ђ The ß% copies in Sumatra PDF and Adobe Reader as "Th The ÄÄÄ" using "XITS-Regular.otf"

\end{document}
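
The comments asked how to patch several fonts whose custom ligatures sit at different positions. A minimal sketch of one way to do that, assuming the same luaotfload.patch_font callback and table-valued tounicode entries as above: keep one lookup table keyed by font name and loop over it. The names searchable_glyphs and patch_searchable_glyphs are hypothetical, the code points are written in hex instead of utf.byte(...), and I have only reasoned this through for XITS-Regular, not tested it with other fonts.

```latex
\documentclass[a4paper]{article}
\usepackage{fontspec}
\usepackage{luacode}

\begin{luacode*}
-- create the ligature (same feature definition as in the question)
fonts.handlers.otf.addfeature{
    name = "ligacustom",
    type = "ligature",
    data = {
        [0x0402] = { "T", "h" },
    },
}

-- hypothetical lookup table:
-- font name -> { glyph code point -> code points to copy/search as }
local searchable_glyphs = {
    ["XITS-Regular"] = {
        [0x0402] = { 0x54, 0x68 }, -- U+0402 copies as "Th"
    },
    -- add further fonts here, keyed by the value of fontdata.fontname
}

local function patch_searchable_glyphs(fontdata)
    local mapping = searchable_glyphs[fontdata.fontname]
    if not mapping then
        return -- no entries for this font, leave it untouched
    end
    for codepoint, replacement in pairs(mapping) do
        local glyph = fontdata.characters[codepoint]
        if glyph then -- skip code points the font does not contain
            -- copy the list so font instances do not share one table
            glyph.tounicode = { table.unpack(replacement) }
        end
    end
end

luatexbase.add_to_callback(
    "luaotfload.patch_font",
    patch_searchable_glyphs,
    "patch_searchable_glyphs"
)
\end{luacode*}

% setmainfont after the lua code
\setmainfont[
    RawFeature={+ligacustom},
]{XITS-Regular.otf}

\begin{document}

The

\end{document}
```

Fonts without an entry in the table are skipped, and the check on glyph avoids an error when a listed code point is missing from a font. The font-name lookup is a single table access, so adding more fonts does not add per-font comparisons.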
