
Many posts ask how to break a long string of text or digits (with no spaces) into justified lines; this post is one example. The generally agreed solution seems to be to add zero-width glue with a little stretch between every pair of characters in the string, perhaps with an extra discretionary to insert or customize hyphens at line ends.

But none of the solutions I saw addressed kerning. Here are two scenarios where I’d like to preserve kerning:

  1. A long string of digits (say, 100 digits of pi), set in proportional old-style figures. Font designers often kern between 7 (seven.osf) and 4 (four.osf), among many other pairs. Example font: Source Serif 4.
  2. A long string of kana, set in a Japanese font with the palt and kern OpenType features turned on. Font designers often specify proportional alternate widths for フ and ォ, and also kern between them, among many other kana and kana pairs. Example font: Source Han Serif. (Yes, I’m aware that LuaTeX-ja already implements this.)

Can kerning be preserved somehow? eTeX/XeTeX solutions preferred, but LuaTeX solutions also welcome.

% run with XeLaTeX or LuaLaTeX
\documentclass{article}
\usepackage{fontspec}

\setmainfont{Source Serif 4}[Numbers={Proportional,OldStyle}]

\newcommand*\+{\hskip 0pt plus 1pt \relax}

\begin{document}

3.14159265358979323846264338327950288419716, figures kerned, cannot be wrapped.

3\+.\+1\+4\+1\+5\+9\+2\+6\+5\+3\+5\+8\+9\+7\+9\+3\+2\+3\+8\+4\+6\+2\+6\+4\+3\+3\+8\+3\+2\+7\+9\+5\+0\+2\+8\+8\+4\+1\+9\+7\+1\+6, can be wrapped, figures no longer kerned.

\end{document}
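
The discretionary variant mentioned at the top of the question could be sketched roughly as follows. This is only an illustration: the macro name \brk is hypothetical, and the sketch assumes Source Serif 4 is installed. It still loses the kerning, which is the problem the question is about.

```latex
% run with XeLaTeX or LuaLaTeX
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Source Serif 4}[Numbers={Proportional,OldStyle}]

% \brk is a hypothetical name, not taken from any package.
% The discretionary prints a hyphen only if the line breaks here.
% \nobreak forbids a break at the glue itself, so every break
% goes through the discretionary; the glue only adds stretch.
\newcommand*\brk{\discretionary{-}{}{}\nobreak\hskip 0pt plus 1pt \relax}

\begin{document}

\parbox{3cm}{3\brk.\brk1\brk4\brk1\brk5\brk9\brk2\brk6\brk5\brk3\brk5\brk8\brk9\brk7\brk9\brk3\brk2\brk3\brk8\brk4\brk6}

\end{document}
```

Changing the first argument of \discretionary changes what is printed at the line end; an empty first argument gives a break with no hyphen.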

Ruixi Zhang
  • Can you clarify your question a bit more? Maybe group it into parts. For long numbers like pi or Mersenne numbers, one would write functions to print them, maybe grouped into threes. DNA sequences may require a different grouping or none at all. As for preserving kerns introduced by the font designer, microtype does something similar to what you want by adding kerning or protrusion in font-specific configs. – yannisl Nov 15 '23 at 08:18
  • @YiannisLazarides Example added. – Ruixi Zhang Nov 15 '23 at 15:00
  • Thanks will check it out tomorrow as is getting late here – yannisl Nov 15 '23 at 16:22
  • Please have a look at the solution. – yannisl Nov 15 '23 at 17:22

2 Answers


You can save the kerns after you have split your data in a sequence.

\documentclass{article}
\usepackage{fontspec}

\setmainfont{Source Serif Pro}[Numbers={Proportional,OldStyle}]

\ExplSyntaxOn

\seq_new:N \l__ruixi_split_items_seq
\seq_new:N \l__ruixi_split_kerns_seq

\NewDocumentCommand{\Split}{m}
 {
  \seq_set_split:Nnn \l__ruixi_split_items_seq {} { #1 }
  \seq_clear:N \l__ruixi_split_kerns_seq
  \int_step_inline:nnn { 2 } { \seq_count:N \l__ruixi_split_items_seq }
   {
    \hbox_set:Nn \l_tmpa_box
     {
      \seq_item:Nn \l__ruixi_split_items_seq { ##1-1 }
      \seq_item:Nn \l__ruixi_split_items_seq { ##1 }
     }% with kern
    \hbox_set:Nn \l_tmpb_box
     {
      \seq_item:Nn \l__ruixi_split_items_seq { ##1-1 }
      \kern0pt
      \seq_item:Nn \l__ruixi_split_items_seq { ##1 }
     }% without kern
    \seq_put_right:Ne \l__ruixi_split_kerns_seq
     {
      \dim_eval:n { \box_wd:N \l_tmpa_box - \box_wd:N \l_tmpb_box }
     }
   }
  \seq_put_right:Nn \l__ruixi_split_kerns_seq { 0pt }
  \seq_map_indexed_inline:Nn \l__ruixi_split_items_seq
   {
    ##2 \hspace{\seq_item:Nn \l__ruixi_split_kerns_seq { ##1 } plus 0.3pt}
   }
  \unskip
 }
\ExplSyntaxOff

\begin{document}

3.14159265358979323846264338327950288419716

\Split{3.14159265358979323846264338327950288419716}

\parbox[t]{3cm}{\Split{3.14159265358979323846264338327950288419716}}

\end{document}

Of course, if you use this inside a paragraph, the glue can participate in the stretching of the line.

enter image description here

If I set a box containing \Split{3.14159}, I get

\hbox(5.39+1.4)x31.42, direction TLT
.\TU/SourceSerifPro(0)/m/n/10 
.\glue -0.19 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 .
.\glue -0.38 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 
.\glue 0.0 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 
.\glue 0.0 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 
.\glue 0.0 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 
.\glue 0.0 plus 0.3
.\TU/SourceSerifPro(0)/m/n/10 

The same without \Split

\hbox(5.39+1.4)x31.42, direction TLT
.\TU/SourceSerifPro(0)/m/n/10 
.\kern-0.19 (font)
.\TU/SourceSerifPro(0)/m/n/10 .
.\kern-0.38 (font)
.\TU/SourceSerifPro(0)/m/n/10 
.\TU/SourceSerifPro(0)/m/n/10 
.\TU/SourceSerifPro(0)/m/n/10 
.\TU/SourceSerifPro(0)/m/n/10 
.\TU/SourceSerifPro(0)/m/n/10 
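
The two listings show each font kern being replaced by glue of the same natural width. For a single pair, the measurement can be tried in isolation with a minimal sketch like the one below. It is an illustration only: the length name \pairkern is hypothetical, and the font and the 7/4 pair are taken from the question rather than from the listings above. Whether the printed value is nonzero depends on the font actually kerning that pair.

```latex
% run with XeLaTeX or LuaLaTeX
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Source Serif 4}[Numbers={Proportional,OldStyle}]

\newlength{\pairkern}% hypothetical name, for illustration only

\begin{document}

\setbox0=\hbox{74}%          pair set together: font kern applied
\setbox2=\hbox{7\kern0pt 4}% explicit zero kern: font kern suppressed
\setlength{\pairkern}{\dimexpr\wd0-\wd2\relax}

Kern between 7 and 4: \the\pairkern

% Put the measured kern back as glue, so the line may break here
7\hspace{\pairkern plus 0.3pt}4

\end{document}
```

Glue is a legal breakpoint while a font kern is not, which is why the kern is converted rather than kept as it is.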
egreg
  • Shouldn't the figures in line 1 and line 3 of your image line up? The way I understood the question, the OP wanted to preserve the kerns introduced by the font as kerning pairs. – yannisl Nov 15 '23 at 17:50
  • @YiannisLazarides No, why should they? In line 3 a width of 3cm has been imposed. – egreg Nov 15 '23 at 18:43
  • Not too sure about my statement, but I think the font kerning pairs were lost when you split the sequence. \seq_set_split:Nnn ⟨seq var⟩ {⟨delimiter⟩} {⟨token list⟩} Maybe if you could add some explanations to your code, I can grasp it :) – yannisl Nov 15 '23 at 19:53
  • I think you should have allowed skips only after pairs of digits. – yannisl Nov 15 '23 at 19:55
  • @YiannisLazarides No, between any two items there is \hspace{... plus 0.3pt} See the edit – egreg Nov 15 '23 at 20:15
  • Yes I see this but what about OP item ... 2. designers often specify proportional alternate widths for フ and ォ, and also kern between them, among many other kanas and kana pairs. I am curious about how this is achieved. Not too sure if the font we used has these kerns in any case. – yannisl Nov 15 '23 at 20:23
  • Thanks, I see the edit. I am going to have a glass of wine and toast to your good health and sleep! Have a nice evening. – yannisl Nov 15 '23 at 20:30

This is a solution for digits, but there is no reason it shouldn't work with other characters. The secret sauce is that I use mod so that the space is inserted only after each group of digits; pairs within a group stay adjacent, and hence the font kerning is preserved. For CJK and other complex UTF cases, a Lua solution would be better, since it could do more checks. For presentation, I followed common practice in number theory books and moved the leading 3. a bit out to the left.

% run with XeLaTeX or LuaLaTeX
\documentclass{article}
\usepackage{fontspec}

\setmainfont{Source Serif 4}[Numbers={Proportional,OldStyle}]

\newcommand*\+{\hskip 0pt plus 1pt \relax}

\begin{document}

3.14159265358979323846264338327950288419716, figures kerned, cannot be wrapped.

3\+.\+1\+4\+1\+5\+9\+2\+6\+5\+3\+5\+8\+9\+7\+9\+3\+2\+3\+8\+4\+6\+2\+6\+4\+3\+3\+8\+3\+2\+7\+9\+5\+0\+2\+8\+8\+4\+1\+9\+7\+1\+6, can be wrapped, figures no longer kerned.

\ExplSyntaxOn
\cs_set:Npn \spaceone { \hspace{5pt} }
%\cs_set:Npn\spacetwo #1{\fbox{#1}\hskip1sp}
\tl_put_left:Nn \l_tmpa_tl {14159265358979 323846264338327950288419716}
\fboxsep=0pt \fboxrule=0sp
\int_set:Nn \l_tmpa_int {0}

% print one digit, then add a space after every fourth digit
\def\insertskip #1
 {
  #1
  \int_incr:N \l_tmpa_int
  \int_set:Nn \l_tmpb_int { \int_mod:nn { \int_use:N \l_tmpa_int } {4} }
  \int_compare:nNnTF { \int_use:N \l_tmpb_int } = {0}
   { \int_gset:Nn \l_tmpa_int {0} \spaceone }
   { }
 }

Output:\par
\DeclareDocumentCommand\Out{}
 {
  \parindent0pt
  \begin{minipage}[t]{1em} 3. \end{minipage}
  \begin{minipage}[t]{4cm}\large
    \tl_map_function:NN \l_tmpa_tl \insertskip
  \end{minipage}
 }
\ExplSyntaxOff
\Out
\end{document}

Top figure without Proportional, bottom with Proportional. enter image description here

enter image description here

yannisl