In other words:
How can I allow breaks after any visible character, such that lines are broken when the sum of the lengths of the horizontal boxes equals \linewidth?
How can I disable or influence the TeX spacing/line-breaking algorithm to allow for breaks whenever the line is full?
This is a follow-up to an earlier question, "How can I make LaTeX to recognize spaces in my macro (catcode 10)?".
I ran into two problems:
- Dealing with the active character
- Dealing with memory
The active character causes issues (as expected). David Carlisle mentioned that I could use \noexpand to properly handle the active character. I tried, but to no avail. Furthermore, because the entire text is fed into the macro, the whole thing is stored in memory before it is actually shipped out.
How this thing works:
- \treateachletterasword sends its entire input to a scanner (this is the memory hog)
- The scanner, \xscan, sends a single token to \xxscan by grabbing a token with \afterassignment
- \xxscan tests for catcodes and works accordingly
- Finally, \xxscan calls \xscan on the next token (this loop breaks when the equivalent of \relax is encountered; this may or may not be the best solution)
Code
(David Carlisle mentioned that instead of \hskip 0pt plus 1sp minus 1sp, I should use \hskip 0pt, or even better, \penalty0 to improve efficiency. This way, TeX does not need to calculate as much for each line.)
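A minimal sketch of that suggestion, assuming the same scanning loop is kept and only the break point changes. The names \breakanywhere, \bascan and \babranch are hypothetical, and \raggedright is an added assumption so that lines which end early are not stretched.

```latex
\documentclass{article}
% Sketch only: \breakanywhere, \bascan and \babranch are hypothetical names.
\long\def\breakanywhere#1{\bascan #1\relax}% \relax marks the end of the input
\def\bascan{\afterassignment\babranch\let\token= }% grab one token, then branch
\def\babranch{%
  \ifx\token\relax
    % end marker reached: stop the loop
  \else
    \token\penalty0 % typeset the token, then allow a break with no glue involved
    \expandafter\bascan
  \fi}
\begin{document}
\raggedright % assumption: no justification, so lines may simply end early
\breakanywhere{(../../DocumentClass.tex (/usr/local/texlive/2016/texmf-dist/tex/latex/base/article.cls}
\end{document}
```

Because \penalty0 carries no stretch or shrink, the line breaker only has to consider it as a legal breakpoint. This sketch does not handle control sequences such as \char inside the argument.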
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\monofont{Inconsolatazi4-Regular.otf}% let's pretend this is not a mono font for practice
%\makeatletter % or \catcode"0040=11
\long\def\treateachletterasword#1{\xscan #1\relax}% calls initial \xscan on first char
\def\xscan{\afterassignment\xxscan\let\token= }% assign next single token to \token and send it to \xxscan
\def\xxscan{%
\monofont% apply mono font
\ifx\token\relax\normalfont\else%test for end-of-line or end of group and switch back to normal font
\ifcat\token\space%
\token% token is catcode 10
\spaceskip=.5em% remove glue from space for fixed-width space and precise control
\xspaceskip=.5em% remove glue from space for fixed-width space and precise control
\else%
\ifcat\token\active% deal with active character backslash
%\noexpand\token
\textbackslash% for lack of a solution, this macro should be expanded
\else
\token\hskip 0pt plus 1sp minus 1sp% add glue to any non-catcode 10 (space)
\fi
\fi
\expandafter\xscan% call \xscan again; the loop ends once the token held in \token equals \relax
\fi}
%\makeatother % or \catcode"0040=12
\parindent=0pt % remove firstpar autoindent
\obeylines% insert \par after each end-of-line (^^M)
\begin{document}
\section{SomeTeXFile.tex}\subsection{Typesetting Failure}
\treateachletterasword{
This is XeTeX, Version 3.14159265-2.6-0.99996 (TeX Live 2016) (preloaded format=xelatex 2016.6.27) 11 AUG 2017 13:15
entering extended mode
restricted \char"005C{}write18 enabled.
file:line:error style messages enabled.
\%\&-line parsing enabled.
**ThirdPartyLicenses.tex
(./ThirdPartyLicenses.tex
LaTeX2e <2016/03/31> patch level 2
Babel <3.9r> and hyphenation patterns for 83 language(s) loaded.
(../../DocumentClass.tex (/usr/local/texlive/2016/texmf-dist/tex/latex/base/article.cls
Document Class: article 2014/09/29 v1.4h Standard LaTeX document class
(/usr/local/texlive/2016/texmf-dist/tex/latex/base/size10.clo
File: size10.clo 2014/09/29 v1.4h Standard LaTeX file (size option)
)
\char"005C{}c@part=\char"005C{}count79
\char"005C{}c@section=\char"005C{}count80
\char"005C{}c@subsection=\char"005C{}count81
\char"005C{}c@subsubsection=\char"005C{}count82
\char"005C{}c@paragraph=\char"005C{}count83
\char"005C{}c@subparagraph=\char"005C{}count84
\char"005C{}c@figure=\char"005C{}count85
\char"005C{}c@table=\char"005C{}count86
\char"005C{}abovecaptionskip=\char"005C{}skip41
\char"005C{}belowcaptionskip=\char"005C{}skip42
\char"005C{}bibindent=\char"005C{}dimen102
}
\end{document}
Output
Note that
- By adding glue to every non-catcode 10 (except active char) and removing glue from every catcode 10 (space), TeX fills lines with as many boxes as it takes to fill the line without stretching spaces. This effectively disables the TeX beautifying mechanism.
- Macros should be expanded, but they are not. See \"005C, for example.
- Here is another point I forgot to mention: The new lines are ignored despite \obeylines. This is because both } and \par (from ^^M) would yield \relax. This is undesired.
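One possible way to keep the line ends is sketched below. It assumes that \obeylines is in effect when the argument is read, so that each end-of-line arrives as an active ^^M that produces \par. The names \keeplines, \klscan and \klbranch are hypothetical. The idea is to compare \token with \par using \ifx before any catcode test is made.

```latex
\documentclass{article}
% Sketch only: \keeplines, \klscan and \klbranch are hypothetical names.
\long\def\keeplines#1{\klscan #1\relax}% \relax marks the end of the input
\def\klscan{\afterassignment\klbranch\let\token= }% grab one token, then branch
\def\klbranch{%
  \ifx\token\relax
    % end marker reached: stop the loop
  \else
    \ifx\token\par
      \par % a line end from \obeylines: finish the current line
    \else
      \token\penalty0 % any other token: typeset it and allow a break
    \fi
    \expandafter\klscan
  \fi}
\begin{document}
\parindent=0pt
\raggedright
\obeylines
\keeplines{entering extended mode
file:line:error style messages enabled.
}
\end{document}
```

If the active ^^M is defined as a macro that expands to \par rather than being \let to it, the \ifx test fails, but the fallback branch still executes \token and therefore still ends the line.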


... \obeylines\obeyspaces\ttfamily instead of \treateachletterasword{} suffice? – Steven B. Segletes Aug 11 '17 at 12:19
... \linebreak[0] (which is \penalty0) – David Carlisle Aug 11 '17 at 12:37
... \relax. This means that new lines are ignored despite \obeylines, since } and \par (from ^^M) would yield a \relax. – Jonathan Komar Aug 11 '17 at 12:41
... \ifcat\token\active does not test if the token is active: if the token is expandable, it tests whether the first two tokens in its expansion have the same catcode; otherwise it tests if the token is not a character (as all other non-expandable tokens, including \active, test as equal to \ifcat) – David Carlisle Aug 11 '17 at 12:52
... \penalty0. Is there some performance benefit to using one or the other? – Jonathan Komar Aug 11 '17 at 12:53
... \begin{document} \def\token{abc}
\ifcat\token\space \token (token is catcode 10) \fi
\ifcat\token\active \token (token is active) \fi \end{document}
– David Carlisle Aug 11 '17 at 13:15
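The point about \ifcat made in the comments can be checked with a small test file. This is a sketch for illustration, not part of the original thread. It relies on the fact that \ifcat treats every unexpandable control sequence that is not an implicit character as having the same category, and \active (a \chardef token) is one of them.

```latex
\documentclass{article}
\begin{document}
% \token is \let to \relax: an unexpandable non-character, so the test is true
\let\token=\relax
\ifcat\token\active \texttt{\string\relax} matches\else \texttt{\string\relax}: no match\fi

% \token is \let to \par: again an unexpandable non-character, so the test is true
\let\token=\par
\ifcat\token\active \texttt{\string\par} matches\else \texttt{\string\par}: no match\fi

% \token is \let to the letter a: an implicit character of catcode 11, so the test is false
\let\token= a%
\ifcat\token\active letter matches\else letter: no match\fi
\end{document}
```

The first two tests take the true branch even though neither token is an active character, while the third takes the false branch. So \ifcat\token\active cannot be used to detect an active character once it has been stored with \let.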