
I'd like to analyse the contents of an environment line by line, but the following minimal test code fails: it prints false.

\documentclass{article}

\ExplSyntaxOn

\NewDocumentEnvironment{linebyline}{b}
  {
    \seq_new:N \l_temp_seq
    \regex_split:nnNTF { \n } { #1 } \l_tmpa_seq { true } { false }
  }
  {}

\ExplSyntaxOff

\begin{document}

\begin{linebyline}
% Comment
Line 1
Line 2
% Comment
Line 3
\end{linebyline}

\end{document}

projetmbc
  • end of line is tokenized as a space by default – David Carlisle Dec 17 '23 at 22:43
  • Gosh... I wanted to avoid using \\ for line breaks. – projetmbc Dec 17 '23 at 22:47
  • tex is tex..., you need \obeylines or set \endlinechar or ... – David Carlisle Dec 17 '23 at 22:56
  • I will investigate this. Thanks for showing me the light... :-) – projetmbc Dec 17 '23 at 22:57
  • And don't use regex for this. Splitting at a single character can be done with \seq_set_split:Nnn or \seq_set_split_keep_spaces:Nnn. – Skillmon Dec 18 '23 at 08:45
  • @Skillmon You are right, but this is just a toy example... – projetmbc Dec 18 '23 at 13:08
  • @projetmbc still shows a tool too complicated and wasteful for the job... Your request never needed regex if it's just about splitting lines, and I wanted to make sure you realise this. l3regex is brilliant code, don't get me wrong, but it is used "in the wild" way too often for things simpler tools can do just as well yet hundreds of times faster. – Skillmon Dec 18 '23 at 13:23
  • Due to a precise design choice of TeX, how the input is split across lines is completely irrelevant, so long as line breaks correspond to spaces in output. So it's essentially meaningless to “analyze the input line-by-line”. – egreg Dec 19 '23 at 15:19
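
To see the effect described in the comments above, here is a minimal sketch (the environment name showbody is hypothetical and not part of the original post). It prints the body of a b-type environment as a string. By the time the body is grabbed, each line end has already been tokenized as a space, so no newline character is left for a regex such as \n to match.

```latex
\documentclass{article}

\ExplSyntaxOn

% Hypothetical environment, for illustration only:
% typeset the grabbed body as a plain string.
\NewDocumentEnvironment{showbody}{b}
  { \texttt{ \tl_to_str:n {#1} } }
  {}

\ExplSyntaxOff

\begin{document}

\begin{showbody}
Line 1
Line 2
\end{showbody}

\end{document}
```

The two input lines should come out joined by a single space, which is why splitting at a newline leaves the sequence with one item only.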

2 Answers


Here's a solution with a command rather than an environment, which is not what I need, and using regular expressions instead of \seq_set_split:Nnn or \seq_set_split_keep_spaces:Nnn.

Any advice on a \seq_set_split:Nnn-based solution is welcome.


\documentclass{article}

\ExplSyntaxOn

\NewDocumentCommand{\linebyline}{+v}
  {
    % A +v argument keeps each line end as character 13, which \r matches.
    \regex_split:nnN { \r } { #1 } \l_tmpa_seq
    \seq_use:Nn \l_tmpa_seq { :: }
  }

\ExplSyntaxOff

\begin{document}

\linebyline{
% Comment
Line 1
Line 2
% Comment
Line 3
}

\end{document}

This code outputs:

::% Comment::Line 1::Line 2::% Comment::Line 3::
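
A split-based variant, as suggested in the comments on the question, might look like the sketch below. The names \linebylinesplit, \l_lbl_lines_seq and \c_lbl_eol_tl are hypothetical, chosen only for this illustration. The sketch assumes that a +v argument stores each line end as character 13 with category code "other", so the delimiter is built with \iow_char:N to get a matching token.

```latex
\documentclass{article}

\ExplSyntaxOn

% Hypothetical names, chosen for this sketch only.
\seq_new:N \l_lbl_lines_seq
% Character 13 with category code "other" (assumed +v line-end token).
\tl_const:Nx \c_lbl_eol_tl { \iow_char:N \^^M }
% Make sure the NVn variant exists (a no-op if it is already defined).
\cs_generate_variant:Nn \seq_set_split:Nnn { NVn }

\NewDocumentCommand{\linebylinesplit}{+v}
  {
    % Split the verbatim argument at each line end, then join with "::".
    \seq_set_split:NVn \l_lbl_lines_seq \c_lbl_eol_tl {#1}
    \seq_use:Nn \l_lbl_lines_seq { :: }
  }

\ExplSyntaxOff

\begin{document}

\linebylinesplit{
% Comment
Line 1
Line 2
% Comment
Line 3
}

\end{document}
```

\seq_set_split:Nnn trims spaces around each item and keeps empty items. If the empty first and last items are unwanted, \seq_remove_all:Nn with an empty item can drop them afterwards, and \seq_set_split_keep_spaces:Nnn can be used instead when leading indentation matters.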
projetmbc

I'm not really sure what you are looking for, but with \obeylines in place within the environment, a token cycle can be used to search for the line ends and emplace (as in your example) a :: between lines.

\documentclass{article}
\usepackage{tokcycle}
{\obeylines
\gdef\mycr{
}}
\def\myenvname{linebyline}
\newenvironment{\myenvname}{\obeylines\catcode`\%=12 \tokencycle
  {\addcytoks{##1}}
  {\processtoks{##1}}
  {%
  \expandafter\ifx\mycr##1\addcytoks{::}\else
    \ifx\end##1
      \tcpop\z\tcpushgroup\z%
      \ifx\z\myenvname
        \tcpush{\noexpand\endtokcycraw##1}%
      \else\addcytoks{##1}\fi
    \else\addcytoks{##1}\fi
  \fi}
 {\addcytoks{##1}}}{}
\begin{document}
\begin{linebyline}
% Comment
    Line 1
    Line 2
% Comment
    Line 3
\end{linebyline}

Back to normal text.

\begin{linebyline}
% Comment
    Line 1 \today
    Line 2\begin{itemize}
    \item xxx\end{itemize}
% Comment
    Line 3
\end{linebyline}

Back to % absolutely
normal text.
\end{document}


If one desires the parsed content to not be executed, but instead detokenized, and the line content of each line collected, one can do this:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle}
{\obeylines
\gdef\mycr{
}}
\def\myenvname{linebyline}
\newenvironment{\myenvname}{\obeylines\catcode`\%=12 \tokencycle
  {\addcytoks{\string##1}}
  {\addcytoks{\{}\processtoks{##1}\addcytoks{\}}}
  {%
  \expandafter\ifx\mycr##1
    \mbox{}\\Input line: ``\the\cytoks''% <-CURRENT INPUT LINE
    \cytoks{}%
  \else
    \ifx\end##1
      \tcpop\z\tcpushgroup\z%
      \ifx\z\myenvname
        \tcpush{\noexpand\endtokcycraw##1}%
      \else\addcytoks{\detokenize{##1}}\fi
    \else\addcytoks{\detokenize{##1}}\fi
  \fi}
  {\addcytoks{##1}}}{}
\begin{document}
\begin{linebyline}
% Comment
    Line 1
    Line 2
% Comment
    Line 3
\end{linebyline}

Back to normal text.

\begin{linebyline}
% Comment
    Line 1 \today
    Line 2\begin{itemize}
    \item xxx\end{itemize}
% Comment
    Line 3
\end{linebyline}

Back to % absolutely
normal text.
\end{document}


  • Thanks for this proposition. Concretely, I want to use a DSL (Domain Specific Language) to type easily tables of variation and/or signs of real functions. In my partial answer, the l3 sequence will serve to analyse each line of the DSL. – projetmbc Dec 19 '23 at 07:24
  • @projetmbc In my implementation, the code of each line is actually executed (as in the itemize). However, I could just as easily have \stringed the output, making it more verbatim-like, if that were preferable. – Steven B. Segletes Dec 19 '23 at 19:24
  • @projetmbc See my edit. – Steven B. Segletes Dec 19 '23 at 19:31