I'm trying to learn LaTeX3 (expl3) by working through a concrete problem: parsing LaTeX code and then writing the results of functions to a file.
Problem. From pdfLaTeX, generate a fragment of an XML file containing normalized LaTeX data from the current document (article title, list of authors, abstract, keywords, list of references, etc.), where "normalized" means allowed and meaningful in XML. For example, in normalized text \, disappears, \& is replaced by &amp;, ~ by &nbsp;, -- by &ndash;, the string $x_2$ becomes x<sub>2</sub>, \emph{ ... } becomes <em> ... </em>, and so on.
Since I do not need to convert the entire article to XML, it seems simplest to write lines with the necessary XML tags to the file using standard TeX or LaTeX3 functions. I have more or less figured out regular expressions in expl3, but I could not write the value of the \normalize function to the file. The minimal working example below focuses on some of the non-XML functionality I need. What I want to save to the file is not the definitions of the functions but the results they produce.
That is, in this case the file should contain exactly what I see in the first line of the typeset output. Also, why does passing a macro to the \normalize function (as in \normalize\teststring) not work?
I would appreciate a solution and an explanation. "The LaTeX3 Interfaces" is still a bit complicated for me, but I'm slowly getting into it.
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\tl_new:N \l_normalize_tl
\cs_new:Npn \normalize #1 {
\tl_set:Nn \l_normalize_tl {#1}
\regex_replace_all:nnN { \c{,} } { } \l_normalize_tl
\regex_replace_all:nnN { \c{&} } { \c{&}amp; } \l_normalize_tl
\regex_replace_all:nnN { \~ } { \c{&}nbsp; } \l_normalize_tl
\regex_replace_all:nnN { -- } { \c{&}ndash; } \l_normalize_tl
\regex_replace_all:nnN { \c{emph}\{(.*?)\} } { \c{textless} em\c{textgreater}\1\c{textless}/em\c{textgreater} } \l_normalize_tl
\tl_use:N \l_normalize_tl
}
\DeclareDocumentCommand\wout { m }
{ \iow_now:Nx \g_xml_out_iow { #1 } }
\DeclareDocumentCommand\writexml{ }
{
\iow_new:N \g_xml_out_iow
\iow_open:Nn \g_xml_out_iow { \c_sys_jobname_str.xml }
\wout{ \exp_not:V\normalize\teststring }
\iow_close:N \g_xml_out_iow
}
\ExplSyntaxOff
\begin{document}
\def\teststring{This is~-- a test document by \emph{A.\,A.~Smith}.}
\normalize{This is~-- a test document by \emph{A.\,A.~Smith}.}
\normalize\teststring
\writexml
\end{document}
This is what I get in the *.xml file:
\normalize This is\protect \unhbox \voidb@x \protect \penalty \@M \ {}-- a test document by \protect \unhbox \voidb@x \bgroup \edef .{A.\,A.~Smith}\let \futurelet \@let@token \let \protect \relax \itshape A.\protect \protect \leavevmode@ifvmode \kern +.16667em\relax A.\protect \unhbox \voidb@x \protect \penalty \@M \ {}Smith\egroup .

The \normalize function is not expandable → see my answer https://tex.stackexchange.com/questions/645995/why-cant-i-use-some-macro-inside-the-argument-of-some-other-macro , section 2. – user202729 Jul 05 '22 at 00:56
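To illustrate the point made in the comment, here is a minimal sketch of one possible arrangement. It is not the only way to do this, and it assumes a LaTeX kernel from 2020-10 or later, so that expl3 and \NewDocumentCommand are available without loading packages. Because \tl_set:Nn and \regex_replace_all:nnN perform assignments, they cannot run inside an x-type expansion. The sketch therefore normalizes first, in a protected function, and only afterwards writes the stored result with a V-type argument. The same V-type mechanism is used to pass the contents of \teststring rather than the single token \teststring. All names containing xmlsketch, and the commands \xmlopen, \xmlwrite, \xmlwritemacro and \xmlclose, are hypothetical and chosen only for this example. The replacements insert plain characters (\cO\& is an "other" ampersand) rather than \& or \textless, so the result suits the file rather than typesetting.

```latex
\documentclass{article}
\ExplSyntaxOn
\tl_new:N  \l_xmlsketch_text_tl
\iow_new:N \g_xmlsketch_out_iow
% Make sure the V variant exists (this is a no-op if the kernel provides it).
\cs_generate_variant:Nn \iow_now:Nn { NV }

% Protected, because it performs assignments and is therefore not expandable.
% The normalized text is left in \l_xmlsketch_text_tl.
\cs_new_protected:Npn \xmlsketch_normalize:n #1
  {
    \tl_set:Nn \l_xmlsketch_text_tl {#1}
    % Drop the thin-space command \, entirely.
    \regex_replace_all:nnN { \c{\,} } { } \l_xmlsketch_text_tl
    % \& becomes the entity &amp; (with a catcode-12 ampersand).
    \regex_replace_all:nnN { \c{\&} } { \cO\&amp; } \l_xmlsketch_text_tl
    % The tie character must be escaped in the regex; unescaped spaces are ignored.
    \regex_replace_all:nnN { \~ } { \cO\&nbsp; } \l_xmlsketch_text_tl
    \regex_replace_all:nnN { -- } { \cO\&ndash; } \l_xmlsketch_text_tl
    % \cB. and \cE. match the begin-group and end-group tokens around the argument.
    % The lazy (.*?) stops at the first closing brace, so nested braces are not handled.
    \regex_replace_all:nnN { \c{emph} \cB. (.*?) \cE. } { <em>\1</em> }
      \l_xmlsketch_text_tl
  }

% Step 1: normalize (not expandable). Step 2: write the stored value.
\cs_new_protected:Npn \xmlsketch_write:n #1
  {
    \xmlsketch_normalize:n {#1}
    \iow_now:NV \g_xmlsketch_out_iow \l_xmlsketch_text_tl
  }
% The V variant passes the contents of a macro instead of the macro token itself.
\cs_generate_variant:Nn \xmlsketch_write:n { V }

\NewDocumentCommand \xmlopen { }
  { \iow_open:Nn \g_xmlsketch_out_iow { \c_sys_jobname_str .xml } }
\NewDocumentCommand \xmlclose { }
  { \iow_close:N \g_xmlsketch_out_iow }
\NewDocumentCommand \xmlwrite { m }       % argument: literal text
  { \xmlsketch_write:n {#1} }
\NewDocumentCommand \xmlwritemacro { m }  % argument: a single macro holding the text
  { \xmlsketch_write:V #1 }
\ExplSyntaxOff

\begin{document}
\def\teststring{This is~-- a test document by \emph{A.\,A.~Smith}.}
\xmlopen
\xmlwrite{This is~-- a test document by \emph{A.\,A.~Smith}.}
\xmlwritemacro\teststring
\xmlclose
Done.
\end{document}
```

Separating "compute into a token list" from "write the token list" means nothing unexpandable has to survive an expansion step. A typesetting version of \normalize could reuse the same stored token list with different replacement texts.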