
If I get hard-to-interpret error messages in a LaTeX file, I comment out half of the file, compile, see whether it works, and so on, and find the problematic part by successive bisection.

Is there an automatic way to do this? I know it may be tricky to implement, since you have to find safe places for the bisection. However, I guess it is solvable in principle, so I want to know if there is a solution out there already.

I am working with Linux and Emacs, so I would prefer a solution that fits this environment.

The question was partially inspired by the emacs package bug-hunter, which "Automatically debug(s) and bisect(s) your init (.emacs) file!" (see also here).

Edit

Here is a simple example where it seems difficult to me to find the error. In this case I would try bisecting:

\documentclass{article}
\errorcontextlines=1000
\usepackage{exsheets}
\usepackage{tikz}

\begin{document}

\begin{question} Test \end{question}

\begin{solution}
\begin{tikzpicture}
\draw (0,0) -- (1,1)
\end{tikzpicture}
\end{solution}

\begin{question} Test \end{question}

\begin{solution} Test \end{solution}

\printsolutions
\end{document}

Output of the log file: https://pastebin.com/kFfWu2zL

student
  • Bisecting the first half is easy, as you can just put \end{document} in the middle and process only the first half. If you need the second half, that's harder and not possible in general, as it may require definitions from earlier, although if you assume all definitions are in the preamble it is not so bad. Emacs AUCTeX lets you preview a marked region (by copying the preamble and the region to a temporary document behind the scenes) – David Carlisle Dec 02 '21 at 17:06
  • To remove the first half, you can use \iffalse first half \fi, and then successively move the \fi down. – gernot Dec 02 '21 at 17:43
  • @DavidCarlisle and gernot: Yes, that's clear. What I am looking for is an automatic solution. – student Dec 02 '21 at 21:04
  • you can't literally bisect anywhere: you have to choose safe places so automatic bisection might be tricky, it's not normally needed as the log should normally give fairly accurate indication of the location of the error: you know how many pages are output before the error, and you have the line number of the source error – David Carlisle Dec 02 '21 at 21:09
  • I don't know the package, you could give an example, you can always set \errorcontextlines=1000 and get a full error context. – David Carlisle Dec 02 '21 at 21:18
  • @DavidCarlisle: See my edit above. – student Dec 02 '21 at 21:39
  • This is definitely not possible, because of how complex TeX is. There's still "compiling - Reducing the console output of LaTeX" (TeX - LaTeX Stack Exchange). – user202729 Dec 03 '21 at 02:36
  • For me it gives Undefined control sequence. // <argument> Test \myerror so that makes quite clear where the error is (search for \myerror in the source code) – user202729 Dec 03 '21 at 02:36
  • @user202729: Yes, I just wanted to show an example where you see that it doesn't report a correct line number. – student Dec 03 '21 at 05:37
  • @user202729: I tried to slightly modify the example. See my edit above. – student Dec 03 '21 at 06:59
  • Your example is actually one where bisecting is of little use. If you drop the first half the error will be gone, and if you drop the second half, the error will be gone too. It is just "delayed content". And the log shows you clearly that the error occurs when \printsolutions is called. Line number and all. – gusbrs Dec 03 '21 at 09:39
  • There's just no way the (what you think is) correct answer could be printed in the very-general case. TeX is just too powerful. Imagine this... – user202729 Dec 03 '21 at 10:56
  • In Python (you know some conventional programming language right?) you set x="myerror()" in line 10 and exec(x) in line 20. How can Python possibly know which line the literal myerror appeared in the source code? It will report the error on line 20. – user202729 Dec 03 '21 at 10:57
  • Maybe it's possible to attach file/line number information to every token and report that, but it would require a significant reimplementation of the engine and (as far as I know) nobody has done that. (And it may not even be that useful.) – user202729 Dec 03 '21 at 10:57
  • @gusbrs: In this case I would consider only the part before \printsolutions for bisecting. However, it is clear that it is nontrivial to come up with an algorithm that makes such decisions automatically. – student Dec 03 '21 at 12:51
  • @user202729: Why does it work in Lisp with bug-hunter? Lisp is Turing complete too. – student Dec 03 '21 at 12:52
  • I doubt it works in Lisp either. Here you are saving TeX code unexecuted (with a syntax error) and getting an error at the point you execute it rather than at the point you save it. If you save an s-expression in Lisp and then evaluate it later, you would similarly get an error at the eval, not at the point where you constructed the erroneous s-expression. – David Carlisle Dec 03 '21 at 15:12
  • I'm not very familiar with that particular thing, so it would be hard to discuss... (although my guess is the comment above) – user202729 Dec 03 '21 at 16:32
  • Okay I took a look at some examples, and there's nothing as complex as the store-verbatim → eval process in TeX in this case. Perhaps you can give some more complex example? – user202729 Dec 03 '21 at 16:35
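
The manual procedure from the comments (cut the body short with an early \end{document} and check whether the error persists) can be automated once the document has been split at safe places. The following is only a minimal sketch in Python, under several assumptions: the body has already been split into self-contained chunks (for example at blank lines between top-level environments), the empty body compiles, and every prefix that contains the faulty chunk fails. The names `compiles` and `first_failing_chunk` are hypothetical helpers, not part of any existing tool.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Optional, Sequence


def compiles(source: str) -> bool:
    """Return True if pdflatex exits without error on `source`."""
    with tempfile.TemporaryDirectory() as tmp:
        tex = Path(tmp) / "bisect.tex"
        tex.write_text(source, encoding="utf-8")
        result = subprocess.run(
            ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", tex.name],
            cwd=tmp,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0


def first_failing_chunk(
    preamble: str,
    chunks: Sequence[str],
    ok: Callable[[str], bool] = compiles,
) -> Optional[int]:
    """Index of the first chunk whose inclusion makes the document fail, or None."""

    def prefix_ok(n: int) -> bool:
        # Build a document containing only the first n chunks.
        body = "\n\n".join(chunks[:n])
        return ok(f"{preamble}\n\\begin{{document}}\n{body}\n\\end{{document}}\n")

    if prefix_ok(len(chunks)):
        return None  # the whole document compiles
    # Invariant: the prefix of length lo compiles (assumed for 0),
    # the prefix of length hi fails.
    lo, hi = 0, len(chunks)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prefix_ok(mid):
            lo = mid
        else:
            hi = mid
    return hi - 1
```

For delayed content such as \printsolutions, as pointed out in the comments, this search reports the chunk where the error surfaces, which need not be the chunk where the faulty code was typed.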

1 Answer


This is an explanation of the particular example, and of why it's hard to debug.


What happened to the stack trace?

Because of tail-call optimization, the important error context line is missing.

Compare: if you add \relax (or anything nonempty) after the tikzpicture environment in the erroneous solution, a relevant context line is printed.

! Package tikz Error: Giving up on this path. Did you forget a semicolon?.

See the tikz package documentation for explanation.
Type H <return> for immediate help.
 ...

...ackage tikz Error: Giving up on this path. Did you forget a semicolon?.

See the tikz package documentation for explanation.
Type H <return> for immediate help\@err@

\GenericError ... \@empty \def \MessageBreak #1 \def \errmessage #2.

#3
Type H <return> for immediate help\@err@ \endgroup
\pgfutil@next ->\advance \tikz@expandcount by -1 \ifnum \tikz@expandcount <0\relax \tikzerror {Giving up on this path. Did you forget a semicolon?} \let \pgfutil@next =\tikz@finish \else \let \pgfutil@next =\tikz@@expand \fi \pgfutil@next
\pgf@let@token ->\tikz@atend@picture \global \let \pgf@shift@baseline@smuggle =\pgf@baseline \global \let \pgf@trimleft@final@smuggle =\pgf@trimleft \global \let \pgf@trimright@final@smuggle =\pgf@trimright \global \let \pgf@remember@smuggle =\ifpgfre...
\end #1->\romannumeral \IfHookEmptyTF {env/#1/end}{\expandafter \z@ }{\z@ \UseHook {env/#1/end}}\csname end#1\endcsname \@checkend {#1}\expandafter \endgroup \if@endpe \@doendpe \fi \UseHook {env/#1/after}\if@ignore \@ignorefalse \ignorespaces \fi
<argument> \begin {tikzpicture} \draw (0,0) -- (1,1) \end {tikzpicture} \relax
\exsheetsprintsolution #1#2->#1#2
\__exsheets_surround_with:nnn #1#2#3->#2#1 #3
\__exsheets_print_solution:nnn ...tl {#2}{#1}}{\exp_not:V \l__exsheets_solutions_pre_body_hook_tl \exp_not:n {#3}\exp_not:V \l__exsheets_solutions_post_body_hook_tl }}\l__exsheets_solutions_pre_hook_tl \l__exsheets_solutions_post_hook_tl \exsheets_add...
\__prop_map_function:Nwwn #1#2\__prop_pair:wn #3\s__prop #4->#2#1{#3}{#4} \__prop_map_function:Nwwn #1
\g_exsheets_question_identification_prop ->\s__prop \__prop_pair:wn 1\s__prop {-0-1} \__prop_pair:wn 2\s__prop {-0-2}
\prop_map_function:NN #1#2->\exp_after:wN \use_i_ii:nnn \exp_after:wN \__prop_map_function:Nwwn \exp_after:wN #2#1 \prg_break: \__prop_pair:wn \s__prop {}\prg_break_point: \prg_break_point:Nn \prop_map_break: {}
\__keys_set_keyval:nnn ...path_str \s__keys_stop \l__keys_module_str \l_keys_key_str \tl_set_eq:NN \l_keys_key_tl \l_keys_key_str \__keys_value_or_default:n {#3}\bool_if:NTF \l__keys_selective_bool \__keys_set_selective: \__keys_execute: \str_set:Nn ...
\__keyval_key:nn #1#2->\__keyval_if_blank:w \s__keyval_mark #1\s__keyval_nil \s__keyval_stop \__keyval_blank_key_error:w \s__keyval_mark \s__keyval_stop \exp_not:n {#2{#1}} \__keyval_loop_other:nnw {#2}
\__keyval_loop_other:nnw ..._keyval_if_recursion_tail:w #3\__keyval_end_loop_other:w \s__keyval_tail \__keyval_split_active:w #3\s__keyval_nil \s__keyval_mark \__keyval_split_active_auxi:w =\s__keyval_mark \__keyval_clean_up_active:w {#1} {#2}\s__keyva...
\__keyval_loop_active:nnw #1#2#3,->\__keyval_if_recursion_tail:w #3\__keyval_end_loop_active:w \s__keyval_tail \__keyval_loop_other:nnw {#1}{#2}#3, \s__keyval_tail ,
\keyval_parse:NNn #1#2#3->\__keyval_loop_active:nnw {#1}{#2}\s__keyval_mark #3, \s__keyval_tail ,
\__keys_set:nnn #1#2#3->\str_set:Nx \l__keys_module_str {\__keys_trim_spaces:n {#2}}\keyval_parse:NNn \__keys_set_keyval:n \__keys_set_keyval:nn {#3} \str_set:Nn \l__keys_module_str {#1}
\l__exp_internal_tl ...\l__keys_only_known_bool \bool_set_false:N \l__keys_filtered_bool \bool_set_false:N \l__keys_selective_bool \tl_set:Nn \l__keys_relative_tl {\q__keys_no_value }\__keys_set:nn {exsheets/exsheets_print_solutions}{all} \tl_set:Nn \l...
\exsheets_print_solutions:n ...exsheets_solutions_print_bool \bool_set_true:N \l__exsheets_inside_solution_bool \cs_set:Npn \S ##1{\exref {exse:##1}}\cs_set:Npn \C ##1{\exref {exch:##1}}\keys_set:nn {exsheets/exsheets_print_solutions}{#1} \group_end:
<to be read again> \end
l.32 \end {document}

Note the line

<argument> \begin {tikzpicture} \draw (0,0) -- (1,1) \end {tikzpicture}
                                                             \relax 

Which is the content you typed in.

What happened to the line number?

The TeX code you showed is roughly equivalent to the following (pseudo)code in Python:

# ======== tikz library


def tikzPicture(s: str):
    if ";" not in s: raise RuntimeError("No semicolon")
    return "figure"

# ======== exsheets library

solutions=[]
def addSolution(s: str):
    solutions.append(s)

def printSolutions():
    for solution in solutions: print(eval(solution))

# ======== your code

addSolution("'Test'")
addSolution("tikzPicture('draw (0, 0) -- (1, 1);')")
addSolution("tikzPicture('draw (0, 0) -- (1, 1)')")  # line 21 here

printSolutions()

The traceback in Python is

Traceback (most recent call last):
  File "FILE", line 23, in <module>
    printSolutions()
  File "FILE", line 15, in printSolutions
    for solution in solutions: print(eval(solution))
  File "<string>", line 1, in <module>
  File "FILE", line 5, in tikzPicture
    if ";" not in s: raise RuntimeError("No semicolon")
RuntimeError: No semicolon

As you can see, the traceback only points to line 23 (printSolutions), not to line 21, which is where you actually added the code.

(also note that in TeX the "most recent call" is the first, not the last)

Python: could it be better?

In fact Python is even worse in this particular example! (it's fixable, see stackoverflow question)

The only reason you don't encounter it in "normal" programming languages is that it's extremely rare for you to use eval/exec.
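
One way to make eval'd code show its own source in a Python traceback (a sketch of a common trick, not necessarily the fix from the linked question) is to compile the string under a synthetic file name and register its text in linecache. The helper name `eval_with_source` and the `<solution ...>` label format are assumptions made up for this illustration.

```python
import linecache
import traceback
from typing import Optional


def eval_with_source(source: str, label: str, namespace: Optional[dict] = None):
    """Evaluate `source` so that tracebacks can display its text."""
    filename = f"<solution {label}>"  # synthetic name, shown in the traceback
    # Register the text so the traceback machinery can look the line up.
    # mtime=None keeps linecache.checkcache() from discarding the entry.
    linecache.cache[filename] = (
        len(source),
        None,
        source.splitlines(keepends=True),
        filename,
    )
    code = compile(source, filename, "eval")
    return eval(code, namespace if namespace is not None else {})


if __name__ == "__main__":
    def tikzPicture(s: str) -> str:
        if ";" not in s:
            raise RuntimeError("No semicolon")
        return "figure"

    try:
        eval_with_source(
            "tikzPicture('draw (0, 0) -- (1, 1)')",
            "added on line 21",
            {"tikzPicture": tikzPicture},
        )
    except RuntimeError:
        traceback.print_exc()
```

With this, the frame for the evaluated string is listed under the synthetic file name together with the offending expression, instead of a bare "<string>" entry.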

What could the theoretical Python library improve?

  • On addSolution, store the stack content somewhere, then print it back. That way you can get the "line 21".
  • On printSolutions, when an error occurs, print out the user's source code.

The obvious disadvantage is the time/memory/code complexity/maintenance difficulty overhead.
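
Both improvements can be sketched in a few lines. This is only an illustration that reuses the function names from the pseudocode above; the stored tuple layout and the wording of the error message are assumptions, and `inspect.currentframe()` is used as a cheap way to find the caller (it may return None on implementations other than CPython).

```python
import inspect

# Each entry: (source, filename, lineno) of the addSolution call site.
solutions = []


def tikzPicture(s: str) -> str:
    if ";" not in s:
        raise RuntimeError("No semicolon")
    return "figure"


def addSolution(s: str) -> None:
    # Remember where the solution was added, next to its source text.
    caller = inspect.currentframe().f_back
    solutions.append((s, caller.f_code.co_filename, caller.f_lineno))


def printSolutions() -> None:
    for source, filename, lineno in solutions:
        try:
            print(eval(source))
        except Exception as exc:
            # Report the call site and the user's source, keeping the
            # original exception as the cause.
            raise RuntimeError(
                f"solution added at {filename}:{lineno} failed; source: {source}"
            ) from exc
```

The extra tuple fields are exactly the memory and maintenance overhead mentioned above: every stored solution now carries bookkeeping that is only needed when something goes wrong.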

TeX: could it be better?

Both solutions above apply, but they require modifications to the exsheets package.

  • It's possible to store the current line number when \begin{solution} is encountered, along with the verbatim solution text, and print it on execution (somewhere, for example in the log).

  • Without that, it remains possible for the package to print out the solution content before rendering each solution.

    (there's no try...catch or try...except in TeX)

    For the tail-call optimization case, as mentioned in the linked question above, it's possible to look at the output of \tracingmacros=1 to determine the context.

    For this particular package at this particular version, the omitted "user layer" is right above (more recent than) the \exsheetsprintsolution layer. So search for the last occurrence of \exsheetsprintsolution before the traceback in the log. In this particular case it is

    \exsheetsprintsolution #1#2->#1#2
    #1<-\exsheets_solutions_print_name:Vnn \l__exsheets_tmpa_tl {\GetTranslation {exsheets-solution-name}}{1}
    #2<-\begin {tikzpicture} \draw (0,0) -- (1,1) \end {tikzpicture}

    You can clearly see that #2 is your code.

    Alternatively (reminder: this is specific to this particular package implementation/version) it's possible to print out the content of your environment (already tokenized) by typing I\ExplSyntaxOn\show\l__exsheets_tmpb_tl at the TeX error prompt.
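
The log search described above can be scripted. The following Python sketch assumes a log produced with \tracingmacros=1, that the first line starting with "!" marks the error, and that each traced macro call is followed by a blank line; the function name `last_trace_before_error` is made up for this illustration.

```python
from typing import List


def last_trace_before_error(
    log_text: str, macro: str = r"\exsheetsprintsolution"
) -> List[str]:
    """Return the last traced call of `macro` that precedes the first error."""
    lines = log_text.splitlines()
    # TeX error messages start with "!" at the beginning of a line.
    error_at = next(
        (i for i, line in enumerate(lines) if line.startswith("!")),
        len(lines),
    )
    # Walk backwards from the error to the closest trace of the macro.
    for start in range(error_at - 1, -1, -1):
        if lines[start].startswith(macro + " "):
            entry = []
            for line in lines[start:error_at]:
                if not line.strip():
                    break  # a blank line ends the trace entry
                entry.append(line)
            return entry
    return []


# Possible use (the file name is an assumption):
#   from pathlib import Path
#   text = Path("main.log").read_text(errors="replace")
#   print("\n".join(last_trace_before_error(text)))
```

The returned lines include the `#2<-...` argument, which is the content of the failing solution environment. Long arguments may be wrapped over several log lines, which is why the sketch collects everything up to the next blank line.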

What happened to the newline?

It's another issue with the exsheets package: the content inside the environment is not stored verbatim. Instead it is parsed directly, and parsing removes the newlines.

You can easily see that exsheets does not store verbatim things correctly:

\begin{solution}
    \verb+%+
\end{solution}

(not related to this question, but you can use \cprotEnv in cprotect package to fix the issue.)

Why is \scantokens (exec-equivalent) so common in TeX?

Basically, in order to parse TeX correctly in the general case, you have to execute the code.

So when you want to store the content without executing the code, you must store it verbatim (roughly "as a string"), then execute it later as a string.

Although in this particular case, it's collected as an argument. That also destroys the line number.

user202729
  • A better solution would be redefining \exsheetsprintsolution which is part of the "public API" so is less likely to break. – user202729 Dec 22 '21 at 02:27