I'm writing a LaTeX-to-text converter, i.e. it takes LaTeX code as input and outputs readable text, which is then read aloud automatically by a computer. With both parts connected, it's a LaTeX-to-speech program. An example would look something like this:
"5m and m is a unit, $m\cdot f(x)\cdot sin(90^\circ)=f'(x)$" is converted into -> "5 meter and meter is a unit, m times f of x times sine of 90 degrees equals f dash of x"
I've been working on this project for some time, and right now it can convert nearly every LaTeX formula, but the code is very ad hoc: when I began, the project seemed so large that I thought the only way to start was not to overthink everything. Now that I have a better understanding of the problems the project involves, I want to rewrite some of the fundamental functions. Right now the program doesn't scan the input and build an abstract syntax tree like a typical compiler does, and I'm wondering whether copying classical compiler methods would be a good approach. I've read on Stack Exchange that the LaTeX compiler is based purely on macros, so maybe it would be smarter to take inspiration from a LaTeX compiler? What do you think the best approach is?
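To make the compiler-style option concrete, here is a rough sketch of what a tokenizer, a small recursive-descent parser and a tree walker could look like for the formula in the example above. It is not the existing code of the project. All names (`tokenize`, `Parser`, `speak`, `latex_math_to_speech`, `SPOKEN_FUNCTIONS`) are made up for illustration, and the grammar is an assumption that only covers numbers, names, primes, `\cdot`, `=`, `^\circ` and one-argument calls such as `f(x)`.

```python
import re

# One token is a LaTeX command, a number, a run of letters, or any other
# single non-space character. Treating a run of letters as one name is a
# simplification: real LaTeX math reads "mf" as two variables.
TOKEN = re.compile(r"\\[a-zA-Z]+|\d+(?:\.\d+)?|[a-zA-Z]+|\S")
SPOKEN_FUNCTIONS = {"sin": "sine", "cos": "cosine", "tan": "tangent"}


def tokenize(src):
    return TOKEN.findall(src)


class Parser:
    """Recursive-descent parser that builds the tree as nested tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def equation(self):  # equation := product ('=' product)*
        sides = [self.product()]
        while self.peek() == "=":
            self.take("=")
            sides.append(self.product())
        return sides[0] if len(sides) == 1 else ("eq", sides)

    def product(self):  # product := factor ('\cdot' factor)*
        factors = [self.factor()]
        while self.peek() == r"\cdot":
            self.take(r"\cdot")
            factors.append(self.factor())
        return factors[0] if len(factors) == 1 else ("mul", factors)

    def factor(self):  # factor := atom ['^' '\circ']
        node = self.atom()
        if self.peek() == "^":
            self.take("^")
            self.take(r"\circ")
            node = ("deg", node)
        return node

    def atom(self):  # atom := NUMBER | NAME "'"* ['(' equation ')']
        tok = self.take()
        if tok[0].isdigit():
            return ("num", tok)
        name = tok.lstrip("\\")
        if not name.isalpha():
            raise SyntaxError(f"unexpected token {tok!r}")
        primes = 0
        while self.peek() == "'":
            self.take("'")
            primes += 1
        node = ("var", name, primes)
        # Assumption: a name directly followed by '(' is a function call,
        # so m(x) is read as "m of x", never as a multiplication.
        if self.peek() == "(":
            self.take("(")
            arg = self.equation()
            self.take(")")
            node = ("call", node, arg)
        return node


def speak(node):
    """Walk the tree and produce the spoken form."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        word = SPOKEN_FUNCTIONS.get(node[1], node[1])
        return " ".join([word] + ["dash"] * node[2])
    if kind == "call":
        return f"{speak(node[1])} of {speak(node[2])}"
    if kind == "deg":
        return f"{speak(node[1])} degrees"
    if kind == "mul":
        return " times ".join(speak(f) for f in node[1])
    if kind == "eq":
        return " equals ".join(speak(s) for s in node[1])
    raise ValueError(f"unknown node kind {kind!r}")


def latex_math_to_speech(src):
    parser = Parser(tokenize(src))
    tree = parser.equation()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return speak(tree)


print(latex_math_to_speech(r"m\cdot f(x)\cdot sin(90^\circ)=f'(x)"))
# prints: m times f of x times sine of 90 degrees equals f dash of x
```

The point of the split is that wording decisions live only in `speak`, while ambiguities (is `m(x)` a call or a product?) are resolved once in the parser. A macro-expansion design, closer to how TeX itself works, would instead rewrite the token stream in place and never build a tree.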
See pylatexenc, https://pylatexenc.readthedocs.io/en/latest/latex2text/, and also LaTeXML, https://dlmf.nist.gov/LaTeXML/. – Marijn Apr 30 '21 at 19:59