
Is there a way to combine Python with LaTeX, or to translate Python into LaTeX?

I am creating an OCR tool in Python, and I want the output to reproduce the original file's formatting exactly: borders, spacing, columns and rows, margins, and so on.

I learned about LaTeX a few days ago at a workshop, where I saw that you can produce a precisely formatted document through code. I was wondering whether it could be integrated with my Python OCR code.

Len
  • A similar question was asked 10 years ago, and its 16th answer was posted 6 months ago. It was more about recognizing equations than page formats, though. But I bet things have progressed a bit since then! – Partha D. May 16 '21 at 04:29
  • Thanks for the insight! Could you provide a link to that question? – Len May 16 '21 at 07:56
  • The words "similar question" are a link. That said, it seems to me there are two somewhat separate issues. One is the OCR part, together with measuring and identifying the structure of the document in terms of margins, columns, fonts etc. The second is making LaTeX code that corresponds to that structure (the geometry and multicol packages, for example, can be relevant here). I'm not entirely sure exactly what you're asking about. – Torbjørn T. May 16 '21 at 10:24
  • Thanks! To elaborate: our study aims to create a system that takes a PDF file as input, transcribes it using OCR, and creates an interactive, answerable HTML-based test from it, with exactly the same formatting as the original. LaTeX is one of the ideas we are considering to bridge the gap, since it can create precisely formatted documents, test documents in particular. – Len May 16 '21 at 12:20
  • Ok, but it seems to me that if the end goal is HTML, then going via LaTeX doesn't really make much sense. The primary output of LaTeX is PDF, and while there are ways of making HTML from a .tex file, e.g. make4ht, you may well have to do a lot of work to make the HTML look the way you want anyway. make4ht, for one, does not make a "carbon copy" of the PDF. I'm no expert though, and I'm probably not seeing the whole picture, so these are just some thoughts of mine. – Torbjørn T. May 16 '21 at 17:13
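
To make the second issue from the comments concrete (turning a measured page structure into LaTeX code), here is a minimal Python sketch. It assumes the OCR stage has already produced margin and column measurements; the `PageLayout` class, its field names, and the `layout_to_latex` and `escape_latex` helpers are hypothetical names invented for this illustration, not part of any OCR library. It only covers margins and columns via the geometry and multicol packages, and makes no attempt at borders or tables.

```python
from dataclasses import dataclass

# Characters that have special meaning in LaTeX and must be escaped in OCR text.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


@dataclass
class PageLayout:
    """Hypothetical layout measurements, e.g. derived from OCR bounding boxes."""
    paper: str = "a4paper"
    left_mm: float = 25.0
    right_mm: float = 25.0
    top_mm: float = 20.0
    bottom_mm: float = 20.0
    columns: int = 1


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters one at a time (avoids double escaping)."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def layout_to_latex(layout: PageLayout, paragraphs: list[str]) -> str:
    """Build a LaTeX document whose margins and column count follow `layout`."""
    geometry_opts = (
        f"{layout.paper},left={layout.left_mm:g}mm,right={layout.right_mm:g}mm,"
        f"top={layout.top_mm:g}mm,bottom={layout.bottom_mm:g}mm"
    )
    lines = [r"\documentclass{article}",
             rf"\usepackage[{geometry_opts}]{{geometry}}"]
    multi = layout.columns > 1
    if multi:
        lines.append(r"\usepackage{multicol}")
    lines.append(r"\begin{document}")
    if multi:
        lines.append(rf"\begin{{multicols}}{{{layout.columns}}}")
    # Blank lines between paragraphs are LaTeX paragraph breaks.
    lines.append("\n\n".join(escape_latex(p) for p in paragraphs))
    if multi:
        lines.append(r"\end{multicols}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(layout_to_latex(PageLayout(columns=2), ["Question 1: 50% of A & B?"]))
```

The resulting string can be written to a .tex file and compiled as usual. As the last comment notes, this route produces a PDF, so it may not help much if the end goal is an HTML test.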

0 Answers