
There's no doubt that one should always post Mathematica code as text rather than as a screenshot when asking questions on this site, but every time I see someone post a screenshot, I can't help wondering whether it's possible to recognize the code in a picture with Mathematica. Just as an example:

[Image: screenshot of a line of Mathematica code]

Can I OCR the line of code from the picture?

A direct use of TextRecognize is ineffective.

If it's not possible, what's the threshold here?

xzczd
  • Mathematica uses Tesseract; its main advantage is the price, IMO. The line above suffers from not being in any language that Tesseract would recognize, because it applies a lot of heuristics (a.k.a. guesswork). Oblique fonts won't help, either. The killer is the presence of special characters; in your example, there is one for a delayed-evaluation replacement, and there might be two double square brackets for the Part[] extraction. – Felix Kasza Sep 22 '15 at 09:52
  • You might be able to solve your problem by extending the method that I gave in my answer to "Applying TextRecognize on alpha-numerical table". – Stephen Luttrell Sep 22 '15 at 10:21
  • @FelixKasza given that the font is fixed and line orientation is not a problem, it would be very easy to train Tesseract or Ocropy to recognise Mathematica's font. One could get fancy and, if the quality is not great, do some guesswork and take the maximum-likelihood version that the parser doesn't complain about. – Davidmh Sep 22 '15 at 14:20
  • True. But I see that Murta has already figured out the important part, how to set tesseract options when run through TextRecognize[]: http://mathematica.stackexchange.com/a/31851/12120 – Felix Kasza Sep 22 '15 at 18:53
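
The parser-based filtering suggested in the comments above might be sketched as follows. This is only an illustration under stated assumptions: the candidate strings are made up, and in practice they would have to come from an OCR engine that reports alternative readings, which TextRecognize is not assumed to provide here.

```mathematica
(* Hypothetical OCR candidates for one line of code; these strings are
   invented for illustration and do not come from a real OCR run *)
candidates = {"list[[1]", "list[[1]]", "list[[l]]"};

(* Keep only the candidates that parse as valid Wolfram Language input *)
Select[candidates, SyntaxQ]
```

A syntax check like this can discard readings with unbalanced brackets, but it cannot by itself decide between look-alike characters such as `1` and `l`, so it would have to be combined with the recognizer's own confidence scores.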

1 Answer


One can run TextRecognize on screenshots if one increases the dpi (screenshots are typically around 72 dpi, far too low for text recognition). I am posting this as an answer because I do not know how to put images in a comment (sorry). In the picture you can see the accuracy of the recognized text increasing with resolution. I did this in Photoshop, producing images at 300 and 600 dpi and leaving all the options on "automatic".

[Image: text recognition with increasing DPI]
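
For readers who would rather stay inside Mathematica, the upsampling step might be sketched as below. This is a minimal sketch, not the procedure used for the picture above: the file name, the scale factor of 300/72, and the choice of bicubic resampling are all assumptions, and the result may not match what Photoshop produces.

```mathematica
(* Hypothetical input file; replace with the path to your own screenshot *)
img = Import["screenshot.png"];

(* Assumed scale factor: roughly 72 dpi -> 300 dpi *)
scale = 300/72;

(* Enlarge the image with a smooth resampling method, then run OCR on it *)
big = ImageResize[img, Scaled[scale], Resampling -> "Bicubic"];
TextRecognize[big]
```

Trying a few scale factors (for example 300/72 and 600/72) and comparing the recognized strings is a cheap way to see whether resolution is the limiting factor for a given screenshot.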

mgamer
  • Your answer got my upvote because it is clearly correct, but I am personally too lazy to 'shop a screenshot just so I can OCR it (although I have done that and more to large pieces of badly scanned old magazines). – Felix Kasza Sep 22 '15 at 18:47
  • @Felix Kasza: ... regarding the "badly scanned old magazines", the only thing I can say on this subject is: me, too! – mgamer Sep 22 '15 at 19:36
  • Hmm… Is Photoshop necessary here? – xzczd Sep 24 '15 at 02:28
  • @xzczd: Hmmm. I'm not sure. My feeling is that it is not. I tried a lot of image adjustment and filtering/sharpening before TextRecognize but did not get results as good as with PS. O.K., I work a lot with PS; photography is my hobby ;-) – mgamer Sep 25 '15 at 16:32