Questions tagged [ocr]

Tag for questions on Optical Character Recognition (OCR), including the Mathematica function TextRecognize[].

61 questions
62
votes
3 answers

Applying TextRecognize on alpha-numerical table

From time to time I have to extract data from scanned papers or rasterized PDFs. Imagine a table of the following kind (but much longer): A naive application of TextRecognize on the image fails…
Markus Roellig
  • 7,703
  • 2
  • 29
  • 53
32
votes
1 answer

Can TextRecognize read digits?

While answering this question, I tried TextRecognize to read single digits. But it doesn't recognize a single digit, even though the digits are clearly readable. For example, this is not recognized. This is my code: digits =…
Niki Estner
  • 36,101
  • 3
  • 92
  • 152
7
votes
1 answer

Alphabet recognition

I have a picture with some writing in it. It is computer-typeset in an alphabet unknown to me. Is there a way to use Mathematica to tell me which alphabet or language was used? Here are two versions of the same text written in to me unknown…
azerbajdzan
  • 15,863
  • 1
  • 16
  • 48
4
votes
0 answers

Why does TextRecognize read data organized in a grid by column and not by row?

Why does TextRecognize read the text by column and not by line in the following scenarios? The lines are perfect, the font is perfect, and the image quality is perfect, so I do not understand why it does not read the text normally, line by line. Also notice that it reads 66…
azerbajdzan
  • 15,863
  • 1
  • 16
  • 48
4
votes
1 answer

Recognizing characters with accent marks

I have written code to recognize some words, but I get errors in the recognition of some characters with accent marks. Is there any solution? i = Import["https://i.stack.imgur.com/f3U9N.jpg"] TextRecognize[i, "SegmentationMode" ->…
LCarvalho
  • 9,233
  • 4
  • 40
  • 96
2
votes
0 answers

Cannot recognize accented characters using TextRecognize

I'm unable to get TextRecognize to understand Czech characters such as "ě, š, č, ...". Adding the option Language -> "Czech" gives me exactly the same bad result as with "English" or "German". My setup: Mathematica v. 11.2.0.0 plus TesseractTools…
CJoe
  • 71
  • 7
1
vote
0 answers

TextRecognize giving corrupted results on some images - Bug CASE 4075220

Perhaps the question should be: is this a Mathematica error, or what am I misunderstanding? I'm having trouble learning to use TextRecognize because I repeatedly get errors, and when I analyse where they arise I find that the problem is somewhere…
CJoe
  • 71
  • 7
0
votes
2 answers

Blank output from TextRecognize

Why am I getting blank output from TextRecognize for the following image? Is it because of the size of the image? I get output neither for this code: x = Import["https://i.stack.imgur.com/RbqrD.jpg"]; TextRecognize[x] nor for this one: x =…
0
votes
0 answers

Can I train TextRecognize (or the Tesseract engine behind it) for real typewriter letters?

Can I improve the results of TextRecognize by training the engine behind it? I'd like to work with a lot of scanned texts from several (real) typewriters. It seems to me that TextRecognize gives much better results for computer fonts than for…
CJoe
  • 71
  • 7