
I have a handwritten document that contains graphics, tables, and text paragraphs. I already know how to extract the graphics from this document. Given that I have already segmented the document into lines, how can I discriminate between text lines and table rows?

You can see an example of a document image below. The graphic part is marked in red, and the region framed in blue is a table row. Note that the table has no ruling lines. I want to discriminate between text lines and table rows using the line layout.

[Image: example document, with the graphic part marked in red and a table row framed in blue]

jonsca
user7546
  • Hi, I would like to make this clearer. Would you like to extract the text line by line, or just remove the table lines while keeping the text part as a whole? – lennon310 Jan 13 '14 at 18:30
  • Hi, I want to delimit the table and treat everything that remains as text. – user7546 Jan 15 '14 at 08:51

1 Answer


You can separate the table from ordinary text paragraphs using the Hough transform, the same technique that lets you differentiate text from graphics. Alternatively, you can use the bounding-box method: compute bounding boxes for the elements in each line and use their layout to detect the table rows and distinguish them from normal text lines.
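As a rough illustration of the bounding-box idea, here is a minimal sketch in plain Python. It assumes you already have, for each segmented line, a list of word bounding boxes as `(x, y, w, h)` tuples. The rule itself (a line is table-like when some horizontal gap between neighbouring boxes is much wider than the typical box height) and the names `horizontal_gaps`, `is_table_row`, `classify_lines`, and `gap_factor` are my own assumptions for illustration, not an established method, and the threshold would need tuning on real handwriting.

```python
from statistics import median


def horizontal_gaps(boxes):
    """Return the gaps between consecutive boxes of one line, left to right.

    Each box is a hypothetical (x, y, w, h) tuple in pixels.
    """
    ordered = sorted(boxes, key=lambda b: b[0])
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        # Distance from the right edge of one box to the left edge of the next.
        gaps.append(max(0, right[0] - (left[0] + left[2])))
    return gaps


def is_table_row(boxes, gap_factor=2.0):
    """Heuristic (assumption): a line is table-like if any gap between
    neighbouring boxes exceeds gap_factor times the median box height."""
    if len(boxes) < 2:
        return False
    typical_height = median(b[3] for b in boxes)
    return any(g > gap_factor * typical_height for g in horizontal_gaps(boxes))


def classify_lines(lines, gap_factor=2.0):
    """Label every line (a list of boxes) as 'table' or 'text'."""
    return ["table" if is_table_row(boxes, gap_factor) else "text"
            for boxes in lines]


# Made-up example data: one tightly spaced line and one with wide column gaps.
text_line = [(0, 0, 40, 20), (48, 0, 50, 20), (106, 0, 30, 20)]
table_line = [(0, 30, 40, 20), (150, 30, 30, 20), (300, 30, 35, 20)]
labels = classify_lines([text_line, table_line])
```

A single wide gap can also occur in ordinary text (for example, a short last line of a paragraph next to a margin note), so in practice you would probably also check that the gaps line up vertically across several consecutive lines before calling the region a table.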

jojeck
Bhavana