JUCS - Journal of Universal Computer Science 14(2): 252-265, doi: 10.3217/jucs-014-02-0252
Table-form Extraction with Artefact Removal
Luiz Antônio Pereira Neves, João Marques De Carvalho, Jacques Facon, Flávio Bortolozzi
‡ PUCPR, Brazil
Open Access
In this paper we present a novel methodology for recognizing the layout structure of handwritten filled table-forms. The methodology locates line intersections, corrects wrong intersections produced by what we call artefacts (overlapping data, broken segments and smudges), and extracts the correct table-form cells while using as little prior table-form knowledge as possible. To improve layout structure recognition, a novel artefact identification and deletion method is also proposed. To evaluate the effectiveness of the methodology, a database of 350 handwritten filled table-form images damaged by different types of artefacts was used. Experiments show that the artefact identification method improves the performance of the table-form structure extractor, which reached a success rate of 85%.
table-form recognition, table-form extraction, handwritten data, document segmentation
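As a rough illustration of the first step named in the abstract, locating line intersections, the following minimal sketch marks pixels that lie on both a long horizontal and a long vertical ink run in a binary image. It is not the authors' method: the run-length criterion, the `min_len` threshold, and the names `long_run_pixels`, `find_intersections`, and `demo_image` are assumptions made for illustration only, and the sketch makes no attempt at the artefact removal the paper proposes.

```python
from typing import List, Set, Tuple

Image = List[List[int]]  # 1 = ink pixel, 0 = background (assumed encoding)


def long_run_pixels(img: Image, min_len: int, horizontal: bool) -> Set[Tuple[int, int]]:
    """Return (row, col) pixels lying on ink runs of at least min_len pixels."""
    rows, cols = len(img), len(img[0])
    outer, inner = (rows, cols) if horizontal else (cols, rows)
    pixels: Set[Tuple[int, int]] = set()
    for i in range(outer):
        run: List[Tuple[int, int]] = []
        for j in range(inner + 1):  # one extra step flushes the last run
            r, c = (i, j) if horizontal else (j, i)
            if j < inner and img[r][c]:
                run.append((r, c))
            else:
                if len(run) >= min_len:
                    pixels.update(run)
                run = []
    return pixels


def find_intersections(img: Image, min_len: int) -> List[Tuple[int, int]]:
    """Pixels shared by a long horizontal run and a long vertical run."""
    horizontal = long_run_pixels(img, min_len, horizontal=True)
    vertical = long_run_pixels(img, min_len, horizontal=False)
    return sorted(horizontal & vertical)


def demo_image() -> Image:
    """Hypothetical 7x7 form: three horizontal rules, two vertical rules, one stroke."""
    size = 7
    img = [[0] * size for _ in range(size)]
    for r in (0, 3, 6):
        for c in range(size):
            img[r][c] = 1
    for c in (0, 6):
        for r in range(size):
            img[r][c] = 1
    img[2][2] = img[2][3] = 1  # short handwritten stroke touching the middle rule
    return img


if __name__ == "__main__":
    print(find_intersections(demo_image(), min_len=5))
```

In this toy image the short stroke is ignored because its runs are shorter than `min_len` in both directions. Real forms with broken segments or smudges would defeat such a simple threshold, which is the kind of failure the artefact handling described in the abstract is meant to address.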