3,109 research outputs found
Segmentation of touching characters in printed document recognition
A new discrimination function is presented for segmenting touching characters based on both pixel and profile projections. A dynamic recursive segmentation algorithm is developed for effectively segmenting touching characters. Contextual information and spell checking are used to correct errors caused by incorrect recognition and segmentation. Based on 12 real documents, a maximum 99.85% and a minimum 99.4% recognition accuracy is achieved.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/31557/1/0000484.pd
Text Line Segmentation of Historical Documents: a Survey
There is a huge amount of historical documents in libraries and in various
National Archives that have not been exploited electronically. Although
automatic reading of complete pages remains, in most cases, a long-term
objective, tasks such as word spotting, text/image alignment, authentication
and extraction of specific fields are in use today. For all these tasks, a
major step is document segmentation into text lines. Because of the low quality
and the complexity of these documents (background noise, artifacts due to
aging, interfering lines),automatic text line segmentation remains an open
research field. The objective of this paper is to present a survey of existing
methods, developed during the last decade, and dedicated to documents of
historical interest.Comment: 25 pages, submitted version, To appear in International Journal on
Document Analysis and Recognition, On line version available at
http://www.springerlink.com/content/k2813176280456k3
Automatic detection of change in address blocks for reply forms processing
In this paper, an automatic method to detect the presence of on-line erasures/scribbles/corrections/over-writing in the address block of various types of subscription and utility payment forms is presented. The proposed approach employs bottom-up segmentation of the address block. Heuristic rules based on structural features are used to automate the detection process. The algorithm is applied on a large dataset of 5,780 real world document forms of 200 dots per inch resolution. The proposed algorithm performs well with an average processing time of 108 milliseconds per document with a detection accuracy of 98.96%
Recommended from our members
Use of colour for hand-filled form analysis and recognition
Colour information in form analysis is currently under utilised. As technology has advanced and computing costs have reduced, the processing of forms in colour has now become practicable. This paper describes a novel colour-based approach to the extraction of filled data from colour form images. Images are first quantised to reduce the colour complexity and data is extracted by examining the colour characteristics of the images. The improved performance of the proposed method has been verified by comparing the processing time, recognition rate, extraction precision and recall rate to that of an equivalent black and white system
RAPID ANALYTICAL VERIFICATION OF HANDWRITTEN ALPHANUMERIC ADDRESS FIELDS
Microsoft, Motorola, Siemens, Hitachi, IAPR, NICI, IUF
This paper presents a combination of fuzzy system and dynamic analytical model to deal with imprecise data derived from feature extraction in handwritten address images which are compared against postulated addresses for address verification. A dynamic buildingÂnumber locator is able to locate and recognise the buildingÂnumber, without knowing exactly where the buildingÂnumber starts in the candidate address line. The overall system achieved a correct sorting rate of 72.9%, 27.1% rejection rate and 0.0% error rate on a blind test set of 450 cursive handwritten addresses.
- …